diff mbox series

[v2,4/4] drm/panfrost: Handle non-aligned lock addresses

Message ID 20210824173028.7528-5-alyssa.rosenzweig@collabora.com (mailing list archive)
State New, archived
Headers show
Series drm/panfrost: Bug fixes for lock_region | expand

Commit Message

Alyssa Rosenzweig Aug. 24, 2021, 5:30 p.m. UTC
When locking memory, the base address is rounded down to the nearest
page. The current code does not adjust the size in this case,
truncating the lock region:

	Input:      [----size----]
	Round: [----size----]

To fix the truncation, extend the lock region by the amount rounded off.

	Input:      [----size----]
	Round: [-------size------]

This bug is difficult to hit under current assumptions: since the size
of the lock region is stored as a ceil(log2), the truncation must cause
us to cross a power-of-two boundary. This is possible, for example if
the caller tries to lock 65535 bytes starting at iova 0xCAFE0010. The
existing code rounds down the iova to 0xCAFE0000 and rounds up the lock
region to 65536 bytes, locking until 0xCAFF0000. This fails to lock the
last 15 bytes.

In practice, the current callers pass PAGE_SIZE aligned inputs, avoiding
the bug. Therefore this doesn't need to be backported. Still, that's a
happy accident and not a precondition of lock_region, so we let's do the
right thing to future proof.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reported-by: Steven Price <steven.price@arm.com>
---
 drivers/gpu/drm/panfrost/panfrost_mmu.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Steven Price Aug. 25, 2021, 9:04 a.m. UTC | #1
On 24/08/2021 18:30, Alyssa Rosenzweig wrote:
> When locking memory, the base address is rounded down to the nearest
> page. The current code does not adjust the size in this case,
> truncating the lock region:
> 
> 	Input:      [----size----]
> 	Round: [----size----]
> 
> To fix the truncation, extend the lock region by the amount rounded off.
> 
> 	Input:      [----size----]
> 	Round: [-------size------]
> 
> This bug is difficult to hit under current assumptions: since the size
> of the lock region is stored as a ceil(log2), the truncation must cause
> us to cross a power-of-two boundary. This is possible, for example if
> the caller tries to lock 65535 bytes starting at iova 0xCAFE0010. The
> existing code rounds down the iova to 0xCAFE0000 and rounds up the lock
> region to 65536 bytes, locking until 0xCAFF0000. This fails to lock the
> last 15 bytes.
> 
> In practice, the current callers pass PAGE_SIZE aligned inputs, avoiding
> the bug. Therefore this doesn't need to be backported. Still, that's a
> happy accident and not a precondition of lock_region, so we let's do the
> right thing to future proof.

Actually it's worse than that due to the hardware behaviour, the spec
states (for LOCKADDR_BASE):

> Only the upper bits of the address are used. The address is aligned to a
> multiple of the region size, so a variable number of low-order bits are
> ignored, depending on the selected region size. It is recommended that software
> ensures that these low bits in the address are cleared, to avoid confusion.

It appears that indeed this has caused confusion ;)

So for a simple request like locking from 0xCAFE0000 - 0xCB010000 (size
= 0x30000) the region width gets rounded up (to 0x40000) which causes
the start address to be effectively rounded down (by the hardware) to
0xCAFC0000 and we fail to lock 0xCB000000-0xCB010000.

Interestingly (unless my reading of this is wrong) that means to lock
0xFFFF0000-0x100010000 (i.e. crossing the 4GB boundary) requires locking
*at least* 0x00000000-0x200000000 (i.e. locking the first 8GB).

This appears to be broken in kbase (which actually does zero out the low
bits of the address) - I've raised a bug internally so hopefully someone
will tell me if I've read the spec completely wrong here.

Steve

> Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
> Reported-by: Steven Price <steven.price@arm.com>
> ---
>  drivers/gpu/drm/panfrost/panfrost_mmu.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> index dfe5f1d29763..14be32497ec3 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> @@ -63,6 +63,9 @@ static void lock_region(struct panfrost_device *pfdev, u32 as_nr,
>  	u8 region_width;
>  	u64 region = iova & PAGE_MASK;
>  
> +	/* After rounding the address down, extend the size to lock the end. */
> +	size += (region - iova);
> +
>  	/* The size is encoded as ceil(log2) minus(1), which may be calculated
>  	 * with fls. The size must be clamped to hardware bounds.
>  	 */
>
Alyssa Rosenzweig Aug. 25, 2021, 2:07 p.m. UTC | #2
> > In practice, the current callers pass PAGE_SIZE aligned inputs, avoiding
> > the bug. Therefore this doesn't need to be backported. Still, that's a
> > happy accident and not a precondition of lock_region, so we let's do the
> > right thing to future proof.
> 
> Actually it's worse than that due to the hardware behaviour, the spec
> states (for LOCKADDR_BASE):
> 
> > Only the upper bits of the address are used. The address is aligned to a
> > multiple of the region size, so a variable number of low-order bits are
> > ignored, depending on the selected region size. It is recommended that software
> > ensures that these low bits in the address are cleared, to avoid confusion.
> 
> It appears that indeed this has caused confusion ;)
> 
> So for a simple request like locking from 0xCAFE0000 - 0xCB010000 (size
> = 0x30000) the region width gets rounded up (to 0x40000) which causes
> the start address to be effectively rounded down (by the hardware) to
> 0xCAFC0000 and we fail to lock 0xCB000000-0xCB010000.
> 
> Interestingly (unless my reading of this is wrong) that means to lock
> 0xFFFF0000-0x100010000 (i.e. crossing the 4GB boundary) requires locking
> *at least* 0x00000000-0x200000000 (i.e. locking the first 8GB).
> 
> This appears to be broken in kbase (which actually does zero out the low
> bits of the address) - I've raised a bug internally so hopefully someone
> will tell me if I've read the spec completely wrong here.

Horrifying, and not what I wanted to read my last day before 2 weeks of
leave. Let's drop this patch, hopefully by the time I'm back, your
friends in GPU can confirm that's a spec bug and not an actual
hardware/driver one...

Can you apply the other 3 patches in the mean time? Thanks :-)
Steven Price Aug. 25, 2021, 2:34 p.m. UTC | #3
On 25/08/2021 15:07, Alyssa Rosenzweig wrote:
>>> In practice, the current callers pass PAGE_SIZE aligned inputs, avoiding
>>> the bug. Therefore this doesn't need to be backported. Still, that's a
>>> happy accident and not a precondition of lock_region, so we let's do the
>>> right thing to future proof.
>>
>> Actually it's worse than that due to the hardware behaviour, the spec
>> states (for LOCKADDR_BASE):
>>
>>> Only the upper bits of the address are used. The address is aligned to a
>>> multiple of the region size, so a variable number of low-order bits are
>>> ignored, depending on the selected region size. It is recommended that software
>>> ensures that these low bits in the address are cleared, to avoid confusion.
>>
>> It appears that indeed this has caused confusion ;)
>>
>> So for a simple request like locking from 0xCAFE0000 - 0xCB010000 (size
>> = 0x30000) the region width gets rounded up (to 0x40000) which causes
>> the start address to be effectively rounded down (by the hardware) to
>> 0xCAFC0000 and we fail to lock 0xCB000000-0xCB010000.
>>
>> Interestingly (unless my reading of this is wrong) that means to lock
>> 0xFFFF0000-0x100010000 (i.e. crossing the 4GB boundary) requires locking
>> *at least* 0x00000000-0x200000000 (i.e. locking the first 8GB).
>>
>> This appears to be broken in kbase (which actually does zero out the low
>> bits of the address) - I've raised a bug internally so hopefully someone
>> will tell me if I've read the spec completely wrong here.
> 
> Horrifying, and not what I wanted to read my last day before 2 weeks of
> leave. Let's drop this patch, hopefully by the time I'm back, your
> friends in GPU can confirm that's a spec bug and not an actual
> hardware/driver one...
> 
> Can you apply the other 3 patches in the mean time? Thanks :-)
> 

Yeah, sure. I'll push the first 3 to drm-misc-next-fixes (should land in
v5.15).

It's interesting that if my (new) reading of the spec is correct then
kbase has been horribly broken in this respect forever. So clearly it
can't be something that crops up very often. It would have been good if
the spec could have included wording such as "naturally aligned" if
that's what was intended.

Enjoy your holiday!

Thanks,
Steve
Alyssa Rosenzweig Aug. 25, 2021, 3:05 p.m. UTC | #4
> > Horrifying, and not what I wanted to read my last day before 2 weeks of
> > leave. Let's drop this patch, hopefully by the time I'm back, your
> > friends in GPU can confirm that's a spec bug and not an actual
> > hardware/driver one...
> > 
> > Can you apply the other 3 patches in the mean time? Thanks :-)
> > 
> 
> Yeah, sure. I'll push the first 3 to drm-misc-next-fixes (should land in
> v5.15).
> 
> It's interesting that if my (new) reading of the spec is correct then
> kbase has been horribly broken in this respect forever. So clearly it
> can't be something that crops up very often. It would have been good if
> the spec could have included wording such as "naturally aligned" if
> that's what was intended.

Indeed. Fingers crossed this is a mix-up. Although the text you quoted
seems pretty clear unfortunately :|

> Enjoy your holiday!

Thanks!
diff mbox series

Patch

diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
index dfe5f1d29763..14be32497ec3 100644
--- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
@@ -63,6 +63,9 @@  static void lock_region(struct panfrost_device *pfdev, u32 as_nr,
 	u8 region_width;
 	u64 region = iova & PAGE_MASK;
 
+	/* After rounding the address down, extend the size to lock the end. */
+	size += (region - iova);
+
 	/* The size is encoded as ceil(log2) minus(1), which may be calculated
 	 * with fls. The size must be clamped to hardware bounds.
 	 */