[2/4] drm/xe: store bind time pat index to xe_bo

Message ID: 20240118152745.162960-3-juhapekka.heikkila@gmail.com
State: New, archived
Return-Path: <intel-gfx-bounces@lists.freedesktop.org>
From: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
To: intel-xe@lists.freedesktop.org, intel-gfx@lists.freedesktop.org
Subject: [PATCH 2/4] drm/xe: store bind time pat index to xe_bo
Date: Thu, 18 Jan 2024 17:27:43 +0200
Message-Id: <20240118152745.162960-3-juhapekka.heikkila@gmail.com>
In-Reply-To: <20240118152745.162960-1-juhapekka.heikkila@gmail.com>
References: <20240118152745.162960-1-juhapekka.heikkila@gmail.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Precedence: list
Errors-To: intel-gfx-bounces@lists.freedesktop.org
Sender: "Intel-gfx" <intel-gfx-bounces@lists.freedesktop.org>
Series: Enable ccs compressed framebuffers on Xe2
	[0/4] Enable ccs compressed framebuffers on Xe2
	[1/4] drm/xe: add bind time pat index to xe_bo structure
	[2/4] drm/xe: store bind time pat index to xe_bo
	[3/4] drm/xe/xe2: Limit ccs framebuffers to tile4 only
	[4/4] drm/i915/display: On Xe2 always enable decompression with tile4

Comments

Matthew Auld Jan. 19, 2024, 3:45 p.m. UTC | #1

On 18/01/2024 15:27, Juha-Pekka Heikkila wrote:
> Store pat index from xe_vma to xe_bo
> 
> Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
> ---
>   drivers/gpu/drm/xe/xe_pt.c | 4 ++++
>   1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
> index de1030a47588..4b76db698878 100644
> --- a/drivers/gpu/drm/xe/xe_pt.c
> +++ b/drivers/gpu/drm/xe/xe_pt.c
> @@ -1252,6 +1252,10 @@ __xe_pt_bind_vma(struct xe_tile *tile, struct xe_vma *vma, struct xe_exec_queue
>   		return ERR_PTR(-ENOMEM);
>   	}
>   
> +	if (xe_vma_bo(vma)) {
> +		xe_vma_bo(vma)->pat_index = vma->pat_index;

Multiple mappings will trash this I think. Is that OK for your usecase? 
It can be useful to map the same resource as compressed and uncompressed 
to facilitate in-place decompression/compression.

Also would be good to be clear about what happens if the KMD doesn't do 
anything to prevent compression with non-tile4? Is it just a bit of 
display corruption or something much worse that we need to prevent? Is 
this just a best effort check to help userspace? Otherwise it is hard to 
evaluate how solid we need to be here in our checking to prevent this 
scenario. For example how is binding vs display races handled? What 
happens if the bind appears after the display check?

> +	}
> +
>   	fence = xe_migrate_update_pgtables(tile->migrate,
>   					   vm, xe_vma_bo(vma), q,
>   					   entries, num_entries,
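
A minimal sketch of the multiple-mappings concern raised above, assuming toy stand-ins for struct xe_bo and struct xe_vma. The toy_* names and the pat_index values used with them are hypothetical and only model the four added lines of the hunk; they are not the driver's real types or code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins for struct xe_bo and struct xe_vma; not the real layouts. */
struct toy_bo {
	uint16_t pat_index;	/* single slot shared by every mapping of the BO */
};

struct toy_vma {
	struct toy_bo *bo;	/* NULL for a mapping without a backing BO */
	uint16_t pat_index;	/* per-binding value supplied at bind time */
};

/* Models the hunk under review: copy the VMA's pat_index into the BO. */
static void toy_bind_vma(struct toy_vma *vma)
{
	assert(vma != NULL);
	if (vma->bo)
		vma->bo->pat_index = vma->pat_index;
}
```

Binding two toy_vma objects that share one toy_bo but carry different pat_index values leaves only the second value in the BO, so a later check that reads bo->pat_index sees whichever bind happened last.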

Juha-Pekka Heikkila Jan. 22, 2024, 6:26 p.m. UTC | #2

Hi Matthew, thanks for looking into these. A few thoughts below.

On 19.1.2024 17.45, Matthew Auld wrote:
> On 18/01/2024 15:27, Juha-Pekka Heikkila wrote:
>> Store pat index from xe_vma to xe_bo
>>
>> Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
>> ---
>>   drivers/gpu/drm/xe/xe_pt.c | 4 ++++
>>   1 file changed, 4 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
>> index de1030a47588..4b76db698878 100644
>> --- a/drivers/gpu/drm/xe/xe_pt.c
>> +++ b/drivers/gpu/drm/xe/xe_pt.c
>> @@ -1252,6 +1252,10 @@ __xe_pt_bind_vma(struct xe_tile *tile, struct 
>> xe_vma *vma, struct xe_exec_queue
>>           return ERR_PTR(-ENOMEM);
>>       }
>> +    if (xe_vma_bo(vma)) {
>> +        xe_vma_bo(vma)->pat_index = vma->pat_index;
> 
> Multiple mappings will trash this I think. Is that OK for your usecase? 
> It can be useful to map the same resource as compressed and uncompressed 
> to facilitate in-place decompression/compression.

On i915 I think we mapped framebuffers only once and kept that mapping 
until the fb was destroyed. Is XE_BO_SCANOUT_BIT for buffers that are 
meant to be framebuffers? If so, I could make it so that the pat index 
given first is not allowed to change for buffers with this bit set.

> 
> Also would be good to be clear about what happens if the KMD doesn't do 
> anything to prevent compression with non-tile4? Is it just a bit of 
> display corruption or something much worse that we need to prevent? Is 
> this just a best effort check to help userspace? Otherwise it is hard to 
> evaluate how solid we need to be here in our checking to prevent this 
> scenario. For example how is binding vs display races handled? What 
> happens if the bind appears after the display check?

As for what happens when incorrect buffers go to display: I've seen them 
come out corrupted on screen, but my testing is very minimal. Bspec 67158 
just says linear and tile X formats are not supported with decompression 
on display, so it is a broken config. I couldn't say in general how 
robust the display hw is against broken configs. I remember Ville found 
that on TGL broken configs caused unrecoverable issues, after which ccs 
was blocked on some steppings because that was the only way Ville found 
to block the broken config. I'll add Ville on cc in case he has views on 
what's needed here for Xe2.

/Juha-Pekka

> 
>> +    }
>> +
>>       fence = xe_migrate_update_pgtables(tile->migrate,
>>                          vm, xe_vma_bo(vma), q,
>>                          entries, num_entries,
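
One possible shape for the "first pat index wins for scanout buffers" idea from this reply, as a hedged sketch only. TOY_BO_SCANOUT stands in for XE_BO_SCANOUT_BIT, and the toy_bo type, the pat_sealed field, the helper name and the -EINVAL return are all assumptions made for illustration, not existing driver code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_BO_SCANOUT (1u << 0)	/* stand-in for XE_BO_SCANOUT_BIT */

/* Toy stand-in for struct xe_bo; pat_sealed is a hypothetical field. */
struct toy_bo {
	unsigned int flags;
	uint16_t pat_index;
	bool pat_sealed;
};

/*
 * Record the pat_index of a bind. For scanout BOs the first value is
 * sealed and any later bind with a different value is rejected.
 */
static int toy_bo_set_pat_index(struct toy_bo *bo, uint16_t pat_index)
{
	assert(bo != NULL);

	if (!(bo->flags & TOY_BO_SCANOUT)) {
		bo->pat_index = pat_index;	/* non-scanout: last bind wins */
		return 0;
	}

	if (bo->pat_sealed)
		return bo->pat_index == pat_index ? 0 : -EINVAL;

	bo->pat_index = pat_index;
	bo->pat_sealed = true;
	return 0;
}
```

In this model a scanout BO accepts repeated binds with the sealed value, while non-scanout BOs keep the behaviour of the patch as posted.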

Ville Syrjala Jan. 23, 2024, 8:05 a.m. UTC | #3

On Fri, Jan 19, 2024 at 03:45:22PM +0000, Matthew Auld wrote:
> On 18/01/2024 15:27, Juha-Pekka Heikkila wrote:
> > Store pat index from xe_vma to xe_bo
> > 
> > Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
> > ---
> >   drivers/gpu/drm/xe/xe_pt.c | 4 ++++
> >   1 file changed, 4 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
> > index de1030a47588..4b76db698878 100644
> > --- a/drivers/gpu/drm/xe/xe_pt.c
> > +++ b/drivers/gpu/drm/xe/xe_pt.c
> > @@ -1252,6 +1252,10 @@ __xe_pt_bind_vma(struct xe_tile *tile, struct xe_vma *vma, struct xe_exec_queue
> >   		return ERR_PTR(-ENOMEM);
> >   	}
> >   
> > +	if (xe_vma_bo(vma)) {
> > +		xe_vma_bo(vma)->pat_index = vma->pat_index;
> 
> Multiple mappings will trash this I think. Is that OK for your usecase? 
> It can be useful to map the same resource as compressed and uncompressed 
> to facilitate in-place decompression/compression.

I thought the pat_index is set for the entire bo? The
cache_level->pat_index stuff doesn't really work otherwise
I don't think (assuming it works at all).

So dunno why this is doing anything using vmas. I think
what we probably need is to check/set the bo pat_index
at fb create time, and lock it into place (if there's
some mechanism by which a random userspace client could
change it after the fact, and thus screw up everything).

> 
> Also would be good to be clear about what happens if the KMD doesn't do 
> anything to prevent compression with non-tile4? Is it just a bit of 
> display corruption or something much worse that we need to prevent? Is 
> this just a best effort check to help userspace? Otherwise it is hard to 
> evaluate how solid we need to be here in our checking to prevent this 
> scenario. For example how is binding vs display races handled? What 
> happens if the bind appears after the display check?
> 
> > +	}
> > +
> >   	fence = xe_migrate_update_pgtables(tile->migrate,
> >   					   vm, xe_vma_bo(vma), q,
> >   					   entries, num_entries,
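
A rough sketch of the "check at fb create time and lock it into place" idea from this reply, in a single-threaded toy model. TOY_PAT_COMPRESSED, the toy_* types and functions, the pat_locked field and the error codes are all hypothetical; the tile4-only rule is taken from the title of patch 3/4 in this series. A real implementation would also need the bind path and fb creation to be serialized by a common lock, which this model does not show.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAT_COMPRESSED 9	/* hypothetical pat_index selecting compression */

enum toy_tiling { TOY_TILING_LINEAR, TOY_TILING_X, TOY_TILING_4 };

/* Toy stand-in for struct xe_bo; pat_locked is a hypothetical field. */
struct toy_bo {
	uint16_t pat_index;
	bool pat_locked;
};

/* fb creation: refuse compressed non-tile4 buffers, then lock pat_index. */
static int toy_fb_create(struct toy_bo *bo, enum toy_tiling tiling)
{
	assert(bo != NULL);
	if (bo->pat_index == TOY_PAT_COMPRESSED && tiling != TOY_TILING_4)
		return -EINVAL;
	bo->pat_locked = true;
	return 0;
}

/* bind: once locked, only a bind with the same pat_index is accepted. */
static int toy_bo_bind(struct toy_bo *bo, uint16_t pat_index)
{
	assert(bo != NULL);
	if (bo->pat_locked && pat_index != bo->pat_index)
		return -EBUSY;
	bo->pat_index = pat_index;
	return 0;
}
```

With the lock taken at fb creation, a bind that arrives after the display check can no longer change the value the check was based on, which is the case the earlier question about bind versus display ordering asks about.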

Matthew Auld Jan. 23, 2024, 9:17 a.m. UTC | #4

On 23/01/2024 08:05, Ville Syrjälä wrote:
> On Fri, Jan 19, 2024 at 03:45:22PM +0000, Matthew Auld wrote:
>> On 18/01/2024 15:27, Juha-Pekka Heikkila wrote:
>>> Store pat index from xe_vma to xe_bo
>>>
>>> Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
>>> ---
>>>    drivers/gpu/drm/xe/xe_pt.c | 4 ++++
>>>    1 file changed, 4 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
>>> index de1030a47588..4b76db698878 100644
>>> --- a/drivers/gpu/drm/xe/xe_pt.c
>>> +++ b/drivers/gpu/drm/xe/xe_pt.c
>>> @@ -1252,6 +1252,10 @@ __xe_pt_bind_vma(struct xe_tile *tile, struct xe_vma *vma, struct xe_exec_queue
>>>    		return ERR_PTR(-ENOMEM);
>>>    	}
>>>    
>>> +	if (xe_vma_bo(vma)) {
>>> +		xe_vma_bo(vma)->pat_index = vma->pat_index;
>>
>> Multiple mappings will trash this I think. Is that OK for your usecase?
>> It can be useful to map the same resource as compressed and uncompressed
>> to facilitate in-place decompression/compression.
> 
> I thought the pat_index is set for the entire bo? The
> cache_level->pat_index stuff doesn't really work otherwise
> I don't think (assuming it works at all).

AFAIK it is mostly like that in i915 because it doesn't have a vm_bind 
interface. With Xe we have vm_bind. The pat_index is a property of the 
ppGTT binding and therefore vma. There seem to be legitimate reasons to 
map the same resource with different pat_index, like with 
compressed/uncompressed. See BSpec: 58797 "double map (alias) surfaces".

> 
> So dunno why this is doing anything using vmas. I think
> what we probably need is to check/set the bo pat_index
> at fb create time, and lock it into place (if there's
> some mechanism by which a random userspace client could
> change it after the fact, and thus screw up everything).

Maybe we can seal the pat_index on first bind or something if the BO 
underneath is marked with XE_BO_SCANOUT?

> 
>>
>> Also would be good to be clear about what happens if the KMD doesn't do
>> anything to prevent compression with non-tile4? Is it just a bit of
>> display corruption or something much worse that we need to prevent? Is
>> this just a best effort check to help userspace? Otherwise it is hard to
>> evaluate how solid we need to be here in our checking to prevent this
>> scenario. For example how is binding vs display races handled? What
>> happens if the bind appears after the display check?
>>
>>> +	}
>>> +
>>>    	fence = xe_migrate_update_pgtables(tile->migrate,
>>>    					   vm, xe_vma_bo(vma), q,
>>>    					   entries, num_entries,
>

Matthew Auld Jan. 23, 2024, 12:59 p.m. UTC | #5

On 22/01/2024 18:26, Juha-Pekka Heikkila wrote:
> Hi Matthew, thanks for looking into these. A few thoughts below.
> 
> On 19.1.2024 17.45, Matthew Auld wrote:
>> On 18/01/2024 15:27, Juha-Pekka Heikkila wrote:
>>> Store pat index from xe_vma to xe_bo
>>>
>>> Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
>>> ---
>>>   drivers/gpu/drm/xe/xe_pt.c | 4 ++++
>>>   1 file changed, 4 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
>>> index de1030a47588..4b76db698878 100644
>>> --- a/drivers/gpu/drm/xe/xe_pt.c
>>> +++ b/drivers/gpu/drm/xe/xe_pt.c
>>> @@ -1252,6 +1252,10 @@ __xe_pt_bind_vma(struct xe_tile *tile, struct 
>>> xe_vma *vma, struct xe_exec_queue
>>>           return ERR_PTR(-ENOMEM);
>>>       }
>>> +    if (xe_vma_bo(vma)) {
>>> +        xe_vma_bo(vma)->pat_index = vma->pat_index;
>>
>> Multiple mappings will trash this I think. Is that OK for your 
>> usecase? It can be useful to map the same resource as compressed and 
>> uncompressed to facilitate in-place decompression/compression.
> 
> On i915 I think we mapped framebuffers only once and kept that mapping 
> until the fb was destroyed. Is XE_BO_SCANOUT_BIT for buffers that are 
> meant to be framebuffers? If so, I could make it so that the pat index 
> given first is not allowed to change for buffers with this bit set.

Yeah, sealing the pat_index for such objects might be the simplest option.

> 
>>
>> Also would be good to be clear about what happens if the KMD doesn't 
>> do anything to prevent compression with non-tile4? Is it just a bit of 
>> display corruption or something much worse that we need to prevent? Is 
>> this just a best effort check to help userspace? Otherwise it is hard 
>> to evaluate how solid we need to be here in our checking to prevent 
>> this scenario. For example how is binding vs display races handled? 
>> What happens if the bind appears after the display check?
> 
> As for what happens when incorrect buffers go to display: I've seen them 
> come out corrupted on screen, but my testing is very minimal. Bspec 67158 
> just says linear and tile X formats are not supported with decompression 
> on display, so it is a broken config. I couldn't say in general how 
> robust the display hw is against broken configs. I remember Ville found 
> that on TGL broken configs caused unrecoverable issues, after which ccs 
> was blocked on some steppings because that was the only way Ville found 
> to block the broken config. I'll add Ville on cc in case he has views on 
> what's needed here for Xe2.
> 
> /Juha-Pekka
> 
>>
>>> +    }
>>> +
>>>       fence = xe_migrate_update_pgtables(tile->migrate,
>>>                          vm, xe_vma_bo(vma), q,
>>>                          entries, num_entries,
>

diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
index de1030a47588..4b76db698878 100644
--- a/drivers/gpu/drm/xe/xe_pt.c
+++ b/drivers/gpu/drm/xe/xe_pt.c
@@ -1252,6 +1252,10 @@  __xe_pt_bind_vma(struct xe_tile *tile, struct xe_vma *vma, struct xe_exec_queue
 		return ERR_PTR(-ENOMEM);
 	}
 
+	if (xe_vma_bo(vma)) {
+		xe_vma_bo(vma)->pat_index = vma->pat_index;
+	}
+
 	fence = xe_migrate_update_pgtables(tile->migrate,
 					   vm, xe_vma_bo(vma), q,
 					   entries, num_entries,
