diff mbox

[10/13] drm/i915: fix the CFB size check

Message ID 1446664257-32012-11-git-send-email-paulo.r.zanoni@intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

Zanoni, Paulo R Nov. 4, 2015, 7:10 p.m. UTC
In function find_compression_threshold() we try to over-allocate CFB
space in order to reudce reallocations and fragmentation, and we're
not considering that at the CFB size check. Consider it.

There is also a longer-term plan to kill
dev_priv->fbc.uncompressed_size, but this will come later.

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
 drivers/gpu/drm/i915/intel_fbc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Maarten Lankhorst Nov. 10, 2015, 10:22 a.m. UTC | #1
Op 04-11-15 om 20:10 schreef Paulo Zanoni:
> In function find_compression_threshold() we try to over-allocate CFB
> space in order to reudce reallocations and fragmentation, and we're
> not considering that at the CFB size check. Consider it.
>
> There is also a longer-term plan to kill
> dev_priv->fbc.uncompressed_size, but this will come later.
>
> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_fbc.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_fbc.c b/drivers/gpu/drm/i915/intel_fbc.c
> index dee99c9..e99aacc 100644
> --- a/drivers/gpu/drm/i915/intel_fbc.c
> +++ b/drivers/gpu/drm/i915/intel_fbc.c
> @@ -719,7 +719,8 @@ static int intel_fbc_setup_cfb(struct intel_crtc *crtc)
>  	size = intel_fbc_calculate_cfb_size(crtc);
>  	cpp = drm_format_plane_cpp(fb->pixel_format, 0);
>  
> -	if (size <= dev_priv->fbc.uncompressed_size)
> +	if (dev_priv->fbc.compressed_fb.allocated &&
> +	    size <= dev_priv->fbc.compressed_fb.size * dev_priv->fbc.threshold)
>  		return 0;
>  
>  	/* Release any current block */
Should i8xx_fbc_enable be changed too then?

Rest of the patches look ok, applied those.
Zanoni, Paulo R Nov. 10, 2015, 12:20 p.m. UTC | #2
Em Ter, 2015-11-10 às 11:22 +0100, Maarten Lankhorst escreveu:
> Op 04-11-15 om 20:10 schreef Paulo Zanoni:

> > In function find_compression_threshold() we try to over-allocate

> > CFB

> > space in order to reudce reallocations and fragmentation, and we're

> > not considering that at the CFB size check. Consider it.

> > 

> > There is also a longer-term plan to kill

> > dev_priv->fbc.uncompressed_size, but this will come later.

> > 

> > Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>

> > ---

> >  drivers/gpu/drm/i915/intel_fbc.c | 3 ++-

> >  1 file changed, 2 insertions(+), 1 deletion(-)

> > 

> > diff --git a/drivers/gpu/drm/i915/intel_fbc.c

> > b/drivers/gpu/drm/i915/intel_fbc.c

> > index dee99c9..e99aacc 100644

> > --- a/drivers/gpu/drm/i915/intel_fbc.c

> > +++ b/drivers/gpu/drm/i915/intel_fbc.c

> > @@ -719,7 +719,8 @@ static int intel_fbc_setup_cfb(struct

> > intel_crtc *crtc)

> >  	size = intel_fbc_calculate_cfb_size(crtc);

> >  	cpp = drm_format_plane_cpp(fb->pixel_format, 0);

> >  

> > -	if (size <= dev_priv->fbc.uncompressed_size)

> > +	if (dev_priv->fbc.compressed_fb.allocated &&

> > +	    size <= dev_priv->fbc.compressed_fb.size * dev_priv-

> > >fbc.threshold)

> >  		return 0;

> >  

> >  	/* Release any current block */

> Should i8xx_fbc_enable be changed too then?


As far as I understand, no. We're just reserving a bigger buffer in
case we need it later, but the size used by the hardware is still the
same. But I'm not 100% sure the i8xx code is actually correct since I
didn't dig deep into the ancient scrolls. By not touching i8xx we're
also avoiding a possible new regression.

> 

> Rest of the patches look ok, applied those.
Maarten Lankhorst Nov. 10, 2015, 1:04 p.m. UTC | #3
Op 10-11-15 om 13:20 schreef Zanoni, Paulo R:
> Em Ter, 2015-11-10 às 11:22 +0100, Maarten Lankhorst escreveu:
>> Op 04-11-15 om 20:10 schreef Paulo Zanoni:
>>> In function find_compression_threshold() we try to over-allocate
>>> CFB
>>> space in order to reudce reallocations and fragmentation, and we're
>>> not considering that at the CFB size check. Consider it.
>>>
>>> There is also a longer-term plan to kill
>>> dev_priv->fbc.uncompressed_size, but this will come later.
>>>
>>> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
>>> ---
>>>  drivers/gpu/drm/i915/intel_fbc.c | 3 ++-
>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/intel_fbc.c
>>> b/drivers/gpu/drm/i915/intel_fbc.c
>>> index dee99c9..e99aacc 100644
>>> --- a/drivers/gpu/drm/i915/intel_fbc.c
>>> +++ b/drivers/gpu/drm/i915/intel_fbc.c
>>> @@ -719,7 +719,8 @@ static int intel_fbc_setup_cfb(struct
>>> intel_crtc *crtc)
>>>  	size = intel_fbc_calculate_cfb_size(crtc);
>>>  	cpp = drm_format_plane_cpp(fb->pixel_format, 0);
>>>  
>>> -	if (size <= dev_priv->fbc.uncompressed_size)
>>> +	if (dev_priv->fbc.compressed_fb.allocated &&
>>> +	    size <= dev_priv->fbc.compressed_fb.size * dev_priv-
>>>> fbc.threshold)
>>>  		return 0;
>>>  
>>>  	/* Release any current block */
>> Should i8xx_fbc_enable be changed too then?
> As far as I understand, no. We're just reserving a bigger buffer in
> case we need it later, but the size used by the hardware is still the
> same. But I'm not 100% sure the i8xx code is actually correct since I
> didn't dig deep into the ancient scrolls. By not touching i8xx we're
> also avoiding a possible new regression.
>
The original check was for size <= uncompressed_size,
new is threshold * compressed.

I think i8xx_fbc_enable might have to do the same when calculating cfb_pitch
for this patch to work as intended.

~Maarten
Zanoni, Paulo R Nov. 11, 2015, 1:39 p.m. UTC | #4
Em Ter, 2015-11-10 às 14:04 +0100, Maarten Lankhorst escreveu:
> Op 10-11-15 om 13:20 schreef Zanoni, Paulo R:

> > Em Ter, 2015-11-10 às 11:22 +0100, Maarten Lankhorst escreveu:

> > > Op 04-11-15 om 20:10 schreef Paulo Zanoni:

> > > > In function find_compression_threshold() we try to over-

> > > > allocate

> > > > CFB

> > > > space in order to reudce reallocations and fragmentation, and

> > > > we're

> > > > not considering that at the CFB size check. Consider it.

> > > > 

> > > > There is also a longer-term plan to kill

> > > > dev_priv->fbc.uncompressed_size, but this will come later.

> > > > 

> > > > Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>

> > > > ---

> > > >  drivers/gpu/drm/i915/intel_fbc.c | 3 ++-

> > > >  1 file changed, 2 insertions(+), 1 deletion(-)

> > > > 

> > > > diff --git a/drivers/gpu/drm/i915/intel_fbc.c

> > > > b/drivers/gpu/drm/i915/intel_fbc.c

> > > > index dee99c9..e99aacc 100644

> > > > --- a/drivers/gpu/drm/i915/intel_fbc.c

> > > > +++ b/drivers/gpu/drm/i915/intel_fbc.c

> > > > @@ -719,7 +719,8 @@ static int intel_fbc_setup_cfb(struct

> > > > intel_crtc *crtc)

> > > >  	size = intel_fbc_calculate_cfb_size(crtc);

> > > >  	cpp = drm_format_plane_cpp(fb->pixel_format, 0);

> > > >  

> > > > -	if (size <= dev_priv->fbc.uncompressed_size)

> > > > +	if (dev_priv->fbc.compressed_fb.allocated &&

> > > > +	    size <= dev_priv->fbc.compressed_fb.size *

> > > > dev_priv-

> > > > > fbc.threshold)

> > > >  		return 0;

> > > >  

> > > >  	/* Release any current block */

> > > Should i8xx_fbc_enable be changed too then?

> > As far as I understand, no. We're just reserving a bigger buffer in

> > case we need it later, but the size used by the hardware is still

> > the

> > same. But I'm not 100% sure the i8xx code is actually correct since

> > I

> > didn't dig deep into the ancient scrolls. By not touching i8xx

> > we're

> > also avoiding a possible new regression.

> > 

> The original check was for size <= uncompressed_size,

> new is threshold * compressed.


But threshold is always 1 for i8xx, so the difference of
uncompressed_size vs compressed_fb.size is really just the
overallocation, and the hardware doesn't even need to know we're
overallocating.

> 

> I think i8xx_fbc_enable might have to do the same when calculating

> cfb_pitch

> for this patch to work as intended.


Why?

Please notice I'm trying to keep the i8xx behavior the same instead of
risking introducing a new bug I can't test.

> 

> ~Maarten
Maarten Lankhorst Nov. 11, 2015, 2:25 p.m. UTC | #5
Op 11-11-15 om 14:39 schreef Zanoni, Paulo R:
> Em Ter, 2015-11-10 às 14:04 +0100, Maarten Lankhorst escreveu:
>> Op 10-11-15 om 13:20 schreef Zanoni, Paulo R:
>>> Em Ter, 2015-11-10 às 11:22 +0100, Maarten Lankhorst escreveu:
>>>> Op 04-11-15 om 20:10 schreef Paulo Zanoni:
>>>>> In function find_compression_threshold() we try to over-
>>>>> allocate
>>>>> CFB
>>>>> space in order to reudce reallocations and fragmentation, and
>>>>> we're
>>>>> not considering that at the CFB size check. Consider it.
>>>>>
>>>>> There is also a longer-term plan to kill
>>>>> dev_priv->fbc.uncompressed_size, but this will come later.
>>>>>
>>>>> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
>>>>> ---
>>>>>  drivers/gpu/drm/i915/intel_fbc.c | 3 ++-
>>>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/i915/intel_fbc.c
>>>>> b/drivers/gpu/drm/i915/intel_fbc.c
>>>>> index dee99c9..e99aacc 100644
>>>>> --- a/drivers/gpu/drm/i915/intel_fbc.c
>>>>> +++ b/drivers/gpu/drm/i915/intel_fbc.c
>>>>> @@ -719,7 +719,8 @@ static int intel_fbc_setup_cfb(struct
>>>>> intel_crtc *crtc)
>>>>>  	size = intel_fbc_calculate_cfb_size(crtc);
>>>>>  	cpp = drm_format_plane_cpp(fb->pixel_format, 0);
>>>>>  
>>>>> -	if (size <= dev_priv->fbc.uncompressed_size)
>>>>> +	if (dev_priv->fbc.compressed_fb.allocated &&
>>>>> +	    size <= dev_priv->fbc.compressed_fb.size *
>>>>> dev_priv-
>>>>>> fbc.threshold)
>>>>>  		return 0;
>>>>>  
>>>>>  	/* Release any current block */
>>>> Should i8xx_fbc_enable be changed too then?
>>> As far as I understand, no. We're just reserving a bigger buffer in
>>> case we need it later, but the size used by the hardware is still
>>> the
>>> same. But I'm not 100% sure the i8xx code is actually correct since
>>> I
>>> didn't dig deep into the ancient scrolls. By not touching i8xx
>>> we're
>>> also avoiding a possible new regression.
>>>
>> The original check was for size <= uncompressed_size,
>> new is threshold * compressed.
> But threshold is always 1 for i8xx, so the difference of
> uncompressed_size vs compressed_fb.size is really just the
> overallocation, and the hardware doesn't even need to know we're
> overallocating.
Ah ok, wasn't aware of threshold being 1. :) In that case I guess nothing changes so patch should be safe to apply.

~Maarten
Ville Syrjala Nov. 11, 2015, 2:27 p.m. UTC | #6
On Tue, Nov 10, 2015 at 02:04:48PM +0100, Maarten Lankhorst wrote:
> Op 10-11-15 om 13:20 schreef Zanoni, Paulo R:
> > Em Ter, 2015-11-10 às 11:22 +0100, Maarten Lankhorst escreveu:
> >> Op 04-11-15 om 20:10 schreef Paulo Zanoni:
> >>> In function find_compression_threshold() we try to over-allocate
> >>> CFB
> >>> space in order to reudce reallocations and fragmentation, and we're
> >>> not considering that at the CFB size check. Consider it.
> >>>
> >>> There is also a longer-term plan to kill
> >>> dev_priv->fbc.uncompressed_size, but this will come later.
> >>>
> >>> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
> >>> ---
> >>>  drivers/gpu/drm/i915/intel_fbc.c | 3 ++-
> >>>  1 file changed, 2 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/i915/intel_fbc.c
> >>> b/drivers/gpu/drm/i915/intel_fbc.c
> >>> index dee99c9..e99aacc 100644
> >>> --- a/drivers/gpu/drm/i915/intel_fbc.c
> >>> +++ b/drivers/gpu/drm/i915/intel_fbc.c
> >>> @@ -719,7 +719,8 @@ static int intel_fbc_setup_cfb(struct
> >>> intel_crtc *crtc)
> >>>  	size = intel_fbc_calculate_cfb_size(crtc);
> >>>  	cpp = drm_format_plane_cpp(fb->pixel_format, 0);
> >>>  
> >>> -	if (size <= dev_priv->fbc.uncompressed_size)
> >>> +	if (dev_priv->fbc.compressed_fb.allocated &&
> >>> +	    size <= dev_priv->fbc.compressed_fb.size * dev_priv-
> >>>> fbc.threshold)
> >>>  		return 0;
> >>>  
> >>>  	/* Release any current block */
> >> Should i8xx_fbc_enable be changed too then?
> > As far as I understand, no. We're just reserving a bigger buffer in
> > case we need it later, but the size used by the hardware is still the
> > same. But I'm not 100% sure the i8xx code is actually correct since I
> > didn't dig deep into the ancient scrolls. By not touching i8xx we're
> > also avoiding a possible new regression.
> >
> The original check was for size <= uncompressed_size,
> new is threshold * compressed.
> 
> I think i8xx_fbc_enable might have to do the same when calculating cfb_pitch
> for this patch to work as intended.

I think the FBC1 code is fairly confused already.

It does the right thing by capping CFB_PITCH to the fb pitch. The way
the hardware works is that any line that compresses to >CFB_PITCH is
discarded and marked as uncompressed. So we definitely don't want to 
allow CFB_PITCH to exceed the original pitch as that would just increase
the amount of data getting scanned out. The nasty thing is that it'll
try to recompress any line tagged as uncompressed every time it tries
to recompress anything, so we should be rather careful not to set
CFB_PITCH too low for FBC1, and if we have poorly compressing data we
might just want to disable FBC entirely to avoid trying to recompress
everything all the time.

What doesn't really make sense to me is the
'cfb_pitch = dev_priv->fbc.uncompressed_size / FBC_LL_SIZE' part. 
That basically assumes we always have a maximum height plane (1536
lines) in use. I'm not entirely sure if the hardware fully skips the
unused lines, or if we would have to allocate some pixel run sets
(swords) even for the unused lines (maybe 1 per line). But I wouldn't
think that we would need to allocate them to cover the entire worst
case length.

What makes things more confusing I think is the naming of the variables
and functions. uncompressed_size is actually the size we've allocated
for the cfb, and intel_fbc_calculate_cfb_size() sort of returns the
uncompressed fb size. Well since it has the 2k line limit applied, I
suppose you can actually think of it as the max cfb size we would want
to have for the particular plane configuration.
Zanoni, Paulo R Nov. 11, 2015, 3:36 p.m. UTC | #7
Em Qua, 2015-11-11 às 16:27 +0200, Ville Syrjälä escreveu:
> On Tue, Nov 10, 2015 at 02:04:48PM +0100, Maarten Lankhorst wrote:

> > Op 10-11-15 om 13:20 schreef Zanoni, Paulo R:

> > > Em Ter, 2015-11-10 às 11:22 +0100, Maarten Lankhorst escreveu:

> > > > Op 04-11-15 om 20:10 schreef Paulo Zanoni:

> > > > > In function find_compression_threshold() we try to over-

> > > > > allocate

> > > > > CFB

> > > > > space in order to reudce reallocations and fragmentation, and

> > > > > we're

> > > > > not considering that at the CFB size check. Consider it.

> > > > > 

> > > > > There is also a longer-term plan to kill

> > > > > dev_priv->fbc.uncompressed_size, but this will come later.

> > > > > 

> > > > > Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>

> > > > > ---

> > > > >  drivers/gpu/drm/i915/intel_fbc.c | 3 ++-

> > > > >  1 file changed, 2 insertions(+), 1 deletion(-)

> > > > > 

> > > > > diff --git a/drivers/gpu/drm/i915/intel_fbc.c

> > > > > b/drivers/gpu/drm/i915/intel_fbc.c

> > > > > index dee99c9..e99aacc 100644

> > > > > --- a/drivers/gpu/drm/i915/intel_fbc.c

> > > > > +++ b/drivers/gpu/drm/i915/intel_fbc.c

> > > > > @@ -719,7 +719,8 @@ static int intel_fbc_setup_cfb(struct

> > > > > intel_crtc *crtc)

> > > > >  	size = intel_fbc_calculate_cfb_size(crtc);

> > > > >  	cpp = drm_format_plane_cpp(fb->pixel_format, 0);

> > > > >  

> > > > > -	if (size <= dev_priv->fbc.uncompressed_size)

> > > > > +	if (dev_priv->fbc.compressed_fb.allocated &&

> > > > > +	    size <= dev_priv->fbc.compressed_fb.size *

> > > > > dev_priv-

> > > > > > fbc.threshold)

> > > > >  		return 0;

> > > > >  

> > > > >  	/* Release any current block */

> > > > Should i8xx_fbc_enable be changed too then?

> > > As far as I understand, no. We're just reserving a bigger buffer

> > > in

> > > case we need it later, but the size used by the hardware is still

> > > the

> > > same. But I'm not 100% sure the i8xx code is actually correct

> > > since I

> > > didn't dig deep into the ancient scrolls. By not touching i8xx

> > > we're

> > > also avoiding a possible new regression.

> > > 

> > The original check was for size <= uncompressed_size,

> > new is threshold * compressed.

> > 

> > I think i8xx_fbc_enable might have to do the same when calculating

> > cfb_pitch

> > for this patch to work as intended.

> 

> I think the FBC1 code is fairly confused already.

> 

> It does the right thing by capping CFB_PITCH to the fb pitch. The way

> the hardware works is that any line that compresses to >CFB_PITCH is

> discarded and marked as uncompressed. So we definitely don't want to 

> allow CFB_PITCH to exceed the original pitch as that would just

> increase

> the amount of data getting scanned out. The nasty thing is that it'll

> try to recompress any line tagged as uncompressed every time it tries

> to recompress anything, so we should be rather careful not to set

> CFB_PITCH too low for FBC1, and if we have poorly compressing data we

> might just want to disable FBC entirely to avoid trying to recompress

> everything all the time.

> 

> What doesn't really make sense to me is the

> 'cfb_pitch = dev_priv->fbc.uncompressed_size / FBC_LL_SIZE' part. 

> That basically assumes we always have a maximum height plane (1536

> lines) in use. I'm not entirely sure if the hardware fully skips the

> unused lines, or if we would have to allocate some pixel run sets

> (swords) even for the unused lines (maybe 1 per line). But I wouldn't

> think that we would need to allocate them to cover the entire worst

> case length.


I really didn't stop to audit the i8xx code, and I was not planning to.
I'm just trying to not add any new regressions. Maybe we should open a
bug report or Jira task somewhere for your observation.



> What makes things more confusing I think is the naming of the

> variables

> and functions. uncompressed_size is actually the size we've allocated

> for the cfb


Except for the cases where threshold != 1, and for the cases where we
over-allocate the CFB.


> , and intel_fbc_calculate_cfb_size() sort of returns the

> uncompressed fb size


Yes, it's redundant. See below.


> . Well since it has the 2k line limit applied, I

> suppose you can actually think of it as the max cfb size we would

> want

> to have for the particular plane configuration.


Depends on how you see it. It's the maximum CFB size since we can
increase the threshold. It's also the minimum viable CFB size for
threshold=1. Since it's the base number used for the hardware
calculations, I decided to call it just "the cfb size". But I'm always
open to suggestions for better names.

By the way, I have a patch that completely kills uncompressed_size and
makes the i8xx code just call intel_fbc_calculate_cfb_size() directly.
IMHO, having less variables to track makes things simpler. The patch
will be on the list soon.

>
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/intel_fbc.c b/drivers/gpu/drm/i915/intel_fbc.c
index dee99c9..e99aacc 100644
--- a/drivers/gpu/drm/i915/intel_fbc.c
+++ b/drivers/gpu/drm/i915/intel_fbc.c
@@ -719,7 +719,8 @@  static int intel_fbc_setup_cfb(struct intel_crtc *crtc)
 	size = intel_fbc_calculate_cfb_size(crtc);
 	cpp = drm_format_plane_cpp(fb->pixel_format, 0);
 
-	if (size <= dev_priv->fbc.uncompressed_size)
+	if (dev_priv->fbc.compressed_fb.allocated &&
+	    size <= dev_priv->fbc.compressed_fb.size * dev_priv->fbc.threshold)
 		return 0;
 
 	/* Release any current block */