diff mbox series

drm/buddy: Fix alloc_range() error handling code

Message ID 20240209152624.1970-1-Arunpravin.PaneerSelvam@amd.com (mailing list archive)
State New, archived
Headers show
Series drm/buddy: Fix alloc_range() error handling code | expand

Commit Message

Paneer Selvam, Arunpravin Feb. 9, 2024, 3:26 p.m. UTC
Few users have observed display corruption when they boot
the machine to KDE Plasma or playing games. We have root
caused the problem that whenever alloc_range() couldn't
find the required memory blocks the function was returning
SUCCESS in some of the corner cases.

The right approach would be if the total allocated size
is less than the required size, the function should
return -ENOSPC.

Cc:  <stable@vger.kernel.org> # 6.7+
Fixes: 0a1844bf0b53 ("drm/buddy: Improve contiguous memory allocation")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3097
Tested-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://patchwork.kernel.org/project/dri-devel/patch/20240207174456.341121-1-Arunpravin.PaneerSelvam@amd.com/
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
---
 drivers/gpu/drm/drm_buddy.c | 6 ++++++
 1 file changed, 6 insertions(+)

Comments

Daniel Vetter Feb. 9, 2024, 6:04 p.m. UTC | #1
On Fri, Feb 09, 2024 at 08:56:24PM +0530, Arunpravin Paneer Selvam wrote:
> Few users have observed display corruption when they boot
> the machine to KDE Plasma or playing games. We have root
> caused the problem that whenever alloc_range() couldn't
> find the required memory blocks the function was returning
> SUCCESS in some of the corner cases.
> 
> The right approach would be if the total allocated size
> is less than the required size, the function should
> return -ENOSPC.
> 
> Cc:  <stable@vger.kernel.org> # 6.7+
> Fixes: 0a1844bf0b53 ("drm/buddy: Improve contiguous memory allocation")
> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3097
> Tested-by: Mario Limonciello <mario.limonciello@amd.com>
> Link: https://patchwork.kernel.org/project/dri-devel/patch/20240207174456.341121-1-Arunpravin.PaneerSelvam@amd.com/
> Acked-by: Christian König <christian.koenig@amd.com>
> Reviewed-by: Matthew Auld <matthew.auld@intel.com>
> Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>

New unit test for this would be most excellent - these kind of missed edge
cases is exactly what kunit is for. Can you please follow up with, since
we don't want to hold up the bugfix for longer?
-Sima

> ---
>  drivers/gpu/drm/drm_buddy.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
> index f57e6d74fb0e..c1a99bf4dffd 100644
> --- a/drivers/gpu/drm/drm_buddy.c
> +++ b/drivers/gpu/drm/drm_buddy.c
> @@ -539,6 +539,12 @@ static int __alloc_range(struct drm_buddy *mm,
>  	} while (1);
>  
>  	list_splice_tail(&allocated, blocks);
> +
> +	if (total_allocated < size) {
> +		err = -ENOSPC;
> +		goto err_free;
> +	}
> +
>  	return 0;
>  
>  err_undo:
> -- 
> 2.25.1
>
Paneer Selvam, Arunpravin Feb. 9, 2024, 6:36 p.m. UTC | #2
Hi Daniel,

On 2/9/2024 11:34 PM, Daniel Vetter wrote:
> On Fri, Feb 09, 2024 at 08:56:24PM +0530, Arunpravin Paneer Selvam wrote:
>> Few users have observed display corruption when they boot
>> the machine to KDE Plasma or playing games. We have root
>> caused the problem that whenever alloc_range() couldn't
>> find the required memory blocks the function was returning
>> SUCCESS in some of the corner cases.
>>
>> The right approach would be if the total allocated size
>> is less than the required size, the function should
>> return -ENOSPC.
>>
>> Cc:  <stable@vger.kernel.org> # 6.7+
>> Fixes: 0a1844bf0b53 ("drm/buddy: Improve contiguous memory allocation")
>> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3097
>> Tested-by: Mario Limonciello <mario.limonciello@amd.com>
>> Link: https://patchwork.kernel.org/project/dri-devel/patch/20240207174456.341121-1-Arunpravin.PaneerSelvam@amd.com/
>> Acked-by: Christian König <christian.koenig@amd.com>
>> Reviewed-by: Matthew Auld <matthew.auld@intel.com>
>> Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
> New unit test for this would be most excellent - these kind of missed edge
> cases is exactly what kunit is for. Can you please follow up with, since
> we don't want to hold up the bugfix for longer?
Matthew Auld has added a new unit test for this case. Please let us know 
if this will suffice.
https://patchwork.freedesktop.org/patch/577497/?series=129671&rev=1

Thanks,
Arun.
> -Sima
>
>> ---
>>   drivers/gpu/drm/drm_buddy.c | 6 ++++++
>>   1 file changed, 6 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
>> index f57e6d74fb0e..c1a99bf4dffd 100644
>> --- a/drivers/gpu/drm/drm_buddy.c
>> +++ b/drivers/gpu/drm/drm_buddy.c
>> @@ -539,6 +539,12 @@ static int __alloc_range(struct drm_buddy *mm,
>>   	} while (1);
>>   
>>   	list_splice_tail(&allocated, blocks);
>> +
>> +	if (total_allocated < size) {
>> +		err = -ENOSPC;
>> +		goto err_free;
>> +	}
>> +
>>   	return 0;
>>   
>>   err_undo:
>> -- 
>> 2.25.1
>>
Daniel Vetter Feb. 9, 2024, 7:40 p.m. UTC | #3
On Sat, Feb 10, 2024 at 12:06:58AM +0530, Arunpravin Paneer Selvam wrote:
> Hi Daniel,
> 
> On 2/9/2024 11:34 PM, Daniel Vetter wrote:
> > On Fri, Feb 09, 2024 at 08:56:24PM +0530, Arunpravin Paneer Selvam wrote:
> > > Few users have observed display corruption when they boot
> > > the machine to KDE Plasma or playing games. We have root
> > > caused the problem that whenever alloc_range() couldn't
> > > find the required memory blocks the function was returning
> > > SUCCESS in some of the corner cases.
> > > 
> > > The right approach would be if the total allocated size
> > > is less than the required size, the function should
> > > return -ENOSPC.
> > > 
> > > Cc:  <stable@vger.kernel.org> # 6.7+
> > > Fixes: 0a1844bf0b53 ("drm/buddy: Improve contiguous memory allocation")
> > > Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3097
> > > Tested-by: Mario Limonciello <mario.limonciello@amd.com>
> > > Link: https://patchwork.kernel.org/project/dri-devel/patch/20240207174456.341121-1-Arunpravin.PaneerSelvam@amd.com/
> > > Acked-by: Christian König <christian.koenig@amd.com>
> > > Reviewed-by: Matthew Auld <matthew.auld@intel.com>
> > > Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
> > New unit test for this would be most excellent - these kind of missed edge
> > cases is exactly what kunit is for. Can you please follow up with, since
> > we don't want to hold up the bugfix for longer?
> Matthew Auld has added a new unit test for this case. Please let us know if
> this will suffice.
> https://patchwork.freedesktop.org/patch/577497/?series=129671&rev=1

Ah yeah, might be best to submit them both together as one series (you
just need to add your own signed-off-by if you resend other people's
patches). That way bots can pick it up together, since new testcase and
bugfix only make sense together.
-Sima

> 
> Thanks,
> Arun.
> > -Sima
> > 
> > > ---
> > >   drivers/gpu/drm/drm_buddy.c | 6 ++++++
> > >   1 file changed, 6 insertions(+)
> > > 
> > > diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
> > > index f57e6d74fb0e..c1a99bf4dffd 100644
> > > --- a/drivers/gpu/drm/drm_buddy.c
> > > +++ b/drivers/gpu/drm/drm_buddy.c
> > > @@ -539,6 +539,12 @@ static int __alloc_range(struct drm_buddy *mm,
> > >   	} while (1);
> > >   	list_splice_tail(&allocated, blocks);
> > > +
> > > +	if (total_allocated < size) {
> > > +		err = -ENOSPC;
> > > +		goto err_free;
> > > +	}
> > > +
> > >   	return 0;
> > >   err_undo:
> > > -- 
> > > 2.25.1
> > > 
>
diff mbox series

Patch

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index f57e6d74fb0e..c1a99bf4dffd 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -539,6 +539,12 @@  static int __alloc_range(struct drm_buddy *mm,
 	} while (1);
 
 	list_splice_tail(&allocated, blocks);
+
+	if (total_allocated < size) {
+		err = -ENOSPC;
+		goto err_free;
+	}
+
 	return 0;
 
 err_undo: