Message ID | 20240221073324.3303-1-thomas.hellstrom@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | drm/ttm: Fix an invalid freeing on already freed page in error path | expand |
On 21/02/2024 07:33, Thomas Hellström wrote: > If caching mode change fails due to, for example, OOM we > free the allocated pages in a two-step process. First the pages > for which the caching change has already succeeded. Secondly > the pages for which a caching change did not succeed. > > However the second step was incorrectly freeing the pages already > freed in the first step. > > Fix. > > Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> > Fixes: 379989e7cbdc ("drm/ttm/pool: Fix ttm_pool_alloc error path") > Cc: Christian König <christian.koenig@amd.com> > Cc: Dave Airlie <airlied@redhat.com> > Cc: Christian Koenig <christian.koenig@amd.com> > Cc: Huang Rui <ray.huang@amd.com> > Cc: dri-devel@lists.freedesktop.org > Cc: <stable@vger.kernel.org> # v6.4+ Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Am 21.02.24 um 08:33 schrieb Thomas Hellström: > If caching mode change fails due to, for example, OOM we > free the allocated pages in a two-step process. First the pages > for which the caching change has already succeeded. Secondly > the pages for which a caching change did not succeed. > > However the second step was incorrectly freeing the pages already > freed in the first step. > > Fix. > > Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> > Fixes: 379989e7cbdc ("drm/ttm/pool: Fix ttm_pool_alloc error path") > Cc: Christian König <christian.koenig@amd.com> > Cc: Dave Airlie <airlied@redhat.com> > Cc: Christian Koenig <christian.koenig@amd.com> > Cc: Huang Rui <ray.huang@amd.com> > Cc: dri-devel@lists.freedesktop.org > Cc: <stable@vger.kernel.org> # v6.4+ You don't know how much time I've spend staring at this line to find the bug in it and haven't seen it. Got bug reports about that for month as well. Reviewed-by: Christian König <christian.koenig@amd.com> > --- > drivers/gpu/drm/ttm/ttm_pool.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/ttm/ttm_pool.c b/drivers/gpu/drm/ttm/ttm_pool.c > index b62f420a9f96..112438d965ff 100644 > --- a/drivers/gpu/drm/ttm/ttm_pool.c > +++ b/drivers/gpu/drm/ttm/ttm_pool.c > @@ -387,7 +387,7 @@ static void ttm_pool_free_range(struct ttm_pool *pool, struct ttm_tt *tt, > enum ttm_caching caching, > pgoff_t start_page, pgoff_t end_page) > { > - struct page **pages = tt->pages; > + struct page **pages = &tt->pages[start_page]; > unsigned int order; > pgoff_t i, nr; >
On Wed, 2024-02-21 at 11:26 +0100, Christian König wrote: > Am 21.02.24 um 08:33 schrieb Thomas Hellström: > > If caching mode change fails due to, for example, OOM we > > free the allocated pages in a two-step process. First the pages > > for which the caching change has already succeeded. Secondly > > the pages for which a caching change did not succeed. > > > > However the second step was incorrectly freeing the pages already > > freed in the first step. > > > > Fix. > > > > Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> > > Fixes: 379989e7cbdc ("drm/ttm/pool: Fix ttm_pool_alloc error path") > > Cc: Christian König <christian.koenig@amd.com> > > Cc: Dave Airlie <airlied@redhat.com> > > Cc: Christian Koenig <christian.koenig@amd.com> > > Cc: Huang Rui <ray.huang@amd.com> > > Cc: dri-devel@lists.freedesktop.org > > Cc: <stable@vger.kernel.org> # v6.4+ > > You don't know how much time I've spend staring at this line to find > the > bug in it and haven't seen it. Got bug reports about that for month > as well. Yeah, sorry about that. We should probably have Kunit tests exercising OOM in the pool code involving WC pages. I'll push this to drm-misc-next. /Thomas > > Reviewed-by: Christian König <christian.koenig@amd.com> > > > --- > > drivers/gpu/drm/ttm/ttm_pool.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/drivers/gpu/drm/ttm/ttm_pool.c > > b/drivers/gpu/drm/ttm/ttm_pool.c > > index b62f420a9f96..112438d965ff 100644 > > --- a/drivers/gpu/drm/ttm/ttm_pool.c > > +++ b/drivers/gpu/drm/ttm/ttm_pool.c > > @@ -387,7 +387,7 @@ static void ttm_pool_free_range(struct ttm_pool > > *pool, struct ttm_tt *tt, > > enum ttm_caching caching, > > pgoff_t start_page, pgoff_t > > end_page) > > { > > - struct page **pages = tt->pages; > > + struct page **pages = &tt->pages[start_page]; > > unsigned int order; > > pgoff_t i, nr; > > >
On Thu, 2024-02-22 at 08:34 +0100, Thomas Hellström wrote: > On Wed, 2024-02-21 at 11:26 +0100, Christian König wrote: > > Am 21.02.24 um 08:33 schrieb Thomas Hellström: > > > If caching mode change fails due to, for example, OOM we > > > free the allocated pages in a two-step process. First the pages > > > for which the caching change has already succeeded. Secondly > > > the pages for which a caching change did not succeed. > > > > > > However the second step was incorrectly freeing the pages already > > > freed in the first step. > > > > > > Fix. > > > > > > Signed-off-by: Thomas Hellström > > > <thomas.hellstrom@linux.intel.com> > > > Fixes: 379989e7cbdc ("drm/ttm/pool: Fix ttm_pool_alloc error > > > path") > > > Cc: Christian König <christian.koenig@amd.com> > > > Cc: Dave Airlie <airlied@redhat.com> > > > Cc: Christian Koenig <christian.koenig@amd.com> > > > Cc: Huang Rui <ray.huang@amd.com> > > > Cc: dri-devel@lists.freedesktop.org > > > Cc: <stable@vger.kernel.org> # v6.4+ > > > > You don't know how much time I've spend staring at this line to > > find > > the > > bug in it and haven't seen it. Got bug reports about that for month > > as well. > > > Yeah, sorry about that. We should probably have Kunit tests > exercising > OOM in the pool code involving WC pages. > > I'll push this to drm-misc-next. drm-misc-fixes.. /Thomas > > /Thomas > > > > > Reviewed-by: Christian König <christian.koenig@amd.com> > > > > > --- > > > drivers/gpu/drm/ttm/ttm_pool.c | 2 +- > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > diff --git a/drivers/gpu/drm/ttm/ttm_pool.c > > > b/drivers/gpu/drm/ttm/ttm_pool.c > > > index b62f420a9f96..112438d965ff 100644 > > > --- a/drivers/gpu/drm/ttm/ttm_pool.c > > > +++ b/drivers/gpu/drm/ttm/ttm_pool.c > > > @@ -387,7 +387,7 @@ static void ttm_pool_free_range(struct > > > ttm_pool > > > *pool, struct ttm_tt *tt, > > > enum ttm_caching caching, > > > pgoff_t start_page, pgoff_t > > > end_page) > > > { > > > - struct page **pages = tt->pages; > > > + struct page **pages = &tt->pages[start_page]; > > > unsigned int order; > > > pgoff_t i, nr; > > > > > >
Am 22.02.24 um 08:34 schrieb Thomas Hellström: > On Wed, 2024-02-21 at 11:26 +0100, Christian König wrote: >> Am 21.02.24 um 08:33 schrieb Thomas Hellström: >>> If caching mode change fails due to, for example, OOM we >>> free the allocated pages in a two-step process. First the pages >>> for which the caching change has already succeeded. Secondly >>> the pages for which a caching change did not succeed. >>> >>> However the second step was incorrectly freeing the pages already >>> freed in the first step. >>> >>> Fix. >>> >>> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> >>> Fixes: 379989e7cbdc ("drm/ttm/pool: Fix ttm_pool_alloc error path") >>> Cc: Christian König <christian.koenig@amd.com> >>> Cc: Dave Airlie <airlied@redhat.com> >>> Cc: Christian Koenig <christian.koenig@amd.com> >>> Cc: Huang Rui <ray.huang@amd.com> >>> Cc: dri-devel@lists.freedesktop.org >>> Cc: <stable@vger.kernel.org> # v6.4+ >> You don't know how much time I've spend staring at this line to find >> the >> bug in it and haven't seen it. Got bug reports about that for month >> as well. > > Yeah, sorry about that. We should probably have Kunit tests exercising > OOM in the pool code involving WC pages. > > I'll push this to drm-misc-next. drm-misc-fixes please! That needs to be backported ASAP. Need to dig up the bug report for this again. Thanks, Christian. > > /Thomas > >> Reviewed-by: Christian König <christian.koenig@amd.com> >> >>> --- >>> drivers/gpu/drm/ttm/ttm_pool.c | 2 +- >>> 1 file changed, 1 insertion(+), 1 deletion(-) >>> >>> diff --git a/drivers/gpu/drm/ttm/ttm_pool.c >>> b/drivers/gpu/drm/ttm/ttm_pool.c >>> index b62f420a9f96..112438d965ff 100644 >>> --- a/drivers/gpu/drm/ttm/ttm_pool.c >>> +++ b/drivers/gpu/drm/ttm/ttm_pool.c >>> @@ -387,7 +387,7 @@ static void ttm_pool_free_range(struct ttm_pool >>> *pool, struct ttm_tt *tt, >>> enum ttm_caching caching, >>> pgoff_t start_page, pgoff_t >>> end_page) >>> { >>> - struct page **pages = tt->pages; >>> + struct page **pages = &tt->pages[start_page]; >>> unsigned int order; >>> pgoff_t i, nr; >>>
diff --git a/drivers/gpu/drm/ttm/ttm_pool.c b/drivers/gpu/drm/ttm/ttm_pool.c index b62f420a9f96..112438d965ff 100644 --- a/drivers/gpu/drm/ttm/ttm_pool.c +++ b/drivers/gpu/drm/ttm/ttm_pool.c @@ -387,7 +387,7 @@ static void ttm_pool_free_range(struct ttm_pool *pool, struct ttm_tt *tt, enum ttm_caching caching, pgoff_t start_page, pgoff_t end_page) { - struct page **pages = tt->pages; + struct page **pages = &tt->pages[start_page]; unsigned int order; pgoff_t i, nr;
If caching mode change fails due to, for example, OOM we free the allocated pages in a two-step process. First the pages for which the caching change has already succeeded. Secondly the pages for which a caching change did not succeed. However the second step was incorrectly freeing the pages already freed in the first step. Fix. Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Fixes: 379989e7cbdc ("drm/ttm/pool: Fix ttm_pool_alloc error path") Cc: Christian König <christian.koenig@amd.com> Cc: Dave Airlie <airlied@redhat.com> Cc: Christian Koenig <christian.koenig@amd.com> Cc: Huang Rui <ray.huang@amd.com> Cc: dri-devel@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v6.4+ --- drivers/gpu/drm/ttm/ttm_pool.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)