Message ID | 20230919233556.1458793-5-adrian.larumbe@collabora.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | Add fdinfo support to Panfrost | expand |
On 20/09/2023 00:34, Adrián Larumbe wrote: > Some BO's might be mapped onto physical memory chunkwise and on demand, > like Panfrost's tiler heap. In this case, even though the > drm_gem_shmem_object page array might already be allocated, only a very > small fraction of the BO is currently backed by system memory, but > drm_show_memory_stats will then proceed to add its entire virtual size to > the file's total resident size regardless. > > This led to very unrealistic RSS sizes being reckoned for Panfrost, where > said tiler heap buffer is initially allocated with a virtual size of 128 > MiB, but only a small part of it will eventually be backed by system memory > after successive GPU page faults. > > Provide a new DRM object generic function that would allow drivers to > return a more accurate RSS size for their BOs. > > Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com> > Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> > Reviewed-by: Steven Price <steven.price@arm.com> > --- > drivers/gpu/drm/drm_file.c | 5 ++++- > include/drm/drm_gem.h | 9 +++++++++ > 2 files changed, 13 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c > index 883d83bc0e3d..762965e3d503 100644 > --- a/drivers/gpu/drm/drm_file.c > +++ b/drivers/gpu/drm/drm_file.c > @@ -944,7 +944,10 @@ void drm_show_memory_stats(struct drm_printer *p, struct drm_file *file) > } > > if (s & DRM_GEM_OBJECT_RESIDENT) { > - status.resident += obj->size; > + if (obj->funcs && obj->funcs->rss) > + status.resident += obj->funcs->rss(obj); > + else > + status.resident += obj->size; Presumably you'd want the same smaller size in both active and purgeable? Or you can end up with more in those two than in rss which would look odd. Also, alternative to adding a new callback could be adding multiple output parameters to the existing obj->func->status() which maybe ends up simpler due fewer callbacks? Like: s = obj->funcs->status(obj, &supported_status, &rss) And adjust the code flow to pick up the rss if driver signaled it supports reporting it. Regards, Tvrtko > } else { > /* If already purged or not yet backed by pages, don't > * count it as purgeable: > diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h > index bc9f6aa2f3fe..16364487fde9 100644 > --- a/include/drm/drm_gem.h > +++ b/include/drm/drm_gem.h > @@ -208,6 +208,15 @@ struct drm_gem_object_funcs { > */ > enum drm_gem_object_status (*status)(struct drm_gem_object *obj); > > + /** > + * @rss: > + * > + * Return resident size of the object in physical memory. > + * > + * Called by drm_show_memory_stats(). > + */ > + size_t (*rss)(struct drm_gem_object *obj); > + > /** > * @vm_ops: > *
On 20.09.2023 16:53, Tvrtko Ursulin wrote: > >On 20/09/2023 00:34, Adrián Larumbe wrote: >> Some BO's might be mapped onto physical memory chunkwise and on demand, >> like Panfrost's tiler heap. In this case, even though the >> drm_gem_shmem_object page array might already be allocated, only a very >> small fraction of the BO is currently backed by system memory, but >> drm_show_memory_stats will then proceed to add its entire virtual size to >> the file's total resident size regardless. >> >> This led to very unrealistic RSS sizes being reckoned for Panfrost, where >> said tiler heap buffer is initially allocated with a virtual size of 128 >> MiB, but only a small part of it will eventually be backed by system memory >> after successive GPU page faults. >> >> Provide a new DRM object generic function that would allow drivers to >> return a more accurate RSS size for their BOs. >> >> Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com> >> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> >> Reviewed-by: Steven Price <steven.price@arm.com> >> --- >> drivers/gpu/drm/drm_file.c | 5 ++++- >> include/drm/drm_gem.h | 9 +++++++++ >> 2 files changed, 13 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c >> index 883d83bc0e3d..762965e3d503 100644 >> --- a/drivers/gpu/drm/drm_file.c >> +++ b/drivers/gpu/drm/drm_file.c >> @@ -944,7 +944,10 @@ void drm_show_memory_stats(struct drm_printer *p, struct drm_file *file) >> } >> if (s & DRM_GEM_OBJECT_RESIDENT) { >> - status.resident += obj->size; >> + if (obj->funcs && obj->funcs->rss) >> + status.resident += obj->funcs->rss(obj); >> + else >> + status.resident += obj->size; > >Presumably you'd want the same smaller size in both active and purgeable? Or >you can end up with more in those two than in rss which would look odd. I didn't think of this. I guess when an object is both resident and purgeable, then its RSS and purgeable sizes should be the same. >Also, alternative to adding a new callback could be adding multiple output >parameters to the existing obj->func->status() which maybe ends up simpler due >fewer callbacks? > >Like: > > s = obj->funcs->status(obj, &supported_status, &rss) > >And adjust the code flow to pick up the rss if driver signaled it supports >reporting it. I personally find having a separate object callback more readable in this case. There's also the question of what output parameter value would be used as a token that the relevant BO doesn't have an RSS different from its virtual size. I guess '0' would be alright, but this is on the assumption that this could never be a legitimate BO virtual size across all DRM drivers. I guess most of them round the size up to the nearest page multiple at BO creation time. > >Regards, > >Tvrtko > >> } else { >> /* If already purged or not yet backed by pages, don't >> * count it as purgeable: >> diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h >> index bc9f6aa2f3fe..16364487fde9 100644 >> --- a/include/drm/drm_gem.h >> +++ b/include/drm/drm_gem.h >> @@ -208,6 +208,15 @@ struct drm_gem_object_funcs { >> */ >> enum drm_gem_object_status (*status)(struct drm_gem_object *obj); >> + /** >> + * @rss: >> + * >> + * Return resident size of the object in physical memory. >> + * >> + * Called by drm_show_memory_stats(). >> + */ >> + size_t (*rss)(struct drm_gem_object *obj); >> + >> /** >> * @vm_ops: >> *
On 22/09/2023 11:58, Adrián Larumbe wrote: > On 20.09.2023 16:53, Tvrtko Ursulin wrote: >> >> On 20/09/2023 00:34, Adrián Larumbe wrote: >>> Some BO's might be mapped onto physical memory chunkwise and on demand, >>> like Panfrost's tiler heap. In this case, even though the >>> drm_gem_shmem_object page array might already be allocated, only a very >>> small fraction of the BO is currently backed by system memory, but >>> drm_show_memory_stats will then proceed to add its entire virtual size to >>> the file's total resident size regardless. >>> >>> This led to very unrealistic RSS sizes being reckoned for Panfrost, where >>> said tiler heap buffer is initially allocated with a virtual size of 128 >>> MiB, but only a small part of it will eventually be backed by system memory >>> after successive GPU page faults. >>> >>> Provide a new DRM object generic function that would allow drivers to >>> return a more accurate RSS size for their BOs. >>> >>> Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com> >>> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> >>> Reviewed-by: Steven Price <steven.price@arm.com> >>> --- >>> drivers/gpu/drm/drm_file.c | 5 ++++- >>> include/drm/drm_gem.h | 9 +++++++++ >>> 2 files changed, 13 insertions(+), 1 deletion(-) >>> >>> diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c >>> index 883d83bc0e3d..762965e3d503 100644 >>> --- a/drivers/gpu/drm/drm_file.c >>> +++ b/drivers/gpu/drm/drm_file.c >>> @@ -944,7 +944,10 @@ void drm_show_memory_stats(struct drm_printer *p, struct drm_file *file) >>> } >>> if (s & DRM_GEM_OBJECT_RESIDENT) { >>> - status.resident += obj->size; >>> + if (obj->funcs && obj->funcs->rss) >>> + status.resident += obj->funcs->rss(obj); >>> + else >>> + status.resident += obj->size; >> >> Presumably you'd want the same smaller size in both active and purgeable? Or >> you can end up with more in those two than in rss which would look odd. > > I didn't think of this. I guess when an object is both resident and purgeable, > then its RSS and purgeable sizes should be the same. > >> Also, alternative to adding a new callback could be adding multiple output >> parameters to the existing obj->func->status() which maybe ends up simpler due >> fewer callbacks? >> >> Like: >> >> s = obj->funcs->status(obj, &supported_status, &rss) >> >> And adjust the code flow to pick up the rss if driver signaled it supports >> reporting it. > > I personally find having a separate object callback more readable in this case. > There's also the question of what output parameter value would be used as a token > that the relevant BO doesn't have an RSS different from its virtual > size. I guess '0' would be alright, but this is on the assumption that this > could never be a legitimate BO virtual size across all DRM drivers. I guess > most of them round the size up to the nearest page multiple at BO creation > time. Okay. See how it will look once you need to apply it to resident and purgeable. I wonder if "driver knows better" will end up a dominant case and we do end up considering reversing the scheme (like ask the driver to fill in the meminfo record). TBH I do not remember all the flavours both Rob and I proposed at this point. Regards, Tvrtko > >> >> Regards, >> >> Tvrtko >> >>> } else { >>> /* If already purged or not yet backed by pages, don't >>> * count it as purgeable: >>> diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h >>> index bc9f6aa2f3fe..16364487fde9 100644 >>> --- a/include/drm/drm_gem.h >>> +++ b/include/drm/drm_gem.h >>> @@ -208,6 +208,15 @@ struct drm_gem_object_funcs { >>> */ >>> enum drm_gem_object_status (*status)(struct drm_gem_object *obj); >>> + /** >>> + * @rss: >>> + * >>> + * Return resident size of the object in physical memory. >>> + * >>> + * Called by drm_show_memory_stats(). >>> + */ >>> + size_t (*rss)(struct drm_gem_object *obj); >>> + >>> /** >>> * @vm_ops: >>> *
diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c index 883d83bc0e3d..762965e3d503 100644 --- a/drivers/gpu/drm/drm_file.c +++ b/drivers/gpu/drm/drm_file.c @@ -944,7 +944,10 @@ void drm_show_memory_stats(struct drm_printer *p, struct drm_file *file) } if (s & DRM_GEM_OBJECT_RESIDENT) { - status.resident += obj->size; + if (obj->funcs && obj->funcs->rss) + status.resident += obj->funcs->rss(obj); + else + status.resident += obj->size; } else { /* If already purged or not yet backed by pages, don't * count it as purgeable: diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h index bc9f6aa2f3fe..16364487fde9 100644 --- a/include/drm/drm_gem.h +++ b/include/drm/drm_gem.h @@ -208,6 +208,15 @@ struct drm_gem_object_funcs { */ enum drm_gem_object_status (*status)(struct drm_gem_object *obj); + /** + * @rss: + * + * Return resident size of the object in physical memory. + * + * Called by drm_show_memory_stats(). + */ + size_t (*rss)(struct drm_gem_object *obj); + /** * @vm_ops: *