Message ID | 20230911094444.68966-2-zhengqi.arch@bytedance.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | use refcount+RCU method to implement lockless slab shrink | expand |
On 2023/9/11 17:44, Qi Zheng wrote: > Currently, the shrinker instances can be divided into the following three > types: > > a) global shrinker instance statically defined in the kernel, such as > workingset_shadow_shrinker. > > b) global shrinker instance statically defined in the kernel modules, such > as mmu_shrinker in x86. > > c) shrinker instance embedded in other structures. > > For case a, the memory of shrinker instance is never freed. For case b, > the memory of shrinker instance will be freed after synchronize_rcu() when > the module is unloaded. For case c, the memory of shrinker instance will > be freed along with the structure it is embedded in. > > In preparation for implementing lockless slab shrink, we need to > dynamically allocate those shrinker instances in case c, then the memory > can be dynamically freed alone by calling kfree_rcu(). > > So this commit adds the following new APIs for dynamically allocating > shrinker, and add a private_data field to struct shrinker to record and > get the original embedded structure. > > 1. shrinker_alloc() > > Used to allocate shrinker instance itself and related memory, it will > return a pointer to the shrinker instance on success and NULL on failure. > > 2. shrinker_register() > > Used to register the shrinker instance, which is same as the current > register_shrinker_prepared(). > > 3. shrinker_free() > > Used to unregister (if needed) and free the shrinker instance. > > In order to simplify shrinker-related APIs and make shrinker more > independent of other kernel mechanisms, subsequent submissions will use > the above API to convert all shrinkers (including case a and b) to > dynamically allocated, and then remove all existing APIs. > > This will also have another advantage mentioned by Dave Chinner: > > ``` > The other advantage of this is that it will break all the existing > out of tree code and third party modules using the old API and will > no longer work with a kernel using lockless slab shrinkers. They > need to break (both at the source and binary levels) to stop bad > things from happening due to using unconverted shrinkers in the new > setup. > ``` > > Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> > --- > include/linux/shrinker.h | 7 +++ > mm/internal.h | 11 +++++ > mm/shrinker.c | 102 +++++++++++++++++++++++++++++++++++++++ > mm/shrinker_debug.c | 17 ++++++- > 4 files changed, 135 insertions(+), 2 deletions(-) > > diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h > index 6b5843c3b827..3f3fd9974ce5 100644 > --- a/include/linux/shrinker.h > +++ b/include/linux/shrinker.h > @@ -70,6 +70,8 @@ struct shrinker { > int seeks; /* seeks to recreate an obj */ > unsigned flags; > > + void *private_data; > + > /* These are for internal use */ > struct list_head list; > #ifdef CONFIG_MEMCG > @@ -95,6 +97,11 @@ struct shrinker { > * non-MEMCG_AWARE shrinker should not have this flag set. > */ > #define SHRINKER_NONSLAB (1 << 3) > +#define SHRINKER_ALLOCATED (1 << 4) It is better to add a comment here to tell users it is only used by internals of shrinker. The users are not supposed to pass this flag to shrink APIs. > + > +struct shrinker *shrinker_alloc(unsigned int flags, const char *fmt, ...); > +void shrinker_register(struct shrinker *shrinker); > +void shrinker_free(struct shrinker *shrinker); > > extern int __printf(2, 3) prealloc_shrinker(struct shrinker *shrinker, > const char *fmt, ...); > diff --git a/mm/internal.h b/mm/internal.h > index 0471d6326d01..5587cae20ebf 100644 > --- a/mm/internal.h > +++ b/mm/internal.h > @@ -1161,6 +1161,9 @@ unsigned long shrink_slab(gfp_t gfp_mask, int nid, struct mem_cgroup *memcg, > > #ifdef CONFIG_SHRINKER_DEBUG > extern int shrinker_debugfs_add(struct shrinker *shrinker); > +extern int shrinker_debugfs_name_alloc(struct shrinker *shrinker, > + const char *fmt, va_list ap); > +extern void shrinker_debugfs_name_free(struct shrinker *shrinker); > extern struct dentry *shrinker_debugfs_detach(struct shrinker *shrinker, > int *debugfs_id); > extern void shrinker_debugfs_remove(struct dentry *debugfs_entry, > @@ -1170,6 +1173,14 @@ static inline int shrinker_debugfs_add(struct shrinker *shrinker) > { > return 0; > } > +static inline int shrinker_debugfs_name_alloc(struct shrinker *shrinker, > + const char *fmt, va_list ap) > +{ > + return 0; > +} > +static inline void shrinker_debugfs_name_free(struct shrinker *shrinker) > +{ > +} > static inline struct dentry *shrinker_debugfs_detach(struct shrinker *shrinker, > int *debugfs_id) > { > diff --git a/mm/shrinker.c b/mm/shrinker.c > index a16cd448b924..201211a67827 100644 > --- a/mm/shrinker.c > +++ b/mm/shrinker.c > @@ -550,6 +550,108 @@ unsigned long shrink_slab(gfp_t gfp_mask, int nid, struct mem_cgroup *memcg, > return freed; > } > > +struct shrinker *shrinker_alloc(unsigned int flags, const char *fmt, ...) > +{ > + struct shrinker *shrinker; > + unsigned int size; > + va_list ap; > + int err; > + > + shrinker = kzalloc(sizeof(struct shrinker), GFP_KERNEL); > + if (!shrinker) > + return NULL; > + > + va_start(ap, fmt); > + err = shrinker_debugfs_name_alloc(shrinker, fmt, ap); > + va_end(ap); > + if (err) > + goto err_name; > + > + shrinker->flags = flags | SHRINKER_ALLOCATED; > + shrinker->seeks = DEFAULT_SEEKS; > + > + if (flags & SHRINKER_MEMCG_AWARE) { > + err = prealloc_memcg_shrinker(shrinker); > + if (err == -ENOSYS) > + shrinker->flags &= ~SHRINKER_MEMCG_AWARE; > + else if (err == 0) > + goto done; > + else > + goto err_flags; Actually, the code here is a little confusing me when I fist look at it. I think there could be some improvements here. Something like: if (flags & SHRINKER_MEMCG_AWARE) { err = prealloc_memcg_shrinker(shrinker); if (err == -ENOSYS) { /* Memcg is not supported and fallback to non-memcg-aware shrinker. */ shrinker->flags &= ~SHRINKER_MEMCG_AWARE; goto non-memcg; } if (err) goto err_flags; return shrinker; } non-memcg: [...] return shrinker; In this case, the code becomes more clear (at least for me). We have split the code into two part, one is handling memcg-aware case, another is non-memcg-aware case. Any side will have a explicit "return" keyword to return once succeeds. It is a little implicit that the previous one uses "goto done". And the tag of "non-memcg" is also a good annotation to tell us the following code handles non-memcg-aware case. > + } > + > + /* > + * The nr_deferred is available on per memcg level for memcg aware > + * shrinkers, so only allocate nr_deferred in the following cases: > + * - non memcg aware shrinkers > + * - !CONFIG_MEMCG > + * - memcg is disabled by kernel command line > + */ > + size = sizeof(*shrinker->nr_deferred); > + if (flags & SHRINKER_NUMA_AWARE) > + size *= nr_node_ids; > + > + shrinker->nr_deferred = kzalloc(size, GFP_KERNEL); > + if (!shrinker->nr_deferred) > + goto err_flags; > + > +done: > + return shrinker; > + > +err_flags: > + shrinker_debugfs_name_free(shrinker); > +err_name: > + kfree(shrinker); > + return NULL; > +} > +EXPORT_SYMBOL_GPL(shrinker_alloc); > + > +void shrinker_register(struct shrinker *shrinker) > +{ > + if (unlikely(!(shrinker->flags & SHRINKER_ALLOCATED))) { > + pr_warn("Must use shrinker_alloc() to dynamically allocate the shrinker"); > + return; > + } > + > + down_write(&shrinker_rwsem); > + list_add_tail(&shrinker->list, &shrinker_list); > + shrinker->flags |= SHRINKER_REGISTERED; > + shrinker_debugfs_add(shrinker); > + up_write(&shrinker_rwsem); > +} > +EXPORT_SYMBOL_GPL(shrinker_register); > + > +void shrinker_free(struct shrinker *shrinker) > +{ > + struct dentry *debugfs_entry = NULL; > + int debugfs_id; > + > + if (!shrinker) > + return; > + > + down_write(&shrinker_rwsem); > + if (shrinker->flags & SHRINKER_REGISTERED) { > + list_del(&shrinker->list); > + debugfs_entry = shrinker_debugfs_detach(shrinker, &debugfs_id); > + shrinker->flags &= ~SHRINKER_REGISTERED; > + } else { > + shrinker_debugfs_name_free(shrinker); We could remove shrinker_debugfs_name_free() calling from shrinker_debugfs_detach(), then we could call shrinker_debugfs_name_free() anyway, otherwise, it it a little weird for me. And the srinker name is allocated from shrinker_alloc(), so I think it it reasonable for shrinker_free() to free the shrinker name. Thanks. > + } > + > + if (shrinker->flags & SHRINKER_MEMCG_AWARE) > + unregister_memcg_shrinker(shrinker); > + up_write(&shrinker_rwsem); > + > + if (debugfs_entry) > + shrinker_debugfs_remove(debugfs_entry, debugfs_id); > + > + kfree(shrinker->nr_deferred); > + shrinker->nr_deferred = NULL; > + > + kfree(shrinker); > +} > +EXPORT_SYMBOL_GPL(shrinker_free); > + > /* > * Add a shrinker callback to be called from the vm. > */ > diff --git a/mm/shrinker_debug.c b/mm/shrinker_debug.c > index e4ce509f619e..38452f539f40 100644 > --- a/mm/shrinker_debug.c > +++ b/mm/shrinker_debug.c > @@ -193,6 +193,20 @@ int shrinker_debugfs_add(struct shrinker *shrinker) > return 0; > } > > +int shrinker_debugfs_name_alloc(struct shrinker *shrinker, const char *fmt, > + va_list ap) > +{ > + shrinker->name = kvasprintf_const(GFP_KERNEL, fmt, ap); > + > + return shrinker->name ? 0 : -ENOMEM; > +} > + > +void shrinker_debugfs_name_free(struct shrinker *shrinker) > +{ > + kfree_const(shrinker->name); > + shrinker->name = NULL; > +} It it better to move both helpers to internal.h and mark them as inline since both are very simple enough. Thanks. > + > int shrinker_debugfs_rename(struct shrinker *shrinker, const char *fmt, ...) > { > struct dentry *entry; > @@ -241,8 +255,7 @@ struct dentry *shrinker_debugfs_detach(struct shrinker *shrinker, > > lockdep_assert_held(&shrinker_rwsem); > > - kfree_const(shrinker->name); > - shrinker->name = NULL; > + shrinker_debugfs_name_free(shrinker); > > *debugfs_id = entry ? shrinker->debugfs_id : -1; > shrinker->debugfs_entry = NULL;
On 2023/9/18 17:03, Muchun Song wrote: > > ... >> @@ -95,6 +97,11 @@ struct shrinker { >> * non-MEMCG_AWARE shrinker should not have this flag set. >> */ >> #define SHRINKER_NONSLAB (1 << 3) >> +#define SHRINKER_ALLOCATED (1 << 4) > > It is better to add a comment here to tell users > it is only used by internals of shrinker. The users > are not supposed to pass this flag to shrink APIs. OK, will annotate SHRINKER_REGISTERED and SHRINKER_ALLOCATED as flags used internally. > >> + >> +struct shrinker *shrinker_alloc(unsigned int flags, const char *fmt, >> ...); >> +void shrinker_register(struct shrinker *shrinker); >> +void shrinker_free(struct shrinker *shrinker); >> extern int __printf(2, 3) prealloc_shrinker(struct shrinker *shrinker, >> const char *fmt, ...); >> diff --git a/mm/internal.h b/mm/internal.h >> index 0471d6326d01..5587cae20ebf 100644 >> --- a/mm/internal.h >> +++ b/mm/internal.h >> @@ -1161,6 +1161,9 @@ unsigned long shrink_slab(gfp_t gfp_mask, int >> nid, struct mem_cgroup *memcg, >> #ifdef CONFIG_SHRINKER_DEBUG >> extern int shrinker_debugfs_add(struct shrinker *shrinker); >> +extern int shrinker_debugfs_name_alloc(struct shrinker *shrinker, >> + const char *fmt, va_list ap); >> +extern void shrinker_debugfs_name_free(struct shrinker *shrinker); >> extern struct dentry *shrinker_debugfs_detach(struct shrinker >> *shrinker, >> int *debugfs_id); >> extern void shrinker_debugfs_remove(struct dentry *debugfs_entry, >> @@ -1170,6 +1173,14 @@ static inline int shrinker_debugfs_add(struct >> shrinker *shrinker) >> { >> return 0; >> } >> +static inline int shrinker_debugfs_name_alloc(struct shrinker *shrinker, >> + const char *fmt, va_list ap) >> +{ >> + return 0; >> +} >> +static inline void shrinker_debugfs_name_free(struct shrinker *shrinker) >> +{ >> +} >> static inline struct dentry *shrinker_debugfs_detach(struct shrinker >> *shrinker, >> int *debugfs_id) >> { >> diff --git a/mm/shrinker.c b/mm/shrinker.c >> index a16cd448b924..201211a67827 100644 >> --- a/mm/shrinker.c >> +++ b/mm/shrinker.c >> @@ -550,6 +550,108 @@ unsigned long shrink_slab(gfp_t gfp_mask, int >> nid, struct mem_cgroup *memcg, >> return freed; >> } >> +struct shrinker *shrinker_alloc(unsigned int flags, const char *fmt, >> ...) >> +{ >> + struct shrinker *shrinker; >> + unsigned int size; >> + va_list ap; >> + int err; >> + >> + shrinker = kzalloc(sizeof(struct shrinker), GFP_KERNEL); >> + if (!shrinker) >> + return NULL; >> + >> + va_start(ap, fmt); >> + err = shrinker_debugfs_name_alloc(shrinker, fmt, ap); >> + va_end(ap); >> + if (err) >> + goto err_name; >> + >> + shrinker->flags = flags | SHRINKER_ALLOCATED; >> + shrinker->seeks = DEFAULT_SEEKS; >> + >> + if (flags & SHRINKER_MEMCG_AWARE) { >> + err = prealloc_memcg_shrinker(shrinker); >> + if (err == -ENOSYS) >> + shrinker->flags &= ~SHRINKER_MEMCG_AWARE; >> + else if (err == 0) >> + goto done; >> + else >> + goto err_flags; > > Actually, the code here is a little confusing me when I fist look > at it. I think there could be some improvements here. Something > like: > > if (flags & SHRINKER_MEMCG_AWARE) { > err = prealloc_memcg_shrinker(shrinker); > if (err == -ENOSYS) { > /* Memcg is not supported and fallback to > non-memcg-aware shrinker. */ > shrinker->flags &= ~SHRINKER_MEMCG_AWARE; > goto non-memcg; > } > > if (err) > goto err_flags; > return shrinker; > } > > non-memcg: > [...] > return shrinker; > > In this case, the code becomes more clear (at least for me). We have > split the > code into two part, one is handling memcg-aware case, another is > non-memcg-aware > case. Any side will have a explicit "return" keyword to return once > succeeds. > It is a little implicit that the previous one uses "goto done". > > And the tag of "non-memcg" is also a good annotation to tell us the > following > code handles non-memcg-aware case. Make sense, will do. > >> + } >> + >> + /* >> + * The nr_deferred is available on per memcg level for memcg aware >> + * shrinkers, so only allocate nr_deferred in the following cases: >> + * - non memcg aware shrinkers >> + * - !CONFIG_MEMCG >> + * - memcg is disabled by kernel command line >> + */ >> + size = sizeof(*shrinker->nr_deferred); >> + if (flags & SHRINKER_NUMA_AWARE) >> + size *= nr_node_ids; >> + >> + shrinker->nr_deferred = kzalloc(size, GFP_KERNEL); >> + if (!shrinker->nr_deferred) >> + goto err_flags; >> + >> +done: >> + return shrinker; >> + >> +err_flags: >> + shrinker_debugfs_name_free(shrinker); >> +err_name: >> + kfree(shrinker); >> + return NULL; >> +} >> +EXPORT_SYMBOL_GPL(shrinker_alloc); >> + >> +void shrinker_register(struct shrinker *shrinker) >> +{ >> + if (unlikely(!(shrinker->flags & SHRINKER_ALLOCATED))) { >> + pr_warn("Must use shrinker_alloc() to dynamically allocate >> the shrinker"); >> + return; >> + } >> + >> + down_write(&shrinker_rwsem); >> + list_add_tail(&shrinker->list, &shrinker_list); >> + shrinker->flags |= SHRINKER_REGISTERED; >> + shrinker_debugfs_add(shrinker); >> + up_write(&shrinker_rwsem); >> +} >> +EXPORT_SYMBOL_GPL(shrinker_register); >> + >> +void shrinker_free(struct shrinker *shrinker) >> +{ >> + struct dentry *debugfs_entry = NULL; >> + int debugfs_id; >> + >> + if (!shrinker) >> + return; >> + >> + down_write(&shrinker_rwsem); >> + if (shrinker->flags & SHRINKER_REGISTERED) { >> + list_del(&shrinker->list); >> + debugfs_entry = shrinker_debugfs_detach(shrinker, &debugfs_id); >> + shrinker->flags &= ~SHRINKER_REGISTERED; >> + } else { >> + shrinker_debugfs_name_free(shrinker); > > We could remove shrinker_debugfs_name_free() calling from > shrinker_debugfs_detach(), then we could call > shrinker_debugfs_name_free() anyway, otherwise, it it a little > weird for me. And the srinker name is allocated from shrinker_alloc(), > so I think it it reasonable for shrinker_free() to free the > shrinker name. OK, will do. > > Thanks. > >> + } >> + >> + if (shrinker->flags & SHRINKER_MEMCG_AWARE) >> + unregister_memcg_shrinker(shrinker); >> + up_write(&shrinker_rwsem); >> + >> + if (debugfs_entry) >> + shrinker_debugfs_remove(debugfs_entry, debugfs_id); >> + >> + kfree(shrinker->nr_deferred); >> + shrinker->nr_deferred = NULL; >> + >> + kfree(shrinker); >> +} >> +EXPORT_SYMBOL_GPL(shrinker_free); >> + >> /* >> * Add a shrinker callback to be called from the vm. >> */ >> diff --git a/mm/shrinker_debug.c b/mm/shrinker_debug.c >> index e4ce509f619e..38452f539f40 100644 >> --- a/mm/shrinker_debug.c >> +++ b/mm/shrinker_debug.c >> @@ -193,6 +193,20 @@ int shrinker_debugfs_add(struct shrinker *shrinker) >> return 0; >> } >> +int shrinker_debugfs_name_alloc(struct shrinker *shrinker, const char >> *fmt, >> + va_list ap) >> +{ >> + shrinker->name = kvasprintf_const(GFP_KERNEL, fmt, ap); >> + >> + return shrinker->name ? 0 : -ENOMEM; >> +} >> + >> +void shrinker_debugfs_name_free(struct shrinker *shrinker) >> +{ >> + kfree_const(shrinker->name); >> + shrinker->name = NULL; >> +} > > It it better to move both helpers to internal.h and mark them as inline > since both are very simple enough. OK, will do. Hi Andrew, below is the cleanup patch, which has a small conflict with [PATCH v6 41/45]: From 5bc2b77484f5cd4616e510158f91c8877bd033ad Mon Sep 17 00:00:00 2001 From: Qi Zheng <zhengqi.arch@bytedance.com> Date: Mon, 18 Sep 2023 10:41:15 +0000 Subject: [PATCH] mm: shrinker: some cleanup Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> --- include/linux/shrinker.h | 14 ++++++++------ mm/internal.h | 17 ++++++++++++++--- mm/shrinker.c | 20 ++++++++++++-------- mm/shrinker_debug.c | 16 ---------------- 4 files changed, 34 insertions(+), 33 deletions(-) diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h index 3f3fd9974ce5..f4a5249f00b2 100644 --- a/include/linux/shrinker.h +++ b/include/linux/shrinker.h @@ -88,16 +88,18 @@ struct shrinker { }; #define DEFAULT_SEEKS 2 /* A good number if you don't know better. */ -/* Flags */ -#define SHRINKER_REGISTERED (1 << 0) -#define SHRINKER_NUMA_AWARE (1 << 1) -#define SHRINKER_MEMCG_AWARE (1 << 2) +/* Internal flags */ +#define SHRINKER_REGISTERED BIT(0) +#define SHRINKER_ALLOCATED BIT(1) + +/* Flags for users to use */ +#define SHRINKER_NUMA_AWARE BIT(2) +#define SHRINKER_MEMCG_AWARE BIT(3) /* * It just makes sense when the shrinker is also MEMCG_AWARE for now, * non-MEMCG_AWARE shrinker should not have this flag set. */ -#define SHRINKER_NONSLAB (1 << 3) -#define SHRINKER_ALLOCATED (1 << 4) +#define SHRINKER_NONSLAB BIT(4) struct shrinker *shrinker_alloc(unsigned int flags, const char *fmt, ...); void shrinker_register(struct shrinker *shrinker); diff --git a/mm/internal.h b/mm/internal.h index b9a116dce28e..0f418a11c7a8 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -1161,10 +1161,21 @@ unsigned long shrink_slab(gfp_t gfp_mask, int nid, struct mem_cgroup *memcg, int priority); #ifdef CONFIG_SHRINKER_DEBUG +static inline int shrinker_debugfs_name_alloc(struct shrinker *shrinker, + const char *fmt, va_list ap) +{ + shrinker->name = kvasprintf_const(GFP_KERNEL, fmt, ap); + + return shrinker->name ? 0 : -ENOMEM; +} + +static inline void shrinker_debugfs_name_free(struct shrinker *shrinker) +{ + kfree_const(shrinker->name); + shrinker->name = NULL; +} + extern int shrinker_debugfs_add(struct shrinker *shrinker); -extern int shrinker_debugfs_name_alloc(struct shrinker *shrinker, - const char *fmt, va_list ap); -extern void shrinker_debugfs_name_free(struct shrinker *shrinker); extern struct dentry *shrinker_debugfs_detach(struct shrinker *shrinker, int *debugfs_id); extern void shrinker_debugfs_remove(struct dentry *debugfs_entry, diff --git a/mm/shrinker.c b/mm/shrinker.c index 201211a67827..d1032a4d5684 100644 --- a/mm/shrinker.c +++ b/mm/shrinker.c @@ -572,18 +572,23 @@ struct shrinker *shrinker_alloc(unsigned int flags, const char *fmt, ...) if (flags & SHRINKER_MEMCG_AWARE) { err = prealloc_memcg_shrinker(shrinker); - if (err == -ENOSYS) + if (err == -ENOSYS) { + /* Memcg is not supported, fallback to non-memcg-aware shrinker. */ shrinker->flags &= ~SHRINKER_MEMCG_AWARE; - else if (err == 0) - goto done; - else + goto non_memcg; + } + + if (err) goto err_flags; + + return shrinker; } +non_memcg: /* * The nr_deferred is available on per memcg level for memcg aware * shrinkers, so only allocate nr_deferred in the following cases: - * - non memcg aware shrinkers + * - non-memcg-aware shrinkers * - !CONFIG_MEMCG * - memcg is disabled by kernel command line */ @@ -595,7 +600,6 @@ struct shrinker *shrinker_alloc(unsigned int flags, const char *fmt, ...) if (!shrinker->nr_deferred) goto err_flags; -done: return shrinker; err_flags: @@ -634,10 +638,10 @@ void shrinker_free(struct shrinker *shrinker) list_del(&shrinker->list); debugfs_entry = shrinker_debugfs_detach(shrinker, &debugfs_id); shrinker->flags &= ~SHRINKER_REGISTERED; - } else { - shrinker_debugfs_name_free(shrinker); } + shrinker_debugfs_name_free(shrinker); + if (shrinker->flags & SHRINKER_MEMCG_AWARE) unregister_memcg_shrinker(shrinker); up_write(&shrinker_rwsem); diff --git a/mm/shrinker_debug.c b/mm/shrinker_debug.c index 38452f539f40..24aebe7c24cc 100644 --- a/mm/shrinker_debug.c +++ b/mm/shrinker_debug.c @@ -193,20 +193,6 @@ int shrinker_debugfs_add(struct shrinker *shrinker) return 0; } -int shrinker_debugfs_name_alloc(struct shrinker *shrinker, const char *fmt, - va_list ap) -{ - shrinker->name = kvasprintf_const(GFP_KERNEL, fmt, ap); - - return shrinker->name ? 0 : -ENOMEM; -} - -void shrinker_debugfs_name_free(struct shrinker *shrinker) -{ - kfree_const(shrinker->name); - shrinker->name = NULL; -} - int shrinker_debugfs_rename(struct shrinker *shrinker, const char *fmt, ...) { struct dentry *entry; @@ -255,8 +241,6 @@ struct dentry *shrinker_debugfs_detach(struct shrinker *shrinker, lockdep_assert_held(&shrinker_rwsem); - shrinker_debugfs_name_free(shrinker); - *debugfs_id = entry ? shrinker->debugfs_id : -1; shrinker->debugfs_entry = NULL; -- 2.30.2 > > Thanks. > >> + >> int shrinker_debugfs_rename(struct shrinker *shrinker, const char >> *fmt, ...) >> { >> struct dentry *entry; >> @@ -241,8 +255,7 @@ struct dentry *shrinker_debugfs_detach(struct >> shrinker *shrinker, >> lockdep_assert_held(&shrinker_rwsem); >> - kfree_const(shrinker->name); >> - shrinker->name = NULL; >> + shrinker_debugfs_name_free(shrinker); >> *debugfs_id = entry ? shrinker->debugfs_id : -1; >> shrinker->debugfs_entry = NULL; >
> On Sep 18, 2023, at 20:06, Qi Zheng <zhengqi.arch@bytedance.com> wrote: > > > > On 2023/9/18 17:03, Muchun Song wrote: > > ... > >>> @@ -95,6 +97,11 @@ struct shrinker { >>> * non-MEMCG_AWARE shrinker should not have this flag set. >>> */ >>> #define SHRINKER_NONSLAB (1 << 3) >>> +#define SHRINKER_ALLOCATED (1 << 4) >> It is better to add a comment here to tell users >> it is only used by internals of shrinker. The users >> are not supposed to pass this flag to shrink APIs. > > OK, will annotate SHRINKER_REGISTERED and SHRINKER_ALLOCATED > as flags used internally. > >>> + >>> +struct shrinker *shrinker_alloc(unsigned int flags, const char *fmt, ...); >>> +void shrinker_register(struct shrinker *shrinker); >>> +void shrinker_free(struct shrinker *shrinker); >>> extern int __printf(2, 3) prealloc_shrinker(struct shrinker *shrinker, >>> const char *fmt, ...); >>> diff --git a/mm/internal.h b/mm/internal.h >>> index 0471d6326d01..5587cae20ebf 100644 >>> --- a/mm/internal.h >>> +++ b/mm/internal.h >>> @@ -1161,6 +1161,9 @@ unsigned long shrink_slab(gfp_t gfp_mask, int nid, struct mem_cgroup *memcg, >>> #ifdef CONFIG_SHRINKER_DEBUG >>> extern int shrinker_debugfs_add(struct shrinker *shrinker); >>> +extern int shrinker_debugfs_name_alloc(struct shrinker *shrinker, >>> + const char *fmt, va_list ap); >>> +extern void shrinker_debugfs_name_free(struct shrinker *shrinker); >>> extern struct dentry *shrinker_debugfs_detach(struct shrinker *shrinker, >>> int *debugfs_id); >>> extern void shrinker_debugfs_remove(struct dentry *debugfs_entry, >>> @@ -1170,6 +1173,14 @@ static inline int shrinker_debugfs_add(struct shrinker *shrinker) >>> { >>> return 0; >>> } >>> +static inline int shrinker_debugfs_name_alloc(struct shrinker *shrinker, >>> + const char *fmt, va_list ap) >>> +{ >>> + return 0; >>> +} >>> +static inline void shrinker_debugfs_name_free(struct shrinker *shrinker) >>> +{ >>> +} >>> static inline struct dentry *shrinker_debugfs_detach(struct shrinker *shrinker, >>> int *debugfs_id) >>> { >>> diff --git a/mm/shrinker.c b/mm/shrinker.c >>> index a16cd448b924..201211a67827 100644 >>> --- a/mm/shrinker.c >>> +++ b/mm/shrinker.c >>> @@ -550,6 +550,108 @@ unsigned long shrink_slab(gfp_t gfp_mask, int nid, struct mem_cgroup *memcg, >>> return freed; >>> } >>> +struct shrinker *shrinker_alloc(unsigned int flags, const char *fmt, ...) >>> +{ >>> + struct shrinker *shrinker; >>> + unsigned int size; >>> + va_list ap; >>> + int err; >>> + >>> + shrinker = kzalloc(sizeof(struct shrinker), GFP_KERNEL); >>> + if (!shrinker) >>> + return NULL; >>> + >>> + va_start(ap, fmt); >>> + err = shrinker_debugfs_name_alloc(shrinker, fmt, ap); >>> + va_end(ap); >>> + if (err) >>> + goto err_name; >>> + >>> + shrinker->flags = flags | SHRINKER_ALLOCATED; >>> + shrinker->seeks = DEFAULT_SEEKS; >>> + >>> + if (flags & SHRINKER_MEMCG_AWARE) { >>> + err = prealloc_memcg_shrinker(shrinker); >>> + if (err == -ENOSYS) >>> + shrinker->flags &= ~SHRINKER_MEMCG_AWARE; >>> + else if (err == 0) >>> + goto done; >>> + else >>> + goto err_flags; >> Actually, the code here is a little confusing me when I fist look >> at it. I think there could be some improvements here. Something >> like: >> if (flags & SHRINKER_MEMCG_AWARE) { >> err = prealloc_memcg_shrinker(shrinker); >> if (err == -ENOSYS) { >> /* Memcg is not supported and fallback to non-memcg-aware shrinker. */ >> shrinker->flags &= ~SHRINKER_MEMCG_AWARE; >> goto non-memcg; >> } >> if (err) >> goto err_flags; >> return shrinker; >> } >> non-memcg: >> [...] >> return shrinker; >> In this case, the code becomes more clear (at least for me). We have split the >> code into two part, one is handling memcg-aware case, another is non-memcg-aware >> case. Any side will have a explicit "return" keyword to return once succeeds. >> It is a little implicit that the previous one uses "goto done". >> And the tag of "non-memcg" is also a good annotation to tell us the following >> code handles non-memcg-aware case. > > Make sense, will do. > >>> + } >>> + >>> + /* >>> + * The nr_deferred is available on per memcg level for memcg aware >>> + * shrinkers, so only allocate nr_deferred in the following cases: >>> + * - non memcg aware shrinkers >>> + * - !CONFIG_MEMCG >>> + * - memcg is disabled by kernel command line >>> + */ >>> + size = sizeof(*shrinker->nr_deferred); >>> + if (flags & SHRINKER_NUMA_AWARE) >>> + size *= nr_node_ids; >>> + >>> + shrinker->nr_deferred = kzalloc(size, GFP_KERNEL); >>> + if (!shrinker->nr_deferred) >>> + goto err_flags; >>> + >>> +done: >>> + return shrinker; >>> + >>> +err_flags: >>> + shrinker_debugfs_name_free(shrinker); >>> +err_name: >>> + kfree(shrinker); >>> + return NULL; >>> +} >>> +EXPORT_SYMBOL_GPL(shrinker_alloc); >>> + >>> +void shrinker_register(struct shrinker *shrinker) >>> +{ >>> + if (unlikely(!(shrinker->flags & SHRINKER_ALLOCATED))) { >>> + pr_warn("Must use shrinker_alloc() to dynamically allocate the shrinker"); >>> + return; >>> + } >>> + >>> + down_write(&shrinker_rwsem); >>> + list_add_tail(&shrinker->list, &shrinker_list); >>> + shrinker->flags |= SHRINKER_REGISTERED; >>> + shrinker_debugfs_add(shrinker); >>> + up_write(&shrinker_rwsem); >>> +} >>> +EXPORT_SYMBOL_GPL(shrinker_register); >>> + >>> +void shrinker_free(struct shrinker *shrinker) >>> +{ >>> + struct dentry *debugfs_entry = NULL; >>> + int debugfs_id; >>> + >>> + if (!shrinker) >>> + return; >>> + >>> + down_write(&shrinker_rwsem); >>> + if (shrinker->flags & SHRINKER_REGISTERED) { >>> + list_del(&shrinker->list); >>> + debugfs_entry = shrinker_debugfs_detach(shrinker, &debugfs_id); >>> + shrinker->flags &= ~SHRINKER_REGISTERED; >>> + } else { >>> + shrinker_debugfs_name_free(shrinker); >> We could remove shrinker_debugfs_name_free() calling from >> shrinker_debugfs_detach(), then we could call >> shrinker_debugfs_name_free() anyway, otherwise, it it a little >> weird for me. And the srinker name is allocated from shrinker_alloc(), >> so I think it it reasonable for shrinker_free() to free the >> shrinker name. > > OK, will do. > >> Thanks. >>> + } >>> + >>> + if (shrinker->flags & SHRINKER_MEMCG_AWARE) >>> + unregister_memcg_shrinker(shrinker); >>> + up_write(&shrinker_rwsem); >>> + >>> + if (debugfs_entry) >>> + shrinker_debugfs_remove(debugfs_entry, debugfs_id); >>> + >>> + kfree(shrinker->nr_deferred); >>> + shrinker->nr_deferred = NULL; >>> + >>> + kfree(shrinker); >>> +} >>> +EXPORT_SYMBOL_GPL(shrinker_free); >>> + >>> /* >>> * Add a shrinker callback to be called from the vm. >>> */ >>> diff --git a/mm/shrinker_debug.c b/mm/shrinker_debug.c >>> index e4ce509f619e..38452f539f40 100644 >>> --- a/mm/shrinker_debug.c >>> +++ b/mm/shrinker_debug.c >>> @@ -193,6 +193,20 @@ int shrinker_debugfs_add(struct shrinker *shrinker) >>> return 0; >>> } >>> +int shrinker_debugfs_name_alloc(struct shrinker *shrinker, const char *fmt, >>> + va_list ap) >>> +{ >>> + shrinker->name = kvasprintf_const(GFP_KERNEL, fmt, ap); >>> + >>> + return shrinker->name ? 0 : -ENOMEM; >>> +} >>> + >>> +void shrinker_debugfs_name_free(struct shrinker *shrinker) >>> +{ >>> + kfree_const(shrinker->name); >>> + shrinker->name = NULL; >>> +} >> It it better to move both helpers to internal.h and mark them as inline >> since both are very simple enough. > > OK, will do. > > Hi Andrew, below is the cleanup patch, which has a small conflict > with [PATCH v6 41/45]: > > From 5bc2b77484f5cd4616e510158f91c8877bd033ad Mon Sep 17 00:00:00 2001 > From: Qi Zheng <zhengqi.arch@bytedance.com> > Date: Mon, 18 Sep 2023 10:41:15 +0000 > Subject: [PATCH] mm: shrinker: some cleanup > > Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Thanks.
diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h index 6b5843c3b827..3f3fd9974ce5 100644 --- a/include/linux/shrinker.h +++ b/include/linux/shrinker.h @@ -70,6 +70,8 @@ struct shrinker { int seeks; /* seeks to recreate an obj */ unsigned flags; + void *private_data; + /* These are for internal use */ struct list_head list; #ifdef CONFIG_MEMCG @@ -95,6 +97,11 @@ struct shrinker { * non-MEMCG_AWARE shrinker should not have this flag set. */ #define SHRINKER_NONSLAB (1 << 3) +#define SHRINKER_ALLOCATED (1 << 4) + +struct shrinker *shrinker_alloc(unsigned int flags, const char *fmt, ...); +void shrinker_register(struct shrinker *shrinker); +void shrinker_free(struct shrinker *shrinker); extern int __printf(2, 3) prealloc_shrinker(struct shrinker *shrinker, const char *fmt, ...); diff --git a/mm/internal.h b/mm/internal.h index 0471d6326d01..5587cae20ebf 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -1161,6 +1161,9 @@ unsigned long shrink_slab(gfp_t gfp_mask, int nid, struct mem_cgroup *memcg, #ifdef CONFIG_SHRINKER_DEBUG extern int shrinker_debugfs_add(struct shrinker *shrinker); +extern int shrinker_debugfs_name_alloc(struct shrinker *shrinker, + const char *fmt, va_list ap); +extern void shrinker_debugfs_name_free(struct shrinker *shrinker); extern struct dentry *shrinker_debugfs_detach(struct shrinker *shrinker, int *debugfs_id); extern void shrinker_debugfs_remove(struct dentry *debugfs_entry, @@ -1170,6 +1173,14 @@ static inline int shrinker_debugfs_add(struct shrinker *shrinker) { return 0; } +static inline int shrinker_debugfs_name_alloc(struct shrinker *shrinker, + const char *fmt, va_list ap) +{ + return 0; +} +static inline void shrinker_debugfs_name_free(struct shrinker *shrinker) +{ +} static inline struct dentry *shrinker_debugfs_detach(struct shrinker *shrinker, int *debugfs_id) { diff --git a/mm/shrinker.c b/mm/shrinker.c index a16cd448b924..201211a67827 100644 --- a/mm/shrinker.c +++ b/mm/shrinker.c @@ -550,6 +550,108 @@ unsigned long shrink_slab(gfp_t gfp_mask, int nid, struct mem_cgroup *memcg, return freed; } +struct shrinker *shrinker_alloc(unsigned int flags, const char *fmt, ...) +{ + struct shrinker *shrinker; + unsigned int size; + va_list ap; + int err; + + shrinker = kzalloc(sizeof(struct shrinker), GFP_KERNEL); + if (!shrinker) + return NULL; + + va_start(ap, fmt); + err = shrinker_debugfs_name_alloc(shrinker, fmt, ap); + va_end(ap); + if (err) + goto err_name; + + shrinker->flags = flags | SHRINKER_ALLOCATED; + shrinker->seeks = DEFAULT_SEEKS; + + if (flags & SHRINKER_MEMCG_AWARE) { + err = prealloc_memcg_shrinker(shrinker); + if (err == -ENOSYS) + shrinker->flags &= ~SHRINKER_MEMCG_AWARE; + else if (err == 0) + goto done; + else + goto err_flags; + } + + /* + * The nr_deferred is available on per memcg level for memcg aware + * shrinkers, so only allocate nr_deferred in the following cases: + * - non memcg aware shrinkers + * - !CONFIG_MEMCG + * - memcg is disabled by kernel command line + */ + size = sizeof(*shrinker->nr_deferred); + if (flags & SHRINKER_NUMA_AWARE) + size *= nr_node_ids; + + shrinker->nr_deferred = kzalloc(size, GFP_KERNEL); + if (!shrinker->nr_deferred) + goto err_flags; + +done: + return shrinker; + +err_flags: + shrinker_debugfs_name_free(shrinker); +err_name: + kfree(shrinker); + return NULL; +} +EXPORT_SYMBOL_GPL(shrinker_alloc); + +void shrinker_register(struct shrinker *shrinker) +{ + if (unlikely(!(shrinker->flags & SHRINKER_ALLOCATED))) { + pr_warn("Must use shrinker_alloc() to dynamically allocate the shrinker"); + return; + } + + down_write(&shrinker_rwsem); + list_add_tail(&shrinker->list, &shrinker_list); + shrinker->flags |= SHRINKER_REGISTERED; + shrinker_debugfs_add(shrinker); + up_write(&shrinker_rwsem); +} +EXPORT_SYMBOL_GPL(shrinker_register); + +void shrinker_free(struct shrinker *shrinker) +{ + struct dentry *debugfs_entry = NULL; + int debugfs_id; + + if (!shrinker) + return; + + down_write(&shrinker_rwsem); + if (shrinker->flags & SHRINKER_REGISTERED) { + list_del(&shrinker->list); + debugfs_entry = shrinker_debugfs_detach(shrinker, &debugfs_id); + shrinker->flags &= ~SHRINKER_REGISTERED; + } else { + shrinker_debugfs_name_free(shrinker); + } + + if (shrinker->flags & SHRINKER_MEMCG_AWARE) + unregister_memcg_shrinker(shrinker); + up_write(&shrinker_rwsem); + + if (debugfs_entry) + shrinker_debugfs_remove(debugfs_entry, debugfs_id); + + kfree(shrinker->nr_deferred); + shrinker->nr_deferred = NULL; + + kfree(shrinker); +} +EXPORT_SYMBOL_GPL(shrinker_free); + /* * Add a shrinker callback to be called from the vm. */ diff --git a/mm/shrinker_debug.c b/mm/shrinker_debug.c index e4ce509f619e..38452f539f40 100644 --- a/mm/shrinker_debug.c +++ b/mm/shrinker_debug.c @@ -193,6 +193,20 @@ int shrinker_debugfs_add(struct shrinker *shrinker) return 0; } +int shrinker_debugfs_name_alloc(struct shrinker *shrinker, const char *fmt, + va_list ap) +{ + shrinker->name = kvasprintf_const(GFP_KERNEL, fmt, ap); + + return shrinker->name ? 0 : -ENOMEM; +} + +void shrinker_debugfs_name_free(struct shrinker *shrinker) +{ + kfree_const(shrinker->name); + shrinker->name = NULL; +} + int shrinker_debugfs_rename(struct shrinker *shrinker, const char *fmt, ...) { struct dentry *entry; @@ -241,8 +255,7 @@ struct dentry *shrinker_debugfs_detach(struct shrinker *shrinker, lockdep_assert_held(&shrinker_rwsem); - kfree_const(shrinker->name); - shrinker->name = NULL; + shrinker_debugfs_name_free(shrinker); *debugfs_id = entry ? shrinker->debugfs_id : -1; shrinker->debugfs_entry = NULL;
Currently, the shrinker instances can be divided into the following three types: a) global shrinker instance statically defined in the kernel, such as workingset_shadow_shrinker. b) global shrinker instance statically defined in the kernel modules, such as mmu_shrinker in x86. c) shrinker instance embedded in other structures. For case a, the memory of shrinker instance is never freed. For case b, the memory of shrinker instance will be freed after synchronize_rcu() when the module is unloaded. For case c, the memory of shrinker instance will be freed along with the structure it is embedded in. In preparation for implementing lockless slab shrink, we need to dynamically allocate those shrinker instances in case c, then the memory can be dynamically freed alone by calling kfree_rcu(). So this commit adds the following new APIs for dynamically allocating shrinker, and add a private_data field to struct shrinker to record and get the original embedded structure. 1. shrinker_alloc() Used to allocate shrinker instance itself and related memory, it will return a pointer to the shrinker instance on success and NULL on failure. 2. shrinker_register() Used to register the shrinker instance, which is same as the current register_shrinker_prepared(). 3. shrinker_free() Used to unregister (if needed) and free the shrinker instance. In order to simplify shrinker-related APIs and make shrinker more independent of other kernel mechanisms, subsequent submissions will use the above API to convert all shrinkers (including case a and b) to dynamically allocated, and then remove all existing APIs. This will also have another advantage mentioned by Dave Chinner: ``` The other advantage of this is that it will break all the existing out of tree code and third party modules using the old API and will no longer work with a kernel using lockless slab shrinkers. They need to break (both at the source and binary levels) to stop bad things from happening due to using unconverted shrinkers in the new setup. ``` Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> --- include/linux/shrinker.h | 7 +++ mm/internal.h | 11 +++++ mm/shrinker.c | 102 +++++++++++++++++++++++++++++++++++++++ mm/shrinker_debug.c | 17 ++++++- 4 files changed, 135 insertions(+), 2 deletions(-)