Message ID | 1351931915-1701-2-git-send-email-tj@kernel.org (mailing list archive) |
---|---|
State | Not Applicable, archived |
On 11/03/2012 09:38 AM, Tejun Heo wrote:
> Currently, there's no way for a controller to find out whether a new
> cgroup finished all ->create() allocations successfully and is
> considered "live" by cgroup.
>
> This becomes a problem later when we add generic descendants walking
> to cgroup which can be used by controllers as controllers don't have a
> synchronization point where it can synchronize against new cgroups
> appearing in such walks.
>
> This patch adds ->post_create(). It's called after all ->create()
> succeeded and the cgroup is linked into the generic cgroup hierarchy.
> This plays the counterpart of ->pre_destroy().
>
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Glauber Costa <glommer@parallels.com>

Tejun,

If we do it this way, we end up with two callbacks that are called after
create: post_clone and post_create.

I myself prefer the approach I took, which converts post_clone into
post_create, and would prefer if you would pick that up.

For me, post_clone is totally a glitch that should not exist. Merging
this with post_create gives the following semantics:

* A while after cgroup creation, you will get a callback. In that
  callback, you do whatever initialization you may need that you could
  not do in create.

Why is reacting to a flag being set any different?
On Sat 03-11-12 01:38:27, Tejun Heo wrote:
> Currently, there's no way for a controller to find out whether a new
> cgroup finished all ->create() allocations successfully and is
> considered "live" by cgroup.
>
> This becomes a problem later when we add generic descendants walking
> to cgroup which can be used by controllers as controllers don't have a
> synchronization point where it can synchronize against new cgroups
> appearing in such walks.
>
> This patch adds ->post_create(). It's called after all ->create()
> succeeded and the cgroup is linked into the generic cgroup hierarchy.
> This plays the counterpart of ->pre_destroy().

Hmm, I had to look at "cgroup_freezer: implement proper hierarchy
support" to actually understand what the callback is good for. The above
sounds as if the callback is needed when a controller wants to use
the new iterators or when pre_destroy is defined.

I think it would be helpful if the changelog described that the callback
is needed when the controller keeps a mutable shared state for the
hierarchy. For example, the memory controller doesn't have any such
strict requirement, so we can safely use your new iterators without
pre_destroy.

Anyway, I like this change because the shared state is now really easy
to implement.
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Glauber Costa <glommer@parallels.com>

Acked-by: Michal Hocko <mhocko@suse.cz>

> ---
>  include/linux/cgroup.h |  1 +
>  kernel/cgroup.c        | 12 ++++++++++--
>  2 files changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
> index fe876a7..b442122 100644
> --- a/include/linux/cgroup.h
> +++ b/include/linux/cgroup.h
> @@ -438,6 +438,7 @@ int cgroup_taskset_size(struct cgroup_taskset *tset);
>
>  struct cgroup_subsys {
>  	struct cgroup_subsys_state *(*create)(struct cgroup *cgrp);
> +	void (*post_create)(struct cgroup *cgrp);
>  	void (*pre_destroy)(struct cgroup *cgrp);
>  	void (*destroy)(struct cgroup *cgrp);
>  	int (*can_attach)(struct cgroup *cgrp, struct cgroup_taskset *tset);
> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
> index e3045ad..f05d992 100644
> --- a/kernel/cgroup.c
> +++ b/kernel/cgroup.c
> @@ -4060,10 +4060,15 @@ static long cgroup_create(struct cgroup *parent, struct dentry *dentry,
>  	if (err < 0)
>  		goto err_remove;
>
> -	/* each css holds a ref to the cgroup's dentry */
> -	for_each_subsys(root, ss)
> +	for_each_subsys(root, ss) {
> +		/* each css holds a ref to the cgroup's dentry */
>  		dget(dentry);
>
> +		/* creation succeeded, notify subsystems */
> +		if (ss->post_create)
> +			ss->post_create(cgrp);
> +	}
> +
>  	/* The cgroup directory was pre-locked for us */
>  	BUG_ON(!mutex_is_locked(&cgrp->dentry->d_inode->i_mutex));
>
> @@ -4281,6 +4286,9 @@ static void __init cgroup_init_subsys(struct cgroup_subsys *ss)
>
>  	ss->active = 1;
>
> +	if (ss->post_create)
> +		ss->post_create(&ss->root->top_cgroup);
> +
>  	/* this function shouldn't be used with modular subsystems, since they
>  	 * need to register a subsys_id, among other things */
>  	BUG_ON(ss->module);
> --
> 1.7.11.7
>
Hello, Michal.

On Wed, Nov 07, 2012 at 04:25:16PM +0100, Michal Hocko wrote:
> > This patch adds ->post_create(). It's called after all ->create()
> > succeeded and the cgroup is linked into the generic cgroup hierarchy.
> > This plays the counterpart of ->pre_destroy().
>
> Hmm, I had to look at "cgroup_freezer: implement proper hierarchy
> support" to actually understand what the callback is good for. The above
> sounds as if the callback is needed when a controller wants to use
> the new iterators or when pre_destroy is defined.
>
> I think it would be helpful if the changelog described that the callback
> is needed when the controller keeps a mutable shared state for the
> hierarchy. For example, the memory controller doesn't have any such
> strict requirement, so we can safely use your new iterators without
> pre_destroy.

Hmm.... will try to explain it but I think it might be best to just
refer to the later patch for details. It's a bit tricky to explain.

Thanks.
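The "mutable shared state" case discussed above is easier to follow with a concrete model. The following is only a user-space sketch under stated assumptions, not the cgroup_freezer code and not kernel API: `struct node`, `state_lock`, `node_create()`, `node_link()`, `node_post_create()` and `node_set_frozen()` are all hypothetical names, and a pthread mutex stands in for whatever lock a controller would use. It shows a post_create-style hook acting as the synchronization point: a hierarchy walk skips nodes that are not yet online, and a node inherits its parent's state at the moment it goes online, so a change made by the walk is never lost.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_CHILDREN 8

/* Hypothetical per-cgroup controller state; not the kernel's layout. */
struct node {
	struct node *parent;
	struct node *children[MAX_CHILDREN];
	size_t nr_children;
	int online;	/* set by node_post_create(), under state_lock */
	int frozen;	/* mutable state shared down the hierarchy */
};

/* Stand-in for the controller's lock protecting the shared state. */
static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;

/* ->create() analogue: private setup only, nothing shared is touched. */
void node_create(struct node *n, struct node *parent)
{
	n->parent = parent;
	n->nr_children = 0;
	n->online = 0;
	n->frozen = 0;
}

/* Stand-in for the core linking the new node into the hierarchy. */
void node_link(struct node *n)
{
	if (!n->parent)
		return;
	pthread_mutex_lock(&state_lock);
	assert(n->parent->nr_children < MAX_CHILDREN);
	n->parent->children[n->parent->nr_children++] = n;
	pthread_mutex_unlock(&state_lock);
}

/* ->post_create() analogue: inherit parent state and go online atomically. */
void node_post_create(struct node *n)
{
	pthread_mutex_lock(&state_lock);
	if (n->parent)
		n->frozen = n->parent->frozen;
	n->online = 1;
	pthread_mutex_unlock(&state_lock);
}

/* Descendant walk; nodes that are linked but not yet online are skipped. */
static void propagate_locked(struct node *n, int frozen)
{
	size_t i;

	n->frozen = frozen;
	for (i = 0; i < n->nr_children; i++)
		if (n->children[i]->online)
			propagate_locked(n->children[i], frozen);
}

/* Change the shared state for a subtree under the same lock. */
void node_set_frozen(struct node *n, int frozen)
{
	pthread_mutex_lock(&state_lock);
	propagate_locked(n, frozen);
	pthread_mutex_unlock(&state_lock);
}
```

In this model a child that was linked but had not yet reached `node_post_create()` when `node_set_frozen()` ran is left alone by the walk and picks up the new value when it goes online. A controller without such shared state has nothing to inherit at that point, which matches the observation above that it would not strictly need the callback.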
diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index fe876a7..b442122 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -438,6 +438,7 @@ int cgroup_taskset_size(struct cgroup_taskset *tset);

 struct cgroup_subsys {
 	struct cgroup_subsys_state *(*create)(struct cgroup *cgrp);
+	void (*post_create)(struct cgroup *cgrp);
 	void (*pre_destroy)(struct cgroup *cgrp);
 	void (*destroy)(struct cgroup *cgrp);
 	int (*can_attach)(struct cgroup *cgrp, struct cgroup_taskset *tset);
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index e3045ad..f05d992 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -4060,10 +4060,15 @@ static long cgroup_create(struct cgroup *parent, struct dentry *dentry,
 	if (err < 0)
 		goto err_remove;

-	/* each css holds a ref to the cgroup's dentry */
-	for_each_subsys(root, ss)
+	for_each_subsys(root, ss) {
+		/* each css holds a ref to the cgroup's dentry */
 		dget(dentry);

+		/* creation succeeded, notify subsystems */
+		if (ss->post_create)
+			ss->post_create(cgrp);
+	}
+
 	/* The cgroup directory was pre-locked for us */
 	BUG_ON(!mutex_is_locked(&cgrp->dentry->d_inode->i_mutex));

@@ -4281,6 +4286,9 @@ static void __init cgroup_init_subsys(struct cgroup_subsys *ss)

 	ss->active = 1;

+	if (ss->post_create)
+		ss->post_create(&ss->root->top_cgroup);
+
 	/* this function shouldn't be used with modular subsystems, since they
 	 * need to register a subsys_id, among other things */
 	BUG_ON(ss->module);
Currently, there's no way for a controller to find out whether a new
cgroup finished all ->create() allocations successfully and is
considered "live" by cgroup.

This becomes a problem later when we add generic descendants walking
to cgroup which can be used by controllers as controllers don't have a
synchronization point where it can synchronize against new cgroups
appearing in such walks.

This patch adds ->post_create(). It's called after all ->create()
succeeded and the cgroup is linked into the generic cgroup hierarchy.
This plays the counterpart of ->pre_destroy().

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Glauber Costa <glommer@parallels.com>
---
 include/linux/cgroup.h |  1 +
 kernel/cgroup.c        | 12 ++++++++++--
 2 files changed, 11 insertions(+), 2 deletions(-)
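
The ordering guarantee described in the changelog can be modeled outside the kernel. What follows is a minimal user-space sketch, not the code from kernel/cgroup.c: `struct subsys`, `model_cgroup_create()`, the `live` flag and the sample callbacks are hypothetical stand-ins chosen for illustration. It assumes that a failed create() unwinds the subsystems that already succeeded, and it shows that post_create() runs only once every create() has succeeded and the cgroup is marked live.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins for the kernel types. */
struct cgroup {
	int live;	/* models "linked into the generic cgroup hierarchy" */
};

struct subsys {
	int  (*create)(struct cgroup *cgrp);		/* 0 on success */
	void (*post_create)(struct cgroup *cgrp);	/* optional, may be NULL */
	void (*destroy)(struct cgroup *cgrp);		/* optional, may be NULL */
};

int post_create_calls;
int destroy_calls;

/* Sample callbacks used to exercise the model. */
int ok_create(struct cgroup *cgrp) { (void)cgrp; return 0; }
int failing_create(struct cgroup *cgrp) { (void)cgrp; return -1; }
void note_destroy(struct cgroup *cgrp) { (void)cgrp; destroy_calls++; }

void note_post_create(struct cgroup *cgrp)
{
	/* The callback may rely on the cgroup already being live. */
	assert(cgrp->live);
	post_create_calls++;
}

/* Run every create(); notify subsystems only if all of them succeeded. */
int model_cgroup_create(struct cgroup *cgrp, struct subsys **ss, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		int err = ss[i]->create(cgrp);

		if (err) {
			/* Unwind subsystems whose create() already succeeded. */
			while (i-- > 0)
				if (ss[i]->destroy)
					ss[i]->destroy(cgrp);
			return err;
		}
	}

	cgrp->live = 1;

	/* Creation succeeded, notify subsystems. */
	for (i = 0; i < n; i++)
		if (ss[i]->post_create)
			ss[i]->post_create(cgrp);

	return 0;
}
```

Calling `model_cgroup_create()` with one subsystem whose create() fails leaves `post_create_calls` at zero, while a call where every create() succeeds invokes post_create() once per subsystem. The patch itself issues dget() and post_create() in the same loop; the model splits the steps only to make the ordering explicit.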