Message ID | 20110831102100.GA2828@mtj.dyndns.org (mailing list archive) |
---|---|
State | Not Applicable, archived |
Headers | show |
On 08/31, Tejun Heo wrote: > > I'm in the process of moving and can only use a quite old laptop. I > tested compile but couldn't really do much else, so please proceed > with caution. Oleg, can you please ack the patches if you agree with > the updated versions? Everything looks fine. But I am already sleeping now ;) Rafael, Tejun, I'll try to re-read 1-4 tomorrow. I do not expect I'll find something interesting, just I am paranoid. Looks like, 1/4 could have an additional note in the changelog, with this patch we avoid the unnecessary try_to_freeze_cgroup() and this looks like a win to me... Oleg.
On Wed, Aug 31, 2011 at 12:21:07PM +0200, Tejun Heo wrote: > d02f52811d0e "cgroup_freezer: prepare for removal of TIF_FREEZE" moved > setting of freezer->state into freezer_change_state(); unfortunately, > while doing so, when it's beginning to freeze tasks, it sets the state > to CGROUP_FROZEN instead of CGROUP_FREEZING ending up skipping the > whole freezing state. Fix it. > > -v2: Oleg pointed out that re-freezing FROZEN cgroup could increment > system_freezing_cnt. Fixed. > > Signed-off-by: Tejun Heo <tj@kernel.org> > Reported-by: Oleg Nesterov <oleg@redhat.com> > Cc: Paul Menage <paul@paulmenage.org> > Cc: "Rafael J. Wysocki" <rjw@sisk.pl> > --- > I'm in the process of moving and can only use a quite old laptop. I > tested compile but couldn't really do much else, so please proceed > with caution. Oleg, can you please ack the patches if you agree with > the updated versions? > > Thanks. > > kernel/cgroup_freezer.c | 20 +++++++++++--------- > 1 file changed, 11 insertions(+), 9 deletions(-) > > Index: work/kernel/cgroup_freezer.c > =================================================================== > --- work.orig/kernel/cgroup_freezer.c > +++ work/kernel/cgroup_freezer.c > @@ -308,24 +308,26 @@ static int freezer_change_state(struct c > spin_lock_irq(&freezer->lock); > > update_if_frozen(cgroup, freezer); > - if (goal_state == freezer->state) > - goto out; > - > - freezer->state = goal_state; > > switch (goal_state) { > case CGROUP_THAWED: > - atomic_dec(&system_freezing_cnt); > - unfreeze_cgroup(cgroup, freezer); > + if (freezer->state != CGROUP_THAWED) { > + freezer->state = CGROUP_THAWED; > + atomic_dec(&system_freezing_cnt); > + unfreeze_cgroup(cgroup, freezer); > + } > break; > case CGROUP_FROZEN: > - atomic_inc(&system_freezing_cnt); > - retval = try_to_freeze_cgroup(cgroup, freezer); > + if (freezer->state == CGROUP_THAWED) { > + freezer->state = CGROUP_FREEZING; > + atomic_inc(&system_freezing_cnt); > + retval = try_to_freeze_cgroup(cgroup, freezer); This still doesn't look quite right. If the cgroup is FREEZING it should also call try_to_freeze_cgroup(). I think this is what's needed: if (freezer->state == CGROUP_THAWED) atomic_inc(&system_freezing_cnt); freezer->state = CGROUP_FREEZING; retval = try_to_freeze_cgroup(cgroup, freezer); > + } > break; > default: > BUG(); > } Cheers, -Matt Helsley
Hello, Matt. On Thu, Sep 01, 2011 at 05:42:31PM -0700, Matt Helsley wrote: > > case CGROUP_FROZEN: > > - atomic_inc(&system_freezing_cnt); > > - retval = try_to_freeze_cgroup(cgroup, freezer); > > + if (freezer->state == CGROUP_THAWED) { > > + freezer->state = CGROUP_FREEZING; > > + atomic_inc(&system_freezing_cnt); > > + retval = try_to_freeze_cgroup(cgroup, freezer); > > This still doesn't look quite right. If the cgroup is FREEZING it should > also call try_to_freeze_cgroup(). I think this is what's needed: > > if (freezer->state == CGROUP_THAWED) > atomic_inc(&system_freezing_cnt); > freezer->state = CGROUP_FREEZING; > retval = try_to_freeze_cgroup(cgroup, freezer); Does this make any difference? Tasks can't migrate if the cgroups are freezing and freezing state is inherited through forks. But yeah doing that for both THAWED and FROZEN might still be a good idea for safety. Thanks.
On 09/01, Matt Helsley wrote: > > On Wed, Aug 31, 2011 at 12:21:07PM +0200, Tejun Heo wrote: > > case CGROUP_FROZEN: > > - atomic_inc(&system_freezing_cnt); > > - retval = try_to_freeze_cgroup(cgroup, freezer); > > + if (freezer->state == CGROUP_THAWED) { > > + freezer->state = CGROUP_FREEZING; > > + atomic_inc(&system_freezing_cnt); > > + retval = try_to_freeze_cgroup(cgroup, freezer); > > This still doesn't look quite right. If the cgroup is FREEZING it should > also call try_to_freeze_cgroup(). I think this is what's needed: > > if (freezer->state == CGROUP_THAWED) > atomic_inc(&system_freezing_cnt); > freezer->state = CGROUP_FREEZING; > retval = try_to_freeze_cgroup(cgroup, freezer); This is what I mentioned before, to me this looks like a win. Why do we need try_to_freeze_cgroup() in this case? "for safety" could actually mean "hide the bug" ;) But I agree either way. Rafael, I think 1-4 are fine, but I think we need the simple 5/4, will send in a minute... Oleg.
Hello, On Fri, Sep 02, 2011 at 06:58:39PM +0200, Oleg Nesterov wrote: > > This still doesn't look quite right. If the cgroup is FREEZING it should > > also call try_to_freeze_cgroup(). I think this is what's needed: > > > > if (freezer->state == CGROUP_THAWED) > > atomic_inc(&system_freezing_cnt); > > freezer->state = CGROUP_FREEZING; > > retval = try_to_freeze_cgroup(cgroup, freezer); > > This is what I mentioned before, to me this looks like a win. > > Why do we need try_to_freeze_cgroup() in this case? "for safety" > could actually mean "hide the bug" ;) I guess it depends on the viewpoint. A simple analogy would be using WARN_ON_ONCE() instead of BUG_ON() so that the mode of failure is softer. This change isn't likely to make bugs significantly more difficult to discover so why not? > But I agree either way. Rafael, I think 1-4 are fine, but I think > we need the simple 5/4, will send in a minute... Can you please wait a bit? The second one was broken (missing unlock) and I made some small adjustments. Will repost soon. Thanks.
On 09/03, Tejun Heo wrote: > > Hello, > > On Fri, Sep 02, 2011 at 06:58:39PM +0200, Oleg Nesterov wrote: > > > This still doesn't look quite right. If the cgroup is FREEZING it should > > > also call try_to_freeze_cgroup(). I think this is what's needed: > > > > > > if (freezer->state == CGROUP_THAWED) > > > atomic_inc(&system_freezing_cnt); > > > freezer->state = CGROUP_FREEZING; > > > retval = try_to_freeze_cgroup(cgroup, freezer); > > > > This is what I mentioned before, to me this looks like a win. > > > > Why do we need try_to_freeze_cgroup() in this case? "for safety" > > could actually mean "hide the bug" ;) > > I guess it depends on the viewpoint. A simple analogy would be using > WARN_ON_ONCE() instead of BUG_ON() so that the mode of failure is > softer. This change isn't likely to make bugs significantly more > difficult to discover so why not? I agree either way. Personally I prefer your current patch. Because it is not clear why do we call try_to_freeze_cgroup() if it was already called. And, the 2nd call can silently hide the problem if we have some bug. But of course, this is up to you and Matt. > > But I agree either way. Rafael, I think 1-4 are fine, but I think > > we need the simple 5/4, will send in a minute... > > Can you please wait a bit? The second one was broken (missing unlock) Yes, I just noticed the small problem too, hopefully we mean the same bug ;) Oleg.
Hello, On Fri, Sep 02, 2011 at 07:15:17PM +0200, Oleg Nesterov wrote: > > I guess it depends on the viewpoint. A simple analogy would be using > > WARN_ON_ONCE() instead of BUG_ON() so that the mode of failure is > > softer. This change isn't likely to make bugs significantly more > > difficult to discover so why not? > > I agree either way. > > Personally I prefer your current patch. Because it is not clear why > do we call try_to_freeze_cgroup() if it was already called. And, the > 2nd call can silently hide the problem if we have some bug. > > But of course, this is up to you and Matt. I'm okay either way too. It would make a bug in that area a bit less annoying and thus may decrease the chance of bug report, but it means that we ship with built-in easy work around (if it doesn't work at the first kick, kick again), which can be beneficial in practice. > > > But I agree either way. Rafael, I think 1-4 are fine, but I think > > > we need the simple 5/4, will send in a minute... > > > > Can you please wait a bit? The second one was broken (missing unlock) > > Yes, I just noticed the small problem too, hopefully we mean the same > bug ;) Yeap, the same one. I thought it was the second patch instead of the third. :) Thanks.
On Fri, Sep 02, 2011 at 06:58:39PM +0200, Oleg Nesterov wrote: > On 09/01, Matt Helsley wrote: > > > > On Wed, Aug 31, 2011 at 12:21:07PM +0200, Tejun Heo wrote: > > > case CGROUP_FROZEN: > > > - atomic_inc(&system_freezing_cnt); > > > - retval = try_to_freeze_cgroup(cgroup, freezer); > > > + if (freezer->state == CGROUP_THAWED) { > > > + freezer->state = CGROUP_FREEZING; > > > + atomic_inc(&system_freezing_cnt); > > > + retval = try_to_freeze_cgroup(cgroup, freezer); > > > > This still doesn't look quite right. If the cgroup is FREEZING it should > > also call try_to_freeze_cgroup(). I think this is what's needed: > > > > if (freezer->state == CGROUP_THAWED) > > atomic_inc(&system_freezing_cnt); > > freezer->state = CGROUP_FREEZING; > > retval = try_to_freeze_cgroup(cgroup, freezer); > > This is what I mentioned before, to me this looks like a win. > > Why do we need try_to_freeze_cgroup() in this case? "for safety" > could actually mean "hide the bug" ;) Well, I need to check Tejun's latest freezer bits to see if this is still the case but it was possible to get to the FREEZING state and not enter FROZEN before returning to userspace. So you could come back into the state change function in the FREEZING state with FROZEN as the goal state. Note that for the cgroup freezer the FREEZING state is optional -- so skipping it is fine so long was we guarantee that by the time we exit to userspace with a FROZEN state all the tasks in the cgroup are actually frozen (in the refrigerator loop) or frozen enough (about to enter the refrigerator loop without causing more IO -- e.g. stopped). Cheers, -Matt
Index: work/kernel/cgroup_freezer.c =================================================================== --- work.orig/kernel/cgroup_freezer.c +++ work/kernel/cgroup_freezer.c @@ -308,24 +308,26 @@ static int freezer_change_state(struct c spin_lock_irq(&freezer->lock); update_if_frozen(cgroup, freezer); - if (goal_state == freezer->state) - goto out; - - freezer->state = goal_state; switch (goal_state) { case CGROUP_THAWED: - atomic_dec(&system_freezing_cnt); - unfreeze_cgroup(cgroup, freezer); + if (freezer->state != CGROUP_THAWED) { + freezer->state = CGROUP_THAWED; + atomic_dec(&system_freezing_cnt); + unfreeze_cgroup(cgroup, freezer); + } break; case CGROUP_FROZEN: - atomic_inc(&system_freezing_cnt); - retval = try_to_freeze_cgroup(cgroup, freezer); + if (freezer->state == CGROUP_THAWED) { + freezer->state = CGROUP_FREEZING; + atomic_inc(&system_freezing_cnt); + retval = try_to_freeze_cgroup(cgroup, freezer); + } break; default: BUG(); } -out: + spin_unlock_irq(&freezer->lock); return retval;
d02f52811d0e "cgroup_freezer: prepare for removal of TIF_FREEZE" moved setting of freezer->state into freezer_change_state(); unfortunately, while doing so, when it's beginning to freeze tasks, it sets the state to CGROUP_FROZEN instead of CGROUP_FREEZING ending up skipping the whole freezing state. Fix it. -v2: Oleg pointed out that re-freezing FROZEN cgroup could increment system_freezing_cnt. Fixed. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Oleg Nesterov <oleg@redhat.com> Cc: Paul Menage <paul@paulmenage.org> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> --- I'm in the process of moving and can only use a quite old laptop. I tested compile but couldn't really do much else, so please proceed with caution. Oleg, can you please ack the patches if you agree with the updated versions? Thanks. kernel/cgroup_freezer.c | 20 +++++++++++--------- 1 file changed, 11 insertions(+), 9 deletions(-)