Message ID: alpine.LSU.2.11.2007302011450.2347@eggly.anvils (mailing list archive)
State:      New, archived
Series:     [mmotm] mm: memcontrol: decouple reference counting from page accounting fix
On Thu, Jul 30, 2020 at 08:17:50PM -0700, Hugh Dickins wrote:
> Moving tasks between mem cgroups with memory.move_charge_at_immigrate 3,
> while swapping, crashes soon on mmotm (and so presumably on linux-next):
> for example, spinlock found corrupted when lock_page_memcg() is called.
> It's as if the mem cgroup structures have been freed too early.
>
> Stab in the dark: what if all the accounting is right, except that the
> css_put_many() in __mem_cgroup_clear_mc() is now (worse than) redundant?
> Removing it fixes the crashes, but that's hardly surprising; and stats
> temporarily hacked into mem_cgroup_css_alloc() and mem_cgroup_css_free()
> showed that mem cgroups were not being leaked with this change.
>
> Note: this removes the last call to css_put_many() from the tree; and
> mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-accounted-allocations.patch
> removes the last call to css_get_many(): now that their last references
> have gone, I expect them soon to be freed from include/linux/cgroup.h.
>
> Signed-off-by: Hugh Dickins <hughd@google.com>

Thanks, Hugh. This fix looks correct to me. And I'd agree with the put
being worse than redundant. Its counterpart in try_charge() has been
removed, so this is a clear-cut ref imbalance.

When moving a task between cgroups, we scan the page tables for pages
and swap entries, and then pre-charge the target group while we're still
allowed to veto the task move (can_attach). In the actual attach step we
then reassign all the pages and swap entries and balance the books in
the cgroup the task emigrated from.

That precharging used to acquire css references for every page charge
and swap entry charge when calling try_charge(). That is gone. Now we
move css references along with the page (move_account), and swap entries
use the mem_cgroup_id references which pin the css indirectly. Leaving
that css_put_many() behind in the swap path was an oversight.
Acked-by: Johannes Weiner <hannes@cmpxchg.org>

> ---
> Fixes mm-memcontrol-decouple-reference-counting-from-page-accounting.patch
>
> mm/memcontrol.c | 2 --
> 1 file changed, 2 deletions(-)
>
> --- mmotm/mm/memcontrol.c	2020-07-27 18:55:00.700554752 -0700
> +++ linux/mm/memcontrol.c	2020-07-30 12:05:00.640091618 -0700
> @@ -5887,8 +5887,6 @@ static void __mem_cgroup_clear_mc(void)
>  		if (!mem_cgroup_is_root(mc.to))
>  			page_counter_uncharge(&mc.to->memory, mc.moved_swap);
>
> -		css_put_many(&mc.to->css, mc.moved_swap);
> -
>  		mc.moved_swap = 0;
>  	}
>  	memcg_oom_recover(from);
On Thu, Jul 30, 2020 at 08:17:50PM -0700, Hugh Dickins wrote:
> Moving tasks between mem cgroups with memory.move_charge_at_immigrate 3,
> while swapping, crashes soon on mmotm (and so presumably on linux-next):
> for example, spinlock found corrupted when lock_page_memcg() is called.
> It's as if the mem cgroup structures have been freed too early.
> [...]

Acked-by: Roman Gushchin <guro@fb.com>

Good catch! Thank you, Hugh!
--- mmotm/mm/memcontrol.c	2020-07-27 18:55:00.700554752 -0700
+++ linux/mm/memcontrol.c	2020-07-30 12:05:00.640091618 -0700
@@ -5887,8 +5887,6 @@ static void __mem_cgroup_clear_mc(void)
 		if (!mem_cgroup_is_root(mc.to))
 			page_counter_uncharge(&mc.to->memory, mc.moved_swap);
 
-		css_put_many(&mc.to->css, mc.moved_swap);
-
 		mc.moved_swap = 0;
 	}
 	memcg_oom_recover(from);
Moving tasks between mem cgroups with memory.move_charge_at_immigrate 3,
while swapping, crashes soon on mmotm (and so presumably on linux-next):
for example, spinlock found corrupted when lock_page_memcg() is called.
It's as if the mem cgroup structures have been freed too early.

Stab in the dark: what if all the accounting is right, except that the
css_put_many() in __mem_cgroup_clear_mc() is now (worse than) redundant?
Removing it fixes the crashes, but that's hardly surprising; and stats
temporarily hacked into mem_cgroup_css_alloc() and mem_cgroup_css_free()
showed that mem cgroups were not being leaked with this change.

Note: this removes the last call to css_put_many() from the tree; and
mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-accounted-allocations.patch
removes the last call to css_get_many(): now that their last references
have gone, I expect them soon to be freed from include/linux/cgroup.h.

Signed-off-by: Hugh Dickins <hughd@google.com>
---
Fixes mm-memcontrol-decouple-reference-counting-from-page-accounting.patch

 mm/memcontrol.c | 2 --
 1 file changed, 2 deletions(-)