Message ID | 20190920122907.GG2507@uranus.lan (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | mm, memcg: assign shrinker_map before kvfree | expand |
On Fri, Sep 20, 2019 at 03:29:07PM +0300, Cyrill Gorcunov wrote: > Currently there is a small gap between fetching pointer, calling > kvfree and assign its value to nil. In current callgraph it is > not a problem (since memcg_free_shrinker_maps is running from > memcg_alloc_shrinker_maps and mem_cgroup_css_free only) still > this looks suspicious and we can easily eliminate the gap at all. With this logic it will still look suspicious since you don't wait a grace period before freeing the map.
On Fri, Sep 20, 2019 at 04:21:14PM +0300, Kirill A. Shutemov wrote: > On Fri, Sep 20, 2019 at 03:29:07PM +0300, Cyrill Gorcunov wrote: > > Currently there is a small gap between fetching pointer, calling > > kvfree and assign its value to nil. In current callgraph it is > > not a problem (since memcg_free_shrinker_maps is running from > > memcg_alloc_shrinker_maps and mem_cgroup_css_free only) still > > this looks suspicious and we can easily eliminate the gap at all. > > With this logic it will still look suspicious since you don't wait > a grace period before freeing the map. Probably, but as far as I see we're using mutex here to order requests. I'm not sure, maybe ktkhai@ made the code to use free before the assign intentionally? As I said there is no bug it the code right now just forced me to stop and reread it several times due to this gap. If you look into other code places where we use similar technique we always assign before free.
On 20.09.2019 15:29, Cyrill Gorcunov wrote: > Currently there is a small gap between fetching pointer, calling > kvfree and assign its value to nil. In current callgraph it is > not a problem (since memcg_free_shrinker_maps is running from > memcg_alloc_shrinker_maps and mem_cgroup_css_free only) still > this looks suspicious and we can easily eliminate the gap at all. > > Cc: Johannes Weiner <hannes@cmpxchg.org> > Cc: Michal Hocko <mhocko@kernel.org> > Cc: Vladimir Davydov <vdavydov.dev@gmail.com> > Cc: Kirill Tkhai <ktkhai@virtuozzo.com> > Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> > --- > mm/memcontrol.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > Index: linux-tip.git/mm/memcontrol.c > =================================================================== > --- linux-tip.git.orig/mm/memcontrol.c > +++ linux-tip.git/mm/memcontrol.c > @@ -364,9 +364,9 @@ static void memcg_free_shrinker_maps(str > for_each_node(nid) { > pn = mem_cgroup_nodeinfo(memcg, nid); > map = rcu_dereference_protected(pn->shrinker_map, true); > + rcu_assign_pointer(pn->shrinker_map, NULL); > if (map) > kvfree(map); > - rcu_assign_pointer(pn->shrinker_map, NULL); > } > } The current scheme is following. We allocate shrinker_map in css_online, while normal freeing happens in css_free. The NULLifying of pointer is needed in case of "abnormal freeing", when memcg_free_shrinker_maps() is called from memcg_alloc_shrinker_maps(). The NULLifying guarantees, we won't free pn->shrinker_map twice. There are no races or problems with kvfree() and rcu_assign_pointer() order, because of nobody can reference shrinker_map before memcg is online. In case of this rcu_assign_pointer() confuses, we may just remove is from the function, and call it only on css_free. Something like the below: diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 1c4c08b45e44..454303c3aa0f 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -372,7 +372,6 @@ static void memcg_free_shrinker_maps(struct mem_cgroup *memcg) map = rcu_dereference_protected(pn->shrinker_map, true); if (map) kvfree(map); - rcu_assign_pointer(pn->shrinker_map, NULL); } } @@ -389,7 +388,6 @@ static int memcg_alloc_shrinker_maps(struct mem_cgroup *memcg) for_each_node(nid) { map = kvzalloc(sizeof(*map) + size, GFP_KERNEL); if (!map) { - memcg_free_shrinker_maps(memcg); ret = -ENOMEM; break; }
On 20.09.2019 16:21, Kirill A. Shutemov wrote: > On Fri, Sep 20, 2019 at 03:29:07PM +0300, Cyrill Gorcunov wrote: >> Currently there is a small gap between fetching pointer, calling >> kvfree and assign its value to nil. In current callgraph it is >> not a problem (since memcg_free_shrinker_maps is running from >> memcg_alloc_shrinker_maps and mem_cgroup_css_free only) still >> this looks suspicious and we can easily eliminate the gap at all. > > With this logic it will still look suspicious since you don't wait a grace > period before freeing the map. This freeing occurs in the moment, when nobody can dereference shrinker_map in parallel: memcg is either not yet online or its css->refcnt is already dead. This NULLifying is needed just to prevent double freeing of shrinker_map. Please, see the explanation in my email to our namesake. Kirill
On Fri, Sep 20, 2019 at 05:11:00PM +0300, Kirill Tkhai wrote: > > The current scheme is following. We allocate shrinker_map in css_online, > while normal freeing happens in css_free. The NULLifying of pointer is needed > in case of "abnormal freeing", when memcg_free_shrinker_maps() is called > from memcg_alloc_shrinker_maps(). The NULLifying guarantees, we won't free > pn->shrinker_map twice. > > There are no races or problems with kvfree() and rcu_assign_pointer() order, > because of nobody can reference shrinker_map before memcg is online. > > In case of this rcu_assign_pointer() confuses, we may just remove is from > the function, and call it only on css_free. Something like the below: Kirill, I know that there is no problem now (as I pointed in changelog), simply a regular pattern of free after assign is being reversed, which made me nervious. Anyway dropping assigns doesn't help much from my pov so lets leave it as is. The good point is that we've this conversation and if someone get a bit confused in future the google will reveal this text. Which is enough I think.
Index: linux-tip.git/mm/memcontrol.c =================================================================== --- linux-tip.git.orig/mm/memcontrol.c +++ linux-tip.git/mm/memcontrol.c @@ -364,9 +364,9 @@ static void memcg_free_shrinker_maps(str for_each_node(nid) { pn = mem_cgroup_nodeinfo(memcg, nid); map = rcu_dereference_protected(pn->shrinker_map, true); + rcu_assign_pointer(pn->shrinker_map, NULL); if (map) kvfree(map); - rcu_assign_pointer(pn->shrinker_map, NULL); } }
Currently there is a small gap between fetching pointer, calling kvfree and assign its value to nil. In current callgraph it is not a problem (since memcg_free_shrinker_maps is running from memcg_alloc_shrinker_maps and mem_cgroup_css_free only) still this looks suspicious and we can easily eliminate the gap at all. Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> --- mm/memcontrol.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)