Message ID | 20230814070923.35769-1-wuyun.abel@bytedance.com (mailing list archive) |
---|---|
State | Accepted |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net-next,v2] net-memcg: Fix scope of sockmem pressure indicators | expand |
On Mon, Aug 14, 2023 at 12:09 AM Abel Wu <wuyun.abel@bytedance.com> wrote: > > Now there are two indicators of socket memory pressure sit inside > struct mem_cgroup, socket_pressure and tcpmem_pressure, indicating > memory reclaim pressure in memcg->memory and ->tcpmem respectively. > > When in legacy mode (cgroupv1), the socket memory is charged into > ->tcpmem which is independent of ->memory, so socket_pressure has > nothing to do with socket's pressure at all. Things could be worse > by taking socket_pressure into consideration in legacy mode, as a > pressure in ->memory can lead to premature reclamation/throttling > in socket. > > While for the default mode (cgroupv2), the socket memory is charged > into ->memory, and ->tcpmem/->tcpmem_pressure are simply not used. > > So {socket,tcpmem}_pressure are only used in default/legacy mode > respectively for indicating socket memory pressure. This patch fixes > the pieces of code that make mixed use of both. > > Fixes: 8e8ae645249b ("mm: memcontrol: hook up vmpressure to socket pressure") > Signed-off-by: Abel Wu <wuyun.abel@bytedance.com> So, this is undoing the unintended exposure of v2 functionality for the v1. I wonder if someone might have started depending upon that behavior but I am more convinced that no one is using v1's tcpmem accounting due to performance impact. So, this looks good to me. Acked-by: Shakeel Butt <shakeelb@google.com> I do think we should start the deprecation process of v1's tcpmem accounting.
On 8/15/23 4:18 AM, Shakeel Butt wrote: > On Mon, Aug 14, 2023 at 12:09 AM Abel Wu <wuyun.abel@bytedance.com> wrote: >> >> Now there are two indicators of socket memory pressure sit inside >> struct mem_cgroup, socket_pressure and tcpmem_pressure, indicating >> memory reclaim pressure in memcg->memory and ->tcpmem respectively. >> >> When in legacy mode (cgroupv1), the socket memory is charged into >> ->tcpmem which is independent of ->memory, so socket_pressure has >> nothing to do with socket's pressure at all. Things could be worse >> by taking socket_pressure into consideration in legacy mode, as a >> pressure in ->memory can lead to premature reclamation/throttling >> in socket. >> >> While for the default mode (cgroupv2), the socket memory is charged >> into ->memory, and ->tcpmem/->tcpmem_pressure are simply not used. >> >> So {socket,tcpmem}_pressure are only used in default/legacy mode >> respectively for indicating socket memory pressure. This patch fixes >> the pieces of code that make mixed use of both. >> >> Fixes: 8e8ae645249b ("mm: memcontrol: hook up vmpressure to socket pressure") >> Signed-off-by: Abel Wu <wuyun.abel@bytedance.com> > > So, this is undoing the unintended exposure of v2 functionality for Exactly. > the v1. I wonder if someone might have started depending upon that > behavior but I am more convinced that no one is using v1's tcpmem > accounting due to performance impact. So, this looks good to me. Agreed. The performance impact is not negligible. While not accounting tcpmem is also undesired for Resource Manager to do provision properly. So we have to migrate to cgroupv2, and now we encountered a new issue. Some discussion with Roman can be found here: https://lore.kernel.org/netdev/29de901f-ae4c-a900-a553-17ec4f096f0e@bytedance.com/ It would be great if you can shed some light on this! > > Acked-by: Shakeel Butt <shakeelb@google.com> Thanks! Abel > > I do think we should start the deprecation process of v1's tcpmem accounting.
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index 5818af8eca5a..dbf26bc89dd4 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -284,6 +284,11 @@ struct mem_cgroup { atomic_long_t memory_events[MEMCG_NR_MEMORY_EVENTS]; atomic_long_t memory_events_local[MEMCG_NR_MEMORY_EVENTS]; + /* + * Hint of reclaim pressure for socket memroy management. Note + * that this indicator should NOT be used in legacy cgroup mode + * where socket memory is accounted/charged separately. + */ unsigned long socket_pressure; /* Legacy tcp memory accounting */ @@ -1727,8 +1732,8 @@ void mem_cgroup_sk_alloc(struct sock *sk); void mem_cgroup_sk_free(struct sock *sk); static inline bool mem_cgroup_under_socket_pressure(struct mem_cgroup *memcg) { - if (!cgroup_subsys_on_dfl(memory_cgrp_subsys) && memcg->tcpmem_pressure) - return true; + if (!cgroup_subsys_on_dfl(memory_cgrp_subsys)) + return !!memcg->tcpmem_pressure; do { if (time_before(jiffies, READ_ONCE(memcg->socket_pressure))) return true; diff --git a/mm/vmpressure.c b/mm/vmpressure.c index b52644771cc4..22c6689d9302 100644 --- a/mm/vmpressure.c +++ b/mm/vmpressure.c @@ -244,6 +244,14 @@ void vmpressure(gfp_t gfp, struct mem_cgroup *memcg, bool tree, if (mem_cgroup_disabled()) return; + /* + * The in-kernel users only care about the reclaim efficiency + * for this @memcg rather than the whole subtree, and there + * isn't and won't be any in-kernel user in a legacy cgroup. + */ + if (!cgroup_subsys_on_dfl(memory_cgrp_subsys) && !tree) + return; + vmpr = memcg_to_vmpressure(memcg); /*
Now there are two indicators of socket memory pressure sit inside struct mem_cgroup, socket_pressure and tcpmem_pressure, indicating memory reclaim pressure in memcg->memory and ->tcpmem respectively. When in legacy mode (cgroupv1), the socket memory is charged into ->tcpmem which is independent of ->memory, so socket_pressure has nothing to do with socket's pressure at all. Things could be worse by taking socket_pressure into consideration in legacy mode, as a pressure in ->memory can lead to premature reclamation/throttling in socket. While for the default mode (cgroupv2), the socket memory is charged into ->memory, and ->tcpmem/->tcpmem_pressure are simply not used. So {socket,tcpmem}_pressure are only used in default/legacy mode respectively for indicating socket memory pressure. This patch fixes the pieces of code that make mixed use of both. Fixes: 8e8ae645249b ("mm: memcontrol: hook up vmpressure to socket pressure") Signed-off-by: Abel Wu <wuyun.abel@bytedance.com> --- v2: - Add a comment to clarify the scope of indicators (Roman) - Drop code cleanup on tcpmem_pressure (Roman) - Fix the lack of a Fixes: tag (Eric) --- include/linux/memcontrol.h | 9 +++++++-- mm/vmpressure.c | 8 ++++++++ 2 files changed, 15 insertions(+), 2 deletions(-)