Message ID | 20230205065805.19598-1-laoar.shao@gmail.com (mailing list archive) |
---|---|
Headers | show |
Series | bpf, mm: introduce cgroup.memory=nobpf | expand |
On Sun, Feb 05, 2023 at 06:58:00AM +0000, Yafang Shao wrote: > The bpf memory accouting has some known problems in contianer > environment, > > - The container memory usage is not consistent if there's pinned bpf > program > After the container restart, the leftover bpf programs won't account > to the new generation, so the memory usage of the container is not > consistent. This issue can be resolved by introducing selectable > memcg, but we don't have an agreement on the solution yet. See also > the discussions at https://lwn.net/Articles/905150/ . > > - The leftover non-preallocated bpf map can't be limited > The leftover bpf map will be reparented, and thus it will be limited by > the parent, rather than the container itself. Furthermore, if the > parent is destroyed, it be will limited by its parent's parent, and so > on. It can also be resolved by introducing selectable memcg. > > - The memory dynamically allocated in bpf prog is charged into root memcg > only > Nowdays the bpf prog can dynamically allocate memory, for example via > bpf_obj_new(), but it only allocate from the global bpf_mem_alloc > pool, so it will charge into root memcg only. That needs to be > addressed by a new proposal. > > So let's give the user an option to disable bpf memory accouting. > > The idea of "cgroup.memory=nobpf" is originally by Tejun[1]. I'm not the most familiar with bpf internals, but the memcg bits and adding the boot time flag look good to me: Acked-by: Johannes Weiner <hannes@cmpxchg.org>
On Sun, Feb 05, 2023 at 06:58:00AM +0000, Yafang Shao wrote: > The bpf memory accouting has some known problems in contianer > environment, > > - The container memory usage is not consistent if there's pinned bpf > program > After the container restart, the leftover bpf programs won't account > to the new generation, so the memory usage of the container is not > consistent. This issue can be resolved by introducing selectable > memcg, but we don't have an agreement on the solution yet. See also > the discussions at https://lwn.net/Articles/905150/ . > > - The leftover non-preallocated bpf map can't be limited > The leftover bpf map will be reparented, and thus it will be limited by > the parent, rather than the container itself. Furthermore, if the > parent is destroyed, it be will limited by its parent's parent, and so > on. It can also be resolved by introducing selectable memcg. > > - The memory dynamically allocated in bpf prog is charged into root memcg > only > Nowdays the bpf prog can dynamically allocate memory, for example via > bpf_obj_new(), but it only allocate from the global bpf_mem_alloc > pool, so it will charge into root memcg only. That needs to be > addressed by a new proposal. > > So let's give the user an option to disable bpf memory accouting. > > The idea of "cgroup.memory=nobpf" is originally by Tejun[1]. > > [1]. https://lwn.net/ml/linux-mm/YxjOawzlgE458ezL@slm.duckdns.org/ > > Yafang Shao (5): > mm: memcontrol: add new kernel parameter cgroup.memory=nobpf > bpf: use bpf_map_kvcalloc in bpf_local_storage > bpf: introduce bpf_memcg_flags() > bpf: allow to disable bpf map memory accounting > bpf: allow to disable bpf prog memory accounting Hello Yafang! Overall the patch looks good to me, please, feel free to add Acked-by: Roman Gushchin <roman.gushchin@linux.dev> I'd squash patch (3) into (4), but up to you. Thanks!
On Thu, Feb 9, 2023 at 4:54 AM Roman Gushchin <roman.gushchin@linux.dev> wrote: > > On Sun, Feb 05, 2023 at 06:58:00AM +0000, Yafang Shao wrote: > > The bpf memory accouting has some known problems in contianer > > environment, > > > > - The container memory usage is not consistent if there's pinned bpf > > program > > After the container restart, the leftover bpf programs won't account > > to the new generation, so the memory usage of the container is not > > consistent. This issue can be resolved by introducing selectable > > memcg, but we don't have an agreement on the solution yet. See also > > the discussions at https://lwn.net/Articles/905150/ . > > > > - The leftover non-preallocated bpf map can't be limited > > The leftover bpf map will be reparented, and thus it will be limited by > > the parent, rather than the container itself. Furthermore, if the > > parent is destroyed, it be will limited by its parent's parent, and so > > on. It can also be resolved by introducing selectable memcg. > > > > - The memory dynamically allocated in bpf prog is charged into root memcg > > only > > Nowdays the bpf prog can dynamically allocate memory, for example via > > bpf_obj_new(), but it only allocate from the global bpf_mem_alloc > > pool, so it will charge into root memcg only. That needs to be > > addressed by a new proposal. > > > > So let's give the user an option to disable bpf memory accouting. > > > > The idea of "cgroup.memory=nobpf" is originally by Tejun[1]. > > > > [1]. https://lwn.net/ml/linux-mm/YxjOawzlgE458ezL@slm.duckdns.org/ > > > > Yafang Shao (5): > > mm: memcontrol: add new kernel parameter cgroup.memory=nobpf > > bpf: use bpf_map_kvcalloc in bpf_local_storage > > bpf: introduce bpf_memcg_flags() > > bpf: allow to disable bpf map memory accounting > > bpf: allow to disable bpf prog memory accounting > > Hello Yafang! > > Overall the patch looks good to me, please, feel free to add > Acked-by: Roman Gushchin <roman.gushchin@linux.dev> > > I'd squash patch (3) into (4), but up to you. > Sure. Will do it.