Message ID | 20220709154457.57379-2-laoar.shao@gmail.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | bpf: Minor fixes for non-preallocated memory | expand |
On Sat, Jul 9, 2022 at 8:45 AM Yafang Shao <laoar.shao@gmail.com> wrote: > > GFP_ATOMIC doesn't cooperate well with memcg pressure so far, especially > if we allocate too much GFP_ATOMIC memory. For example, when we set the > memcg limit to limit a non-preallocated bpf memory, the GFP_ATOMIC can > easily break the memcg limit by force charge. So it is very dangerous to > use GFP_ATOMIC in non-preallocated case. One way to make it safe is to > remove __GFP_HIGH from GFP_ATOMIC, IOW, use (__GFP_ATOMIC | > __GFP_KSWAPD_RECLAIM) instead, then it will be limited if we allocate > too much memory. There's a plan to completely remove __GFP_ATOMIC in the > mm side[1], so let's use GFP_NOWAIT instead. > > We introduced BPF_F_NO_PREALLOC is because full map pre-allocation is > too memory expensive for some cases. That means removing __GFP_HIGH > doesn't break the rule of BPF_F_NO_PREALLOC, but has the same goal with > it-avoiding issues caused by too much memory. So let's remove it. > > This fix can also apply to other run-time allocations, for example, the > allocation in lpm trie, local storage and devmap. So let fix it > consistently over the bpf code > > It also fixes a typo in the comment. > > [1]. https://lore.kernel.org/linux-mm/163712397076.13692.4727608274002939094@noble.neil.brown.name/ > > Cc: Roman Gushchin <roman.gushchin@linux.dev> > Cc: Shakeel Butt <shakeelb@google.com> > Cc: NeilBrown <neilb@suse.de> > Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Reviewed-by: Shakeel Butt <shakeelb@google.com>
On Mon, Jul 11, 2022 at 12:19 PM Shakeel Butt <shakeelb@google.com> wrote: > > On Sat, Jul 9, 2022 at 8:45 AM Yafang Shao <laoar.shao@gmail.com> wrote: > > > > GFP_ATOMIC doesn't cooperate well with memcg pressure so far, especially > > if we allocate too much GFP_ATOMIC memory. For example, when we set the > > memcg limit to limit a non-preallocated bpf memory, the GFP_ATOMIC can > > easily break the memcg limit by force charge. So it is very dangerous to > > use GFP_ATOMIC in non-preallocated case. One way to make it safe is to > > remove __GFP_HIGH from GFP_ATOMIC, IOW, use (__GFP_ATOMIC | > > __GFP_KSWAPD_RECLAIM) instead, then it will be limited if we allocate > > too much memory. There's a plan to completely remove __GFP_ATOMIC in the > > mm side[1], so let's use GFP_NOWAIT instead. > > > > We introduced BPF_F_NO_PREALLOC is because full map pre-allocation is > > too memory expensive for some cases. That means removing __GFP_HIGH > > doesn't break the rule of BPF_F_NO_PREALLOC, but has the same goal with > > it-avoiding issues caused by too much memory. So let's remove it. > > > > This fix can also apply to other run-time allocations, for example, the > > allocation in lpm trie, local storage and devmap. So let fix it > > consistently over the bpf code > > > > It also fixes a typo in the comment. > > > > [1]. https://lore.kernel.org/linux-mm/163712397076.13692.4727608274002939094@noble.neil.brown.name/ > > > > Cc: Roman Gushchin <roman.gushchin@linux.dev> > > Cc: Shakeel Butt <shakeelb@google.com> > > Cc: NeilBrown <neilb@suse.de> > > Signed-off-by: Yafang Shao <laoar.shao@gmail.com> > > Reviewed-by: Shakeel Butt <shakeelb@google.com> Applied to bpf-next.
On Tue, Jul 12, 2022 at 05:49:24PM -0700, Alexei Starovoitov wrote: > On Mon, Jul 11, 2022 at 12:19 PM Shakeel Butt <shakeelb@google.com> wrote: > > > > On Sat, Jul 9, 2022 at 8:45 AM Yafang Shao <laoar.shao@gmail.com> wrote: > > > > > > GFP_ATOMIC doesn't cooperate well with memcg pressure so far, especially > > > if we allocate too much GFP_ATOMIC memory. For example, when we set the > > > memcg limit to limit a non-preallocated bpf memory, the GFP_ATOMIC can > > > easily break the memcg limit by force charge. So it is very dangerous to > > > use GFP_ATOMIC in non-preallocated case. One way to make it safe is to > > > remove __GFP_HIGH from GFP_ATOMIC, IOW, use (__GFP_ATOMIC | > > > __GFP_KSWAPD_RECLAIM) instead, then it will be limited if we allocate > > > too much memory. There's a plan to completely remove __GFP_ATOMIC in the > > > mm side[1], so let's use GFP_NOWAIT instead. > > > > > > We introduced BPF_F_NO_PREALLOC is because full map pre-allocation is > > > too memory expensive for some cases. That means removing __GFP_HIGH > > > doesn't break the rule of BPF_F_NO_PREALLOC, but has the same goal with > > > it-avoiding issues caused by too much memory. So let's remove it. > > > > > > This fix can also apply to other run-time allocations, for example, the > > > allocation in lpm trie, local storage and devmap. So let fix it > > > consistently over the bpf code > > > > > > It also fixes a typo in the comment. > > > > > > [1]. https://lore.kernel.org/linux-mm/163712397076.13692.4727608274002939094@noble.neil.brown.name/ > > > > > > Cc: Roman Gushchin <roman.gushchin@linux.dev> > > > Cc: Shakeel Butt <shakeelb@google.com> > > > Cc: NeilBrown <neilb@suse.de> > > > Signed-off-by: Yafang Shao <laoar.shao@gmail.com> > > > > Reviewed-by: Shakeel Butt <shakeelb@google.com> > > Applied to bpf-next. Looks like I'm a bit late to the party, but my ack still applies. Thanks!
diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c index c2867068e5bd..1400561efb15 100644 --- a/kernel/bpf/devmap.c +++ b/kernel/bpf/devmap.c @@ -845,7 +845,7 @@ static struct bpf_dtab_netdev *__dev_map_alloc_node(struct net *net, struct bpf_dtab_netdev *dev; dev = bpf_map_kmalloc_node(&dtab->map, sizeof(*dev), - GFP_ATOMIC | __GFP_NOWARN, + GFP_NOWAIT | __GFP_NOWARN, dtab->map.numa_node); if (!dev) return ERR_PTR(-ENOMEM); diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c index 17fb69c0e0dc..da7578426a46 100644 --- a/kernel/bpf/hashtab.c +++ b/kernel/bpf/hashtab.c @@ -61,7 +61,7 @@ * * As regular device interrupt handlers and soft interrupts are forced into * thread context, the existing code which does - * spin_lock*(); alloc(GPF_ATOMIC); spin_unlock*(); + * spin_lock*(); alloc(GFP_ATOMIC); spin_unlock*(); * just works. * * In theory the BPF locks could be converted to regular spinlocks as well, @@ -978,7 +978,7 @@ static struct htab_elem *alloc_htab_elem(struct bpf_htab *htab, void *key, goto dec_count; } l_new = bpf_map_kmalloc_node(&htab->map, htab->elem_size, - GFP_ATOMIC | __GFP_NOWARN, + GFP_NOWAIT | __GFP_NOWARN, htab->map.numa_node); if (!l_new) { l_new = ERR_PTR(-ENOMEM); @@ -996,7 +996,7 @@ static struct htab_elem *alloc_htab_elem(struct bpf_htab *htab, void *key, } else { /* alloc_percpu zero-fills */ pptr = bpf_map_alloc_percpu(&htab->map, size, 8, - GFP_ATOMIC | __GFP_NOWARN); + GFP_NOWAIT | __GFP_NOWARN); if (!pptr) { kfree(l_new); l_new = ERR_PTR(-ENOMEM); diff --git a/kernel/bpf/local_storage.c b/kernel/bpf/local_storage.c index 8654fc97f5fe..49ef0ce040c7 100644 --- a/kernel/bpf/local_storage.c +++ b/kernel/bpf/local_storage.c @@ -165,7 +165,7 @@ static int cgroup_storage_update_elem(struct bpf_map *map, void *key, } new = bpf_map_kmalloc_node(map, struct_size(new, data, map->value_size), - __GFP_ZERO | GFP_ATOMIC | __GFP_NOWARN, + __GFP_ZERO | GFP_NOWAIT | __GFP_NOWARN, map->numa_node); if (!new) return -ENOMEM; diff --git a/kernel/bpf/lpm_trie.c b/kernel/bpf/lpm_trie.c index f0d05a3cc4b9..d789e3b831ad 100644 --- a/kernel/bpf/lpm_trie.c +++ b/kernel/bpf/lpm_trie.c @@ -285,7 +285,7 @@ static struct lpm_trie_node *lpm_trie_node_alloc(const struct lpm_trie *trie, if (value) size += trie->map.value_size; - node = bpf_map_kmalloc_node(&trie->map, size, GFP_ATOMIC | __GFP_NOWARN, + node = bpf_map_kmalloc_node(&trie->map, size, GFP_NOWAIT | __GFP_NOWARN, trie->map.numa_node); if (!node) return NULL;
GFP_ATOMIC doesn't cooperate well with memcg pressure so far, especially if we allocate too much GFP_ATOMIC memory. For example, when we set the memcg limit to limit a non-preallocated bpf memory, the GFP_ATOMIC can easily break the memcg limit by force charge. So it is very dangerous to use GFP_ATOMIC in non-preallocated case. One way to make it safe is to remove __GFP_HIGH from GFP_ATOMIC, IOW, use (__GFP_ATOMIC | __GFP_KSWAPD_RECLAIM) instead, then it will be limited if we allocate too much memory. There's a plan to completely remove __GFP_ATOMIC in the mm side[1], so let's use GFP_NOWAIT instead. We introduced BPF_F_NO_PREALLOC is because full map pre-allocation is too memory expensive for some cases. That means removing __GFP_HIGH doesn't break the rule of BPF_F_NO_PREALLOC, but has the same goal with it-avoiding issues caused by too much memory. So let's remove it. This fix can also apply to other run-time allocations, for example, the allocation in lpm trie, local storage and devmap. So let fix it consistently over the bpf code It also fixes a typo in the comment. [1]. https://lore.kernel.org/linux-mm/163712397076.13692.4727608274002939094@noble.neil.brown.name/ Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Shakeel Butt <shakeelb@google.com> Cc: NeilBrown <neilb@suse.de> Signed-off-by: Yafang Shao <laoar.shao@gmail.com> --- kernel/bpf/devmap.c | 2 +- kernel/bpf/hashtab.c | 6 +++--- kernel/bpf/local_storage.c | 2 +- kernel/bpf/lpm_trie.c | 2 +- 4 files changed, 6 insertions(+), 6 deletions(-)