Message ID | 20220201183514.3235495-1-zenczykowski@gmail.com (mailing list archive) |
---|---|
State | RFC |
Delegated to: | BPF |
Headers | show |
Series | [bpf-next] RFC: bpf hashmap - improve iteration in the presence of concurrent delete operations | expand |
On Tue, Feb 01, 2022 at 10:35:14AM -0800, Maciej Żenczykowski wrote: > From: Maciej Żenczykowski <maze@google.com> > > by resuming from the bucket the key would be in if it still existed > > Note: AFAICT this an API change that would require some sort of guard... > > bpf map iteration was added all the way back in v3.18-rc4-939-g0f8e4bd8a1fc > but behaviour of find first key with NULL was only done in v4.11-rc7-2042-g8fe45924387b > (before that you'd get EFAULT) > > this means previously find first key was done by calling with a non existing key > (AFAICT this was hoping to get lucky, since if your non existing key actually existed, > it wouldn't actually iterate right, since you couldn't guarantee your magic value was > the first one) > > ie. do we need a BPF_MAP_GET_NEXT_KEY2 or a sysctl? some other approach? BPF_MAP_LOOKUP_BATCH has already been added to lookup in batches of buckets, so the next bpf_map_lookup_batch will start from the beginning of the next bucket. Here is the details: https://lore.kernel.org/bpf/20200115184308.162644-6-brianvv@google.com/#t
diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c index d29af9988f37..0d61a9a54279 100644 --- a/kernel/bpf/hashtab.c +++ b/kernel/bpf/hashtab.c @@ -780,7 +780,7 @@ static int htab_map_get_next_key(struct bpf_map *map, void *key, void *next_key) key_size = map->key_size; if (!key) - goto find_first_elem; + goto find_first_elem_this_bucket; hash = htab_map_hash(key, key_size, htab->hashrnd); @@ -790,7 +790,7 @@ static int htab_map_get_next_key(struct bpf_map *map, void *key, void *next_key) l = lookup_nulls_elem_raw(head, hash, key, key_size, htab->n_buckets); if (!l) - goto find_first_elem; + goto find_first_elem_next_bucket; /* key was found, get next key in the same bucket */ next_l = hlist_nulls_entry_safe(rcu_dereference_raw(hlist_nulls_next_rcu(&l->hash_node)), @@ -802,11 +802,12 @@ static int htab_map_get_next_key(struct bpf_map *map, void *key, void *next_key) return 0; } +find_first_elem_next_bucket: /* no more elements in this hash list, go to the next bucket */ i = hash & (htab->n_buckets - 1); i++; -find_first_elem: +find_first_elem_this_bucket: /* iterate over buckets */ for (; i < htab->n_buckets; i++) { head = select_bucket(htab, i);
From: Maciej Żenczykowski <maze@google.com> by resuming from the bucket the key would be in if it still existed Note: AFAICT this an API change that would require some sort of guard... bpf map iteration was added all the way back in v3.18-rc4-939-g0f8e4bd8a1fc but behaviour of find first key with NULL was only done in v4.11-rc7-2042-g8fe45924387b (before that you'd get EFAULT) this means previously find first key was done by calling with a non existing key (AFAICT this was hoping to get lucky, since if your non existing key actually existed, it wouldn't actually iterate right, since you couldn't guarantee your magic value was the first one) ie. do we need a BPF_MAP_GET_NEXT_KEY2 or a sysctl? some other approach? Not-Signed-off-by-Yet: Maciej Żenczykowski <maze@google.com> --- kernel/bpf/hashtab.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)