Message ID | 20240227152740.35120-1-toke@redhat.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | BPF |
Headers | show |
Series | [bpf] bpf: Fix DEVMAP_HASH overflow check on 32-bit arches | expand |
Toke Høiland-Jørgensen wrote: > The devmap code allocates a number hash buckets equal to the next power of two > of the max_entries value provided when creating the map. When rounding up to the > next power of two, the 32-bit variable storing the number of buckets can > overflow, and the code checks for overflow by checking if the truncated 32-bit value > is equal to 0. However, on 32-bit arches the rounding up itself can overflow > mid-way through, because it ends up doing a left-shift of 32 bits on an unsigned > long value. If the size of an unsigned long is four bytes, this is undefined > behaviour, so there is no guarantee that we'll end up with a nice and tidy > 0-value at the end. > > Syzbot managed to turn this into a crash on arm32 by creating a DEVMAP_HASH with > max_entries > 0x80000000 and then trying to update it. Fix this by moving the > overflow check to before the rounding up operation. > > Fixes: 6f9d451ab1a3 ("xdp: Add devmap_hash map type for looking up devices by hashed index") > Link: https://lore.kernel.org/r/000000000000ed666a0611af6818@google.com > Reported-and-tested-by: syzbot+8cd36f6b65f3cafd400a@syzkaller.appspotmail.com > Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> > --- > kernel/bpf/devmap.c | 8 +++----- > 1 file changed, 3 insertions(+), 5 deletions(-) > > diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c > index a936c704d4e7..9b2286f9c6da 100644 > --- a/kernel/bpf/devmap.c > +++ b/kernel/bpf/devmap.c > @@ -130,13 +130,11 @@ static int dev_map_init_map(struct bpf_dtab *dtab, union bpf_attr *attr) > bpf_map_init_from_attr(&dtab->map, attr); > > if (attr->map_type == BPF_MAP_TYPE_DEVMAP_HASH) { > - dtab->n_buckets = roundup_pow_of_two(dtab->map.max_entries); > - > - if (!dtab->n_buckets) /* Overflow check */ > + if (dtab->map.max_entries > U32_MAX / 2) > return -EINVAL; > - } > > - if (attr->map_type == BPF_MAP_TYPE_DEVMAP_HASH) { > + dtab->n_buckets = roundup_pow_of_two(dtab->map.max_entries); > + > dtab->dev_index_head = dev_map_create_hash(dtab->n_buckets, > dtab->map.numa_node); > if (!dtab->dev_index_head) > -- > 2.43.2 > I'm fairly sure this code was just taken from the hashtab implementation. Do we also need a fix there? /* hash table size must be power of 2 */ htab->n_buckets = roundup_pow_of_two(htab->map.max_entries); The u32 check in hashtab is, /* prevent zero size kmalloc and check for u32 overflow */ if (htab->n_buckets == 0 || htab->n_buckets > U32_MAX / sizeof(struct bucket)) goto free_htab; Thanks, John
John Fastabend <john.fastabend@gmail.com> writes: > Toke Høiland-Jørgensen wrote: >> The devmap code allocates a number hash buckets equal to the next power of two >> of the max_entries value provided when creating the map. When rounding up to the >> next power of two, the 32-bit variable storing the number of buckets can >> overflow, and the code checks for overflow by checking if the truncated 32-bit value >> is equal to 0. However, on 32-bit arches the rounding up itself can overflow >> mid-way through, because it ends up doing a left-shift of 32 bits on an unsigned >> long value. If the size of an unsigned long is four bytes, this is undefined >> behaviour, so there is no guarantee that we'll end up with a nice and tidy >> 0-value at the end. >> >> Syzbot managed to turn this into a crash on arm32 by creating a DEVMAP_HASH with >> max_entries > 0x80000000 and then trying to update it. Fix this by moving the >> overflow check to before the rounding up operation. >> >> Fixes: 6f9d451ab1a3 ("xdp: Add devmap_hash map type for looking up devices by hashed index") >> Link: https://lore.kernel.org/r/000000000000ed666a0611af6818@google.com >> Reported-and-tested-by: syzbot+8cd36f6b65f3cafd400a@syzkaller.appspotmail.com >> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> >> --- >> kernel/bpf/devmap.c | 8 +++----- >> 1 file changed, 3 insertions(+), 5 deletions(-) >> >> diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c >> index a936c704d4e7..9b2286f9c6da 100644 >> --- a/kernel/bpf/devmap.c >> +++ b/kernel/bpf/devmap.c >> @@ -130,13 +130,11 @@ static int dev_map_init_map(struct bpf_dtab *dtab, union bpf_attr *attr) >> bpf_map_init_from_attr(&dtab->map, attr); >> >> if (attr->map_type == BPF_MAP_TYPE_DEVMAP_HASH) { >> - dtab->n_buckets = roundup_pow_of_two(dtab->map.max_entries); >> - >> - if (!dtab->n_buckets) /* Overflow check */ >> + if (dtab->map.max_entries > U32_MAX / 2) >> return -EINVAL; >> - } >> >> - if (attr->map_type == BPF_MAP_TYPE_DEVMAP_HASH) { >> + dtab->n_buckets = roundup_pow_of_two(dtab->map.max_entries); >> + >> dtab->dev_index_head = dev_map_create_hash(dtab->n_buckets, >> dtab->map.numa_node); >> if (!dtab->dev_index_head) >> -- >> 2.43.2 >> > > I'm fairly sure this code was just taken from the hashtab implementation. Yup, it was :) > Do we also need a fix there? > > /* hash table size must be power of 2 */ > htab->n_buckets = roundup_pow_of_two(htab->map.max_entries); > > The u32 check in hashtab is, > > /* prevent zero size kmalloc and check for u32 overflow */ > if (htab->n_buckets == 0 || > htab->n_buckets > U32_MAX / sizeof(struct bucket)) > goto free_htab; Yeah, I don't see any reason why this wouldn't be susceptible to the same issue. I'll send a follow-up to apply the same fix there. -Toke
Toke Høiland-Jørgensen wrote: > John Fastabend <john.fastabend@gmail.com> writes: > > > Toke Høiland-Jørgensen wrote: > >> The devmap code allocates a number hash buckets equal to the next power of two > >> of the max_entries value provided when creating the map. When rounding up to the > >> next power of two, the 32-bit variable storing the number of buckets can > >> overflow, and the code checks for overflow by checking if the truncated 32-bit value > >> is equal to 0. However, on 32-bit arches the rounding up itself can overflow > >> mid-way through, because it ends up doing a left-shift of 32 bits on an unsigned > >> long value. If the size of an unsigned long is four bytes, this is undefined > >> behaviour, so there is no guarantee that we'll end up with a nice and tidy > >> 0-value at the end. Hi Toke, dumb question where is this left-shift noted above? It looks like fls_long tries to account by having a check for sizeof(l) == 4. I'm asking mostly because I've found a few more spots without this check. > >> > >> Syzbot managed to turn this into a crash on arm32 by creating a DEVMAP_HASH with > >> max_entries > 0x80000000 and then trying to update it. Fix this by moving the > >> overflow check to before the rounding up operation. > >> > >> Fixes: 6f9d451ab1a3 ("xdp: Add devmap_hash map type for looking up devices by hashed index") > >> Link: https://lore.kernel.org/r/000000000000ed666a0611af6818@google.com > >> Reported-and-tested-by: syzbot+8cd36f6b65f3cafd400a@syzkaller.appspotmail.com > >> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> > >> --- > >> kernel/bpf/devmap.c | 8 +++----- > >> 1 file changed, 3 insertions(+), 5 deletions(-) > >> > >> diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c > >> index a936c704d4e7..9b2286f9c6da 100644 > >> --- a/kernel/bpf/devmap.c > >> +++ b/kernel/bpf/devmap.c > >> @@ -130,13 +130,11 @@ static int dev_map_init_map(struct bpf_dtab *dtab, union bpf_attr *attr) > >> bpf_map_init_from_attr(&dtab->map, attr); > >> > >> if (attr->map_type == BPF_MAP_TYPE_DEVMAP_HASH) { > >> - dtab->n_buckets = roundup_pow_of_two(dtab->map.max_entries); > >> - > >> - if (!dtab->n_buckets) /* Overflow check */ > >> + if (dtab->map.max_entries > U32_MAX / 2) > >> return -EINVAL; > >> - } > >> > >> - if (attr->map_type == BPF_MAP_TYPE_DEVMAP_HASH) { > >> + dtab->n_buckets = roundup_pow_of_two(dtab->map.max_entries); > >> + > >> dtab->dev_index_head = dev_map_create_hash(dtab->n_buckets, > >> dtab->map.numa_node); > >> if (!dtab->dev_index_head) > >> -- > >> 2.43.2 > >> > > > > I'm fairly sure this code was just taken from the hashtab implementation. > > Yup, it was :) > > > Do we also need a fix there? > > > > /* hash table size must be power of 2 */ > > htab->n_buckets = roundup_pow_of_two(htab->map.max_entries); > > > > The u32 check in hashtab is, > > > > /* prevent zero size kmalloc and check for u32 overflow */ > > if (htab->n_buckets == 0 || > > htab->n_buckets > U32_MAX / sizeof(struct bucket)) > > goto free_htab; > > Yeah, I don't see any reason why this wouldn't be susceptible to the > same issue. I'll send a follow-up to apply the same fix there. Cool thanks one more question above. > > -Toke >
John Fastabend <john.fastabend@gmail.com> writes: > Toke Høiland-Jørgensen wrote: >> John Fastabend <john.fastabend@gmail.com> writes: >> >> > Toke Høiland-Jørgensen wrote: >> >> The devmap code allocates a number hash buckets equal to the next power of two >> >> of the max_entries value provided when creating the map. When rounding up to the >> >> next power of two, the 32-bit variable storing the number of buckets can >> >> overflow, and the code checks for overflow by checking if the truncated 32-bit value >> >> is equal to 0. However, on 32-bit arches the rounding up itself can overflow >> >> mid-way through, because it ends up doing a left-shift of 32 bits on an unsigned >> >> long value. If the size of an unsigned long is four bytes, this is undefined >> >> behaviour, so there is no guarantee that we'll end up with a nice and tidy >> >> 0-value at the end. > > Hi Toke, dumb question where is this left-shift noted above? It looks > like fls_long tries to account by having a check for sizeof(l) == 4. > I'm asking mostly because I've found a few more spots without this > check. That check in fls_long only switches between too different implementations of the fls op itself (fls() vs fls64()). AFAICT this is mostly meaningful for the generic (non-ASM) version that iterates over the bits instead of just emitting a single instruction. The shift is in the caller: static inline __attribute__((const)) unsigned long __roundup_pow_of_two(unsigned long n) { return 1UL << fls_long(n - 1); } If this is called with a value > 0x80000000, fls_long() will (correctly) return 32, leading to the ub[0] shift when sizeof(unsigned long) == 4. -Toke [0] https://wiki.sei.cmu.edu/confluence/display/c/int34-c.+do+not+shift+an+expression+by+a+negative+number+of+bits+or+by+greater+than+or+equal+to+the+number+of+bits+that+exist+in+the+operand
Toke Høiland-Jørgensen wrote: > John Fastabend <john.fastabend@gmail.com> writes: > > > Toke Høiland-Jørgensen wrote: > >> John Fastabend <john.fastabend@gmail.com> writes: > >> > >> > Toke Høiland-Jørgensen wrote: > >> >> The devmap code allocates a number hash buckets equal to the next power of two > >> >> of the max_entries value provided when creating the map. When rounding up to the > >> >> next power of two, the 32-bit variable storing the number of buckets can > >> >> overflow, and the code checks for overflow by checking if the truncated 32-bit value > >> >> is equal to 0. However, on 32-bit arches the rounding up itself can overflow > >> >> mid-way through, because it ends up doing a left-shift of 32 bits on an unsigned > >> >> long value. If the size of an unsigned long is four bytes, this is undefined > >> >> behaviour, so there is no guarantee that we'll end up with a nice and tidy > >> >> 0-value at the end. > > > > Hi Toke, dumb question where is this left-shift noted above? It looks > > like fls_long tries to account by having a check for sizeof(l) == 4. > > I'm asking mostly because I've found a few more spots without this > > check. > > That check in fls_long only switches between too different > implementations of the fls op itself (fls() vs fls64()). AFAICT this is > mostly meaningful for the generic (non-ASM) version that iterates over > the bits instead of just emitting a single instruction. > > The shift is in the caller: > > static inline __attribute__((const)) > unsigned long __roundup_pow_of_two(unsigned long n) > { > return 1UL << fls_long(n - 1); > } > > If this is called with a value > 0x80000000, fls_long() will (correctly) > return 32, leading to the ub[0] shift when sizeof(unsigned long) == 4. Yep thanks I was looking in fls_long there walked past the pow-of_two() bits. Thanks.
diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c index a936c704d4e7..9b2286f9c6da 100644 --- a/kernel/bpf/devmap.c +++ b/kernel/bpf/devmap.c @@ -130,13 +130,11 @@ static int dev_map_init_map(struct bpf_dtab *dtab, union bpf_attr *attr) bpf_map_init_from_attr(&dtab->map, attr); if (attr->map_type == BPF_MAP_TYPE_DEVMAP_HASH) { - dtab->n_buckets = roundup_pow_of_two(dtab->map.max_entries); - - if (!dtab->n_buckets) /* Overflow check */ + if (dtab->map.max_entries > U32_MAX / 2) return -EINVAL; - } - if (attr->map_type == BPF_MAP_TYPE_DEVMAP_HASH) { + dtab->n_buckets = roundup_pow_of_two(dtab->map.max_entries); + dtab->dev_index_head = dev_map_create_hash(dtab->n_buckets, dtab->map.numa_node); if (!dtab->dev_index_head)
The devmap code allocates a number hash buckets equal to the next power of two of the max_entries value provided when creating the map. When rounding up to the next power of two, the 32-bit variable storing the number of buckets can overflow, and the code checks for overflow by checking if the truncated 32-bit value is equal to 0. However, on 32-bit arches the rounding up itself can overflow mid-way through, because it ends up doing a left-shift of 32 bits on an unsigned long value. If the size of an unsigned long is four bytes, this is undefined behaviour, so there is no guarantee that we'll end up with a nice and tidy 0-value at the end. Syzbot managed to turn this into a crash on arm32 by creating a DEVMAP_HASH with max_entries > 0x80000000 and then trying to update it. Fix this by moving the overflow check to before the rounding up operation. Fixes: 6f9d451ab1a3 ("xdp: Add devmap_hash map type for looking up devices by hashed index") Link: https://lore.kernel.org/r/000000000000ed666a0611af6818@google.com Reported-and-tested-by: syzbot+8cd36f6b65f3cafd400a@syzkaller.appspotmail.com Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> --- kernel/bpf/devmap.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-)