[net] ipv6/gro: fix an out of bounds memory bug in ipv6_gro_receive()

Message ID 20221027102449.926410-1-william.xuanziyang@huawei.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Series [net] ipv6/gro: fix an out of bounds memory bug in ipv6_gro_receive()

Checks

Context                         Check    Description
netdev/tree_selection           success  Clearly marked for net
netdev/fixes_present            success  Fixes tag present in non-next series
netdev/subject_prefix           success  Link
netdev/cover_letter             success  Single patches do not need cover letters
netdev/patch_count              success  Link
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 2 this patch: 2
netdev/cc_maintainers           success  CCed 8 of 8 maintainers
netdev/build_clang              success  Errors and warnings before: 5 this patch: 5
netdev/module_param             success  Was 0 now: 0
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/check_selftest           success  No net selftest shell script
netdev/verify_fixes             success  Fixes tag looks correct
netdev/build_allmodconfig_warn  success  Errors and warnings before: 2 this patch: 2
netdev/checkpatch               success  total: 0 errors, 0 warnings, 0 checks, 10 lines checked
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/source_inline            success  Was 0 now: 0

Commit Message

Ziyang Xuan (William) Oct. 27, 2022, 10:24 a.m. UTC
IPv6 packets without a NEXTHDR_NONE extension header can cause repeated
__skb_pull() calls in ipv6_gso_pull_exthdrs() until pskb_may_pull() fails.
That results in a large skb_gro_offset() value, and after the __skb_push()
in ipv6_gro_receive(), skb->data will be less than skb->head, so an
out-of-bounds memory access occurs. That triggers the following report:

==================================================================
BUG: KASAN: use-after-free in eth_type_trans+0x100/0x260
...
Call trace:
 dump_backtrace+0xd8/0x130
 show_stack+0x1c/0x50
 dump_stack_lvl+0x64/0x7c
 print_address_description.constprop.0+0xbc/0x2e8
 print_report+0x100/0x1e4
 kasan_report+0x80/0x120
 __asan_load8+0x78/0xa0
 eth_type_trans+0x100/0x260
 napi_gro_frags+0x164/0x550
 tun_get_user+0xda4/0x1270
 tun_chr_write_iter+0x74/0x130
 do_iter_readv_writev+0x130/0x1ec
 do_iter_write+0xbc/0x1e0
 vfs_writev+0x13c/0x26c

Fix the bug by comparing skb->data - skb_gro_offset() against skb->head
before __skb_push(), and by adding an error path for the failing case.

Fixes: 86911732d399 ("gro: Avoid copying headers of unmerged packets")
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
---
 net/ipv6/ip6_offload.c | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Eric Dumazet Oct. 27, 2022, 11:39 a.m. UTC | #1
On Thu, Oct 27, 2022 at 3:25 AM Ziyang Xuan
<william.xuanziyang@huawei.com> wrote:
>
> IPv6 packets without NEXTHDR_NONE extension header can make continuous
> __skb_pull() until pskb_may_pull() failed in ipv6_gso_pull_exthdrs().
> That results in a big value of skb_gro_offset(), and after __skb_push()
> in ipv6_gro_receive(), skb->data will less than skb->head, an out of
> bounds memory bug occurs. That will trigger the problem as following:
>
> ==================================================================
> BUG: KASAN: use-after-free in eth_type_trans+0x100/0x260
> ...
> Call trace:
>  dump_backtrace+0xd8/0x130
>  show_stack+0x1c/0x50
>  dump_stack_lvl+0x64/0x7c
>  print_address_description.constprop.0+0xbc/0x2e8
>  print_report+0x100/0x1e4
>  kasan_report+0x80/0x120
>  __asan_load8+0x78/0xa0
>  eth_type_trans+0x100/0x260

The crash happens in eth_type_trans(); shouldn't that run before
ipv6_gro_receive()?

It seems your patch is unrelated.

Please provide a repro.


>  napi_gro_frags+0x164/0x550
>  tun_get_user+0xda4/0x1270
>  tun_chr_write_iter+0x74/0x130
>  do_iter_readv_writev+0x130/0x1ec
>  do_iter_write+0xbc/0x1e0
>  vfs_writev+0x13c/0x26c
>
> Add comparison between skb->data - skb_gro_offset() and skb->head
> and exception handler before __skb_push() to fix the bug.
>
> Fixes: 86911732d399 ("gro: Avoid copying headers of unmerged packets")
> Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
> ---
>  net/ipv6/ip6_offload.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/net/ipv6/ip6_offload.c b/net/ipv6/ip6_offload.c
> index 3ee345672849..6659ccf25387 100644
> --- a/net/ipv6/ip6_offload.c
> +++ b/net/ipv6/ip6_offload.c
> @@ -237,6 +237,10 @@ INDIRECT_CALLABLE_SCOPE struct sk_buff *ipv6_gro_receive(struct list_head *head,
>                 proto = ipv6_gso_pull_exthdrs(skb, proto);
>                 skb_gro_pull(skb, -skb_transport_offset(skb));
>                 skb_reset_transport_header(skb);
> +               if (unlikely(skb_headroom(skb) < skb_gro_offset(skb))) {

This makes no sense to me.

If there is a bug, it should be fixed earlier.

> +                       kfree_skb(skb);
> +                       return ERR_PTR(-EINPROGRESS);
> +               }
>                 __skb_push(skb, skb_gro_offset(skb));
>
>                 ops = rcu_dereference(inet6_offloads[proto]);
> --
> 2.25.1
>
Ziyang Xuan (William) Oct. 27, 2022, 1 p.m. UTC | #2
> On Thu, Oct 27, 2022 at 3:25 AM Ziyang Xuan
> <william.xuanziyang@huawei.com> wrote:
>>
>> IPv6 packets without NEXTHDR_NONE extension header can make continuous
>> __skb_pull() until pskb_may_pull() failed in ipv6_gso_pull_exthdrs().
>> That results in a big value of skb_gro_offset(), and after __skb_push()
>> in ipv6_gro_receive(), skb->data will less than skb->head, an out of
>> bounds memory bug occurs. That will trigger the problem as following:
>>
>> ==================================================================
>> BUG: KASAN: use-after-free in eth_type_trans+0x100/0x260
>> ...
>> Call trace:
>>  dump_backtrace+0xd8/0x130
>>  show_stack+0x1c/0x50
>>  dump_stack_lvl+0x64/0x7c
>>  print_address_description.constprop.0+0xbc/0x2e8
>>  print_report+0x100/0x1e4
>>  kasan_report+0x80/0x120
>>  __asan_load8+0x78/0xa0
>>  eth_type_trans+0x100/0x260
> 
> Crash happens from eth_type_trans() , this should happen before
> ipv6_gro_receive() ?
> 
> It seems your patch is unrelated.
> 
> Please provide a repro.

The C repro is attached.

> 
> 
>>  napi_gro_frags+0x164/0x550
>>  tun_get_user+0xda4/0x1270
>>  tun_chr_write_iter+0x74/0x130
>>  do_iter_readv_writev+0x130/0x1ec
>>  do_iter_write+0xbc/0x1e0
>>  vfs_writev+0x13c/0x26c
>>
>> Add comparison between skb->data - skb_gro_offset() and skb->head
>> and exception handler before __skb_push() to fix the bug.
>>
>> Fixes: 86911732d399 ("gro: Avoid copying headers of unmerged packets")
>> Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
>> ---
>>  net/ipv6/ip6_offload.c | 4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/net/ipv6/ip6_offload.c b/net/ipv6/ip6_offload.c
>> index 3ee345672849..6659ccf25387 100644
>> --- a/net/ipv6/ip6_offload.c
>> +++ b/net/ipv6/ip6_offload.c
>> @@ -237,6 +237,10 @@ INDIRECT_CALLABLE_SCOPE struct sk_buff *ipv6_gro_receive(struct list_head *head,
>>                 proto = ipv6_gso_pull_exthdrs(skb, proto);
>>                 skb_gro_pull(skb, -skb_transport_offset(skb));
>>                 skb_reset_transport_header(skb);
>> +               if (unlikely(skb_headroom(skb) < skb_gro_offset(skb))) {
> 
> This makes no sense to me.
> 
> If there is a bug, it should be fixed earlier.

Would it be better to validate the IPv6 packet earlier in ipv6_gro_receive(), or even before that?

> 
>> +                       kfree_skb(skb);
>> +                       return ERR_PTR(-EINPROGRESS);
>> +               }
>>                 __skb_push(skb, skb_gro_offset(skb));
>>
>>                 ops = rcu_dereference(inet6_offloads[proto]);
>> --
>> 2.25.1
>>
> .
>
// https://syzkaller.appspot.com/bug?id=7646a2204f385fb619275e834be62c2a4c422f13
// autogenerated by syzkaller (https://github.com/google/syzkaller)

#define _GNU_SOURCE

#include <arpa/inet.h>
#include <endian.h>
#include <errno.h>
#include <fcntl.h>
#include <net/if.h>
#include <net/if_arp.h>
#include <netinet/in.h>
#include <sched.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mount.h>
#include <sys/prctl.h>
#include <sys/resource.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <sys/wait.h>
#include <unistd.h>

#include <linux/capability.h>
#include <linux/genetlink.h>
#include <linux/if_addr.h>
#include <linux/if_ether.h>
#include <linux/if_link.h>
#include <linux/if_tun.h>
#include <linux/in6.h>
#include <linux/ip.h>
#include <linux/neighbour.h>
#include <linux/net.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <linux/tcp.h>
#include <linux/veth.h>

#define BITMASK(bf_off, bf_len) (((1ull << (bf_len)) - 1) << (bf_off))
#define STORE_BY_BITMASK(type, htobe, addr, val, bf_off, bf_len)               \
  *(type*)(addr) =                                                             \
      htobe((htobe(*(type*)(addr)) & ~BITMASK((bf_off), (bf_len))) |           \
            (((type)(val) << (bf_off)) & BITMASK((bf_off), (bf_len))))

static bool write_file(const char* file, const char* what, ...)
{
  char buf[1024];
  va_list args;
  va_start(args, what);
  vsnprintf(buf, sizeof(buf), what, args);
  va_end(args);
  buf[sizeof(buf) - 1] = 0;
  int len = strlen(buf);
  int fd = open(file, O_WRONLY | O_CLOEXEC);
  if (fd == -1)
    return false;
  if (write(fd, buf, len) != len) {
    int err = errno;
    close(fd);
    errno = err;
    return false;
  }
  close(fd);
  return true;
}

struct nlmsg {
  char* pos;
  int nesting;
  struct nlattr* nested[8];
  char buf[1024];
};

static struct nlmsg nlmsg;

static void netlink_init(struct nlmsg* nlmsg, int typ, int flags,
                         const void* data, int size)
{
  memset(nlmsg, 0, sizeof(*nlmsg));
  struct nlmsghdr* hdr = (struct nlmsghdr*)nlmsg->buf;
  hdr->nlmsg_type = typ;
  hdr->nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK | flags;
  memcpy(hdr + 1, data, size);
  nlmsg->pos = (char*)(hdr + 1) + NLMSG_ALIGN(size);
}

static void netlink_attr(struct nlmsg* nlmsg, int typ, const void* data,
                         int size)
{
  struct nlattr* attr = (struct nlattr*)nlmsg->pos;
  attr->nla_len = sizeof(*attr) + size;
  attr->nla_type = typ;
  memcpy(attr + 1, data, size);
  nlmsg->pos += NLMSG_ALIGN(attr->nla_len);
}

static int netlink_send_ext(struct nlmsg* nlmsg, int sock, uint16_t reply_type,
                            int* reply_len)
{
  if (nlmsg->pos > nlmsg->buf + sizeof(nlmsg->buf) || nlmsg->nesting)
    exit(1);
  struct nlmsghdr* hdr = (struct nlmsghdr*)nlmsg->buf;
  hdr->nlmsg_len = nlmsg->pos - nlmsg->buf;
  struct sockaddr_nl addr;
  memset(&addr, 0, sizeof(addr));
  addr.nl_family = AF_NETLINK;
  unsigned n = sendto(sock, nlmsg->buf, hdr->nlmsg_len, 0,
                      (struct sockaddr*)&addr, sizeof(addr));
  if (n != hdr->nlmsg_len)
    exit(1);
  n = recv(sock, nlmsg->buf, sizeof(nlmsg->buf), 0);
  if (hdr->nlmsg_type == NLMSG_DONE) {
    *reply_len = 0;
    return 0;
  }
  if (n < sizeof(struct nlmsghdr))
    exit(1);
  if (reply_len && hdr->nlmsg_type == reply_type) {
    *reply_len = n;
    return 0;
  }
  if (n < sizeof(struct nlmsghdr) + sizeof(struct nlmsgerr))
    exit(1);
  if (hdr->nlmsg_type != NLMSG_ERROR)
    exit(1);
  return -((struct nlmsgerr*)(hdr + 1))->error;
}

static int netlink_send(struct nlmsg* nlmsg, int sock)
{
  return netlink_send_ext(nlmsg, sock, 0, NULL);
}

static int netlink_next_msg(struct nlmsg* nlmsg, unsigned int offset,
                            unsigned int total_len)
{
  struct nlmsghdr* hdr = (struct nlmsghdr*)(nlmsg->buf + offset);
  if (offset == total_len || offset + hdr->nlmsg_len > total_len)
    return -1;
  return hdr->nlmsg_len;
}

static void netlink_device_change(struct nlmsg* nlmsg, int sock,
                                  const char* name, bool up, const char* master,
                                  const void* mac, int macsize,
                                  const char* new_name)
{
  struct ifinfomsg hdr;
  memset(&hdr, 0, sizeof(hdr));
  if (up)
    hdr.ifi_flags = hdr.ifi_change = IFF_UP;
  hdr.ifi_index = if_nametoindex(name);
  netlink_init(nlmsg, RTM_NEWLINK, 0, &hdr, sizeof(hdr));
  if (new_name)
    netlink_attr(nlmsg, IFLA_IFNAME, new_name, strlen(new_name));
  if (master) {
    int ifindex = if_nametoindex(master);
    netlink_attr(nlmsg, IFLA_MASTER, &ifindex, sizeof(ifindex));
  }
  if (macsize)
    netlink_attr(nlmsg, IFLA_ADDRESS, mac, macsize);
  int err = netlink_send(nlmsg, sock);
  (void)err;
}

static int netlink_add_addr(struct nlmsg* nlmsg, int sock, const char* dev,
                            const void* addr, int addrsize)
{
  struct ifaddrmsg hdr;
  memset(&hdr, 0, sizeof(hdr));
  hdr.ifa_family = addrsize == 4 ? AF_INET : AF_INET6;
  hdr.ifa_prefixlen = addrsize == 4 ? 24 : 120;
  hdr.ifa_scope = RT_SCOPE_UNIVERSE;
  hdr.ifa_index = if_nametoindex(dev);
  netlink_init(nlmsg, RTM_NEWADDR, NLM_F_CREATE | NLM_F_REPLACE, &hdr,
               sizeof(hdr));
  netlink_attr(nlmsg, IFA_LOCAL, addr, addrsize);
  netlink_attr(nlmsg, IFA_ADDRESS, addr, addrsize);
  return netlink_send(nlmsg, sock);
}

static void netlink_add_addr4(struct nlmsg* nlmsg, int sock, const char* dev,
                              const char* addr)
{
  struct in_addr in_addr;
  inet_pton(AF_INET, addr, &in_addr);
  int err = netlink_add_addr(nlmsg, sock, dev, &in_addr, sizeof(in_addr));
  (void)err;
}

static void netlink_add_addr6(struct nlmsg* nlmsg, int sock, const char* dev,
                              const char* addr)
{
  struct in6_addr in6_addr;
  inet_pton(AF_INET6, addr, &in6_addr);
  int err = netlink_add_addr(nlmsg, sock, dev, &in6_addr, sizeof(in6_addr));
  (void)err;
}

static void netlink_add_neigh(struct nlmsg* nlmsg, int sock, const char* name,
                              const void* addr, int addrsize, const void* mac,
                              int macsize)
{
  struct ndmsg hdr;
  memset(&hdr, 0, sizeof(hdr));
  hdr.ndm_family = addrsize == 4 ? AF_INET : AF_INET6;
  hdr.ndm_ifindex = if_nametoindex(name);
  hdr.ndm_state = NUD_PERMANENT;
  netlink_init(nlmsg, RTM_NEWNEIGH, NLM_F_EXCL | NLM_F_CREATE, &hdr,
               sizeof(hdr));
  netlink_attr(nlmsg, NDA_DST, addr, addrsize);
  netlink_attr(nlmsg, NDA_LLADDR, mac, macsize);
  int err = netlink_send(nlmsg, sock);
  (void)err;
}

static int tunfd = -1;
static int tun_frags_enabled;

#define TUN_IFACE "syz_tun"

#define LOCAL_MAC 0xaaaaaaaaaaaa
#define REMOTE_MAC 0xaaaaaaaaaabb

#define LOCAL_IPV4 "172.20.20.170"
#define REMOTE_IPV4 "172.20.20.187"

#define LOCAL_IPV6 "fe80::aa"
#define REMOTE_IPV6 "fe80::bb"

#define IFF_NAPI 0x0010
#define IFF_NAPI_FRAGS 0x0020

static void initialize_tun(void)
{
  tunfd = open("/dev/net/tun", O_RDWR | O_NONBLOCK);
  if (tunfd == -1) {
    printf("tun: can't open /dev/net/tun: please enable CONFIG_TUN=y\n");
    printf("otherwise fuzzing or reproducing might not work as intended\n");
    return;
  }
  const int kTunFd = 240;
  if (dup2(tunfd, kTunFd) < 0)
    exit(1);
  close(tunfd);
  tunfd = kTunFd;
  struct ifreq ifr;
  memset(&ifr, 0, sizeof(ifr));
  strncpy(ifr.ifr_name, TUN_IFACE, IFNAMSIZ);
  ifr.ifr_flags = IFF_TAP | IFF_NO_PI | IFF_NAPI | IFF_NAPI_FRAGS;
  if (ioctl(tunfd, TUNSETIFF, (void*)&ifr) < 0) {
    ifr.ifr_flags = IFF_TAP | IFF_NO_PI;
    if (ioctl(tunfd, TUNSETIFF, (void*)&ifr) < 0)
      exit(1);
  }
  if (ioctl(tunfd, TUNGETIFF, (void*)&ifr) < 0)
    exit(1);
  tun_frags_enabled = (ifr.ifr_flags & IFF_NAPI_FRAGS) != 0;
  char sysctl[64];
//  sprintf(sysctl, "/proc/sys/net/ipv6/conf/%s/accept_dad", TUN_IFACE);
//  write_file(sysctl, "0");
//  sprintf(sysctl, "/proc/sys/net/ipv6/conf/%s/router_solicitations", TUN_IFACE);
//  write_file(sysctl, "0");
  int sock = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
  if (sock == -1)
    exit(1);
  netlink_add_addr4(&nlmsg, sock, TUN_IFACE, LOCAL_IPV4);
  netlink_add_addr6(&nlmsg, sock, TUN_IFACE, LOCAL_IPV6);
  uint64_t macaddr = REMOTE_MAC;
  struct in_addr in_addr;
  inet_pton(AF_INET, REMOTE_IPV4, &in_addr);
  netlink_add_neigh(&nlmsg, sock, TUN_IFACE, &in_addr, sizeof(in_addr),
                    &macaddr, ETH_ALEN);
  struct in6_addr in6_addr;
  inet_pton(AF_INET6, REMOTE_IPV6, &in6_addr);
  netlink_add_neigh(&nlmsg, sock, TUN_IFACE, &in6_addr, sizeof(in6_addr),
                    &macaddr, ETH_ALEN);
  macaddr = LOCAL_MAC;
  netlink_device_change(&nlmsg, sock, TUN_IFACE, true, 0, &macaddr, ETH_ALEN,
                        NULL);
  close(sock);
}

const int kInitNetNsFd = 239;

#define DEVLINK_FAMILY_NAME "devlink"

#define DEVLINK_CMD_PORT_GET 5
#define DEVLINK_CMD_RELOAD 37
#define DEVLINK_ATTR_BUS_NAME 1
#define DEVLINK_ATTR_DEV_NAME 2
#define DEVLINK_ATTR_NETDEV_NAME 7
#define DEVLINK_ATTR_NETNS_FD 138

static int netlink_devlink_id_get(struct nlmsg* nlmsg, int sock)
{
  struct genlmsghdr genlhdr;
  struct nlattr* attr;
  int err, n;
  uint16_t id = 0;
  memset(&genlhdr, 0, sizeof(genlhdr));
  genlhdr.cmd = CTRL_CMD_GETFAMILY;
  netlink_init(nlmsg, GENL_ID_CTRL, 0, &genlhdr, sizeof(genlhdr));
  netlink_attr(nlmsg, CTRL_ATTR_FAMILY_NAME, DEVLINK_FAMILY_NAME,
               strlen(DEVLINK_FAMILY_NAME) + 1);
  err = netlink_send_ext(nlmsg, sock, GENL_ID_CTRL, &n);
  if (err) {
    return -1;
  }
  attr = (struct nlattr*)(nlmsg->buf + NLMSG_HDRLEN +
                          NLMSG_ALIGN(sizeof(genlhdr)));
  for (; (char*)attr < nlmsg->buf + n;
       attr = (struct nlattr*)((char*)attr + NLMSG_ALIGN(attr->nla_len))) {
    if (attr->nla_type == CTRL_ATTR_FAMILY_ID) {
      id = *(uint16_t*)(attr + 1);
      break;
    }
  }
  if (!id) {
    return -1;
  }
  recv(sock, nlmsg->buf, sizeof(nlmsg->buf), 0); /* recv ack */
  return id;
}

static void netlink_devlink_netns_move(const char* bus_name,
                                       const char* dev_name, int netns_fd)
{
  struct genlmsghdr genlhdr;
  int sock;
  int id, err;
  sock = socket(AF_NETLINK, SOCK_RAW, NETLINK_GENERIC);
  if (sock == -1)
    exit(1);
  id = netlink_devlink_id_get(&nlmsg, sock);
  if (id == -1)
    goto error;
  memset(&genlhdr, 0, sizeof(genlhdr));
  genlhdr.cmd = DEVLINK_CMD_RELOAD;
  netlink_init(&nlmsg, id, 0, &genlhdr, sizeof(genlhdr));
  netlink_attr(&nlmsg, DEVLINK_ATTR_BUS_NAME, bus_name, strlen(bus_name) + 1);
  netlink_attr(&nlmsg, DEVLINK_ATTR_DEV_NAME, dev_name, strlen(dev_name) + 1);
  netlink_attr(&nlmsg, DEVLINK_ATTR_NETNS_FD, &netns_fd, sizeof(netns_fd));
  err = netlink_send(&nlmsg, sock);
  if (err) {
  }
error:
  close(sock);
}

static struct nlmsg nlmsg2;

static void initialize_devlink_ports(const char* bus_name, const char* dev_name,
                                     const char* netdev_prefix)
{
  struct genlmsghdr genlhdr;
  int len, total_len, id, err, offset;
  uint16_t netdev_index;
  int sock = socket(AF_NETLINK, SOCK_RAW, NETLINK_GENERIC);
  if (sock == -1)
    exit(1);
  int rtsock = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
  if (rtsock == -1)
    exit(1);
  id = netlink_devlink_id_get(&nlmsg, sock);
  if (id == -1)
    goto error;
  memset(&genlhdr, 0, sizeof(genlhdr));
  genlhdr.cmd = DEVLINK_CMD_PORT_GET;
  netlink_init(&nlmsg, id, NLM_F_DUMP, &genlhdr, sizeof(genlhdr));
  netlink_attr(&nlmsg, DEVLINK_ATTR_BUS_NAME, bus_name, strlen(bus_name) + 1);
  netlink_attr(&nlmsg, DEVLINK_ATTR_DEV_NAME, dev_name, strlen(dev_name) + 1);
  err = netlink_send_ext(&nlmsg, sock, id, &total_len);
  if (err) {
    goto error;
  }
  offset = 0;
  netdev_index = 0;
  while ((len = netlink_next_msg(&nlmsg, offset, total_len)) != -1) {
    struct nlattr* attr = (struct nlattr*)(nlmsg.buf + offset + NLMSG_HDRLEN +
                                           NLMSG_ALIGN(sizeof(genlhdr)));
    for (; (char*)attr < nlmsg.buf + offset + len;
         attr = (struct nlattr*)((char*)attr + NLMSG_ALIGN(attr->nla_len))) {
      if (attr->nla_type == DEVLINK_ATTR_NETDEV_NAME) {
        char* port_name;
        char netdev_name[IFNAMSIZ];
        port_name = (char*)(attr + 1);
        snprintf(netdev_name, sizeof(netdev_name), "%s%d", netdev_prefix,
                 netdev_index);
        netlink_device_change(&nlmsg2, rtsock, port_name, true, 0, 0, 0,
                              netdev_name);
        break;
      }
    }
    offset += len;
    netdev_index++;
  }
error:
  close(rtsock);
  close(sock);
}

static void initialize_devlink_pci(void)
{
  int netns = open("/proc/self/ns/net", O_RDONLY);
  if (netns == -1)
    exit(1);
  int ret = setns(kInitNetNsFd, 0);
  if (ret == -1)
    exit(1);
  netlink_devlink_netns_move("pci", "0000:00:10.0", netns);
  ret = setns(netns, 0);
  if (ret == -1)
    exit(1);
  close(netns);
  initialize_devlink_ports("pci", "0000:00:10.0", "netpci");
}

#define MAX_FRAGS 4
struct vnet_fragmentation {
  uint32_t full;
  uint32_t count;
  uint32_t frags[MAX_FRAGS];
};

static long syz_emit_ethernet(volatile long a0, volatile long a1,
                              volatile long a2)
{
  if (tunfd < 0)
    return (uintptr_t)-1;
  uint32_t length = a0;
  char* data = (char*)a1;
  struct vnet_fragmentation* frags = (struct vnet_fragmentation*)a2;
  struct iovec vecs[MAX_FRAGS + 1];
  uint32_t nfrags = 0;
  if (!tun_frags_enabled || frags == NULL) {
    vecs[nfrags].iov_base = data;
    vecs[nfrags].iov_len = length;
    nfrags++;
  } else {
    bool full = true;
    uint32_t i, count = 0;
    full = frags->full;
    count = frags->count;
    if (count > MAX_FRAGS)
      count = MAX_FRAGS;
    for (i = 0; i < count && length != 0; i++) {
      uint32_t size = 0;
      size = frags->frags[i];
      if (size > length)
        size = length;
      vecs[nfrags].iov_base = data;
      vecs[nfrags].iov_len = size;
      nfrags++;
      data += size;
      length -= size;
    }
    if (length != 0 && (full || nfrags == 0)) {
      vecs[nfrags].iov_base = data;
      vecs[nfrags].iov_len = length;
      nfrags++;
    }
  }
  return writev(tunfd, vecs, nfrags);
}

static void setup_common()
{
  if (mount(0, "/sys/fs/fuse/connections", "fusectl", 0, 0)) {
  }
}

static void loop();

static void sandbox_common()
{
  prctl(PR_SET_PDEATHSIG, SIGKILL, 0, 0, 0);
  setpgrp();
  setsid();
  int netns = open("/proc/self/ns/net", O_RDONLY);
  if (netns == -1)
    exit(1);
  if (dup2(netns, kInitNetNsFd) < 0)
    exit(1);
  close(netns);
}

int wait_for_loop(int pid)
{
  if (pid < 0)
    exit(1);
  int status = 0;
  while (waitpid(-1, &status, __WALL) != pid) {
  }
  return WEXITSTATUS(status);
}

static void drop_caps(void)
{
  struct __user_cap_header_struct cap_hdr = {};
  struct __user_cap_data_struct cap_data[2] = {};
  cap_hdr.version = _LINUX_CAPABILITY_VERSION_3;
  cap_hdr.pid = getpid();
  if (syscall(SYS_capget, &cap_hdr, &cap_data))
    exit(1);
  const int drop = (1 << CAP_SYS_PTRACE) | (1 << CAP_SYS_NICE);
  cap_data[0].effective &= ~drop;
  cap_data[0].permitted &= ~drop;
  cap_data[0].inheritable &= ~drop;
  if (syscall(SYS_capset, &cap_hdr, &cap_data))
    exit(1);
}

static int do_sandbox_none(void)
{
  sandbox_common();
  initialize_tun();
  sleep(5);
  loop();
  exit(1);
}

void loop(void)
{
  *(uint32_t*)0x2001d000 = 1;
  *(uint32_t*)0x2001d004 = 0x70;
  *(uint8_t*)0x2001d008 = 0;
  *(uint8_t*)0x2001d009 = 0;
  *(uint8_t*)0x2001d00a = 0;
  *(uint8_t*)0x2001d00b = 0;
  *(uint32_t*)0x2001d00c = 0;
  *(uint64_t*)0x2001d010 = 0xe;
  *(uint64_t*)0x2001d018 = 0;
  *(uint64_t*)0x2001d020 = 0;
  STORE_BY_BITMASK(uint64_t, , 0x2001d028, 0, 0, 1);
  STORE_BY_BITMASK(uint64_t, , 0x2001d028, 0, 1, 1);
  STORE_BY_BITMASK(uint64_t, , 0x2001d028, 0, 2, 1);
  STORE_BY_BITMASK(uint64_t, , 0x2001d028, 0, 3, 1);
  STORE_BY_BITMASK(uint64_t, , 0x2001d028, 0, 4, 1);
  STORE_BY_BITMASK(uint64_t, , 0x2001d028, 0x81, 5, 1);
  STORE_BY_BITMASK(uint64_t, , 0x2001d028, 0, 6, 1);
  STORE_BY_BITMASK(uint64_t, , 0x2001d028, 0, 7, 1);
  STORE_BY_BITMASK(uint64_t, , 0x2001d028, 0, 8, 1);
  STORE_BY_BITMASK(uint64_t, , 0x2001d028, 0, 9, 1);
  STORE_BY_BITMASK(uint64_t, , 0x2001d028, 0, 10, 1);
  STORE_BY_BITMASK(uint64_t, , 0x2001d028, 0, 11, 1);
  STORE_BY_BITMASK(uint64_t, , 0x2001d028, 0, 12, 1);
  STORE_BY_BITMASK(uint64_t, , 0x2001d028, 0, 13, 1);
  STORE_BY_BITMASK(uint64_t, , 0x2001d028, 0, 14, 1);
  STORE_BY_BITMASK(uint64_t, , 0x2001d028, 3, 15, 2);
  STORE_BY_BITMASK(uint64_t, , 0x2001d028, 0, 17, 1);
  STORE_BY_BITMASK(uint64_t, , 0x2001d028, 0, 18, 1);
  STORE_BY_BITMASK(uint64_t, , 0x2001d028, 0, 19, 1);
  STORE_BY_BITMASK(uint64_t, , 0x2001d028, 0, 20, 1);
  STORE_BY_BITMASK(uint64_t, , 0x2001d028, 0, 21, 1);
  STORE_BY_BITMASK(uint64_t, , 0x2001d028, 0, 22, 1);
  STORE_BY_BITMASK(uint64_t, , 0x2001d028, 0, 23, 1);
  STORE_BY_BITMASK(uint64_t, , 0x2001d028, 0, 24, 1);
  STORE_BY_BITMASK(uint64_t, , 0x2001d028, 0, 25, 1);
  STORE_BY_BITMASK(uint64_t, , 0x2001d028, 0, 26, 1);
  STORE_BY_BITMASK(uint64_t, , 0x2001d028, 0, 27, 1);
  STORE_BY_BITMASK(uint64_t, , 0x2001d028, 0, 28, 1);
  STORE_BY_BITMASK(uint64_t, , 0x2001d028, 0, 29, 35);
  *(uint32_t*)0x2001d030 = 0;
  *(uint32_t*)0x2001d034 = 0;
  *(uint64_t*)0x2001d038 = 0;
  *(uint64_t*)0x2001d040 = 0xd429c0fd682991c2;
  *(uint64_t*)0x2001d048 = 0x10010;
  *(uint64_t*)0x2001d050 = 0;
  *(uint32_t*)0x2001d058 = 0;
  *(uint32_t*)0x2001d05c = 0;
  *(uint64_t*)0x2001d060 = 0;
  *(uint32_t*)0x2001d068 = 0;
  *(uint16_t*)0x2001d06c = 0;
  *(uint16_t*)0x2001d06e = 0;
//  syscall(__NR_perf_event_open, 0x2001d000ul, 0, -1ul, -1, 0ul);
  memcpy((void*)0x20000200,
         "\xaa\xaa\xae\xaa\xaa\xaa\x00\x00\x00\x00\x00\x00\x86\xdd\x60\xb4\x09"
         "\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00\x06\x00\x00\x00\x00\xff\xff"
         "\xe0\x00\x00\x02\x3e\x02\x00\x00\x00\x00\x00\x00\x01\x83\x00\x90\x78"
         "\x00\x09\x04\x00\x60\xb6\x80\xde\x00\x00\x00\x00\x00\x00\x00\x00\x00"
         "\x00\x00\x00\x00\x00\xff\xff\xff\xff\xff\xff\x00\x00\x00\x00\x00\x00"
         "\x00\x00\x00\x00\xff\xff\xac\x14\x08\xbb\x00\x00\x00\x00\x00\x00\x00"
         "\x7c\x69\x58\xe0\xd0\x9d\xd0\x89\xa5\x3b\xf2\x09\x61\xee\x5f\xcf\xd5"
         "\xb4\xcc\xb5\xdf\xbd\x8c\x7c\x57\xb7\x12\x15\xde\x6e\x84\xb0\xfb\x52"
         "\x04\x4c\x5d\x87\x91\x51\xe5\x30\x12\xa0\x3f\x23\x97\x17\xb2\x41\x26"
         "\x6b\x92\x79\x20\x0e\x80\xd1\x49\x83\x18\x7e\x9f\xc4\xf4\x20\x1e\x92"
         "\x34\x07\x98\x9a\xd1\x35\xbd\x20\x67\xe5\x6a\xb9\xf4\x38\xb3\x39\x75"
         "\xe4\xb8\x01\x77\x25\x6f\xf1\x25\x82\x20\xca\x36\xfe\xb6\x71\x41\x40"
         "\xf4\x7f\x85\x7d\x7e\x6e\xee\xd8\x6f\x7b\x11\xc5\x12\x3b\xdb\x6f\x07"
         "\x00\x4b\x3f\x04\x57\x96\x0e\xcc\x91\x3e\xdf\xef\xea\x0a\x39\x8b\x4d"
         "\x0d\xde\xd0\x8e\x8e\x5e",
         244);
  syz_emit_ethernet(0x207843, 0x20000200, 0);
}
int main(void)
{
  syscall(__NR_mmap, 0x20000000ul, 0x1000000ul, 3ul, 0x32ul, -1, 0);
  do_sandbox_none();
  return 0;
}
Eric Dumazet Oct. 27, 2022, 1:39 p.m. UTC | #3
On Thu, Oct 27, 2022 at 6:01 AM Ziyang Xuan (William)
<william.xuanziyang@huawei.com> wrote:
>
> > On Thu, Oct 27, 2022 at 3:25 AM Ziyang Xuan
> > <william.xuanziyang@huawei.com> wrote:
> >>
> >> IPv6 packets without NEXTHDR_NONE extension header can make continuous
> >> __skb_pull() until pskb_may_pull() failed in ipv6_gso_pull_exthdrs().
> >> That results in a big value of skb_gro_offset(), and after __skb_push()
> >> in ipv6_gro_receive(), skb->data will less than skb->head, an out of
> >> bounds memory bug occurs. That will trigger the problem as following:
> >>
> >> ==================================================================
> >> BUG: KASAN: use-after-free in eth_type_trans+0x100/0x260
> >> ...
> >> Call trace:
> >>  dump_backtrace+0xd8/0x130
> >>  show_stack+0x1c/0x50
> >>  dump_stack_lvl+0x64/0x7c
> >>  print_address_description.constprop.0+0xbc/0x2e8
> >>  print_report+0x100/0x1e4
> >>  kasan_report+0x80/0x120
> >>  __asan_load8+0x78/0xa0
> >>  eth_type_trans+0x100/0x260
> >
> > Crash happens from eth_type_trans() , this should happen before
> > ipv6_gro_receive() ?
> >
> > It seems your patch is unrelated.
> >
> > Please provide a repro.
>
> C repro put in attachment.

This seems to be a bug in tun device.

Please take more time to root-cause this issue, instead of adding
workarounds all over the place.

Thanks.

>
> >
> >
> >>  napi_gro_frags+0x164/0x550
> >>  tun_get_user+0xda4/0x1270
> >>  tun_chr_write_iter+0x74/0x130
> >>  do_iter_readv_writev+0x130/0x1ec
> >>  do_iter_write+0xbc/0x1e0
> >>  vfs_writev+0x13c/0x26c
> >>
> >> Add comparison between skb->data - skb_gro_offset() and skb->head
> >> and exception handler before __skb_push() to fix the bug.
> >>
> >> Fixes: 86911732d399 ("gro: Avoid copying headers of unmerged packets")
> >> Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
> >> ---
> >>  net/ipv6/ip6_offload.c | 4 ++++
> >>  1 file changed, 4 insertions(+)
> >>
> >> diff --git a/net/ipv6/ip6_offload.c b/net/ipv6/ip6_offload.c
> >> index 3ee345672849..6659ccf25387 100644
> >> --- a/net/ipv6/ip6_offload.c
> >> +++ b/net/ipv6/ip6_offload.c
> >> @@ -237,6 +237,10 @@ INDIRECT_CALLABLE_SCOPE struct sk_buff *ipv6_gro_receive(struct list_head *head,
> >>                 proto = ipv6_gso_pull_exthdrs(skb, proto);
> >>                 skb_gro_pull(skb, -skb_transport_offset(skb));
> >>                 skb_reset_transport_header(skb);
> >> +               if (unlikely(skb_headroom(skb) < skb_gro_offset(skb))) {
> >
> > This makes no sense to me.
> >
> > If there is a bug, it should be fixed earlier.
>
> Maybe it is good to validate IPv6 packet earlier in ipv6_gro_receive() or more earlier?
>
> >
> >> +                       kfree_skb(skb);
> >> +                       return ERR_PTR(-EINPROGRESS);
> >> +               }
> >>                 __skb_push(skb, skb_gro_offset(skb));
> >>
> >>                 ops = rcu_dereference(inet6_offloads[proto]);
> >> --
> >> 2.25.1
> >>
> > .
> >
Ziyang Xuan (William) Oct. 28, 2022, 10:11 a.m. UTC | #4
> On Thu, Oct 27, 2022 at 6:01 AM Ziyang Xuan (William)
> <william.xuanziyang@huawei.com> wrote:
>>
>>> On Thu, Oct 27, 2022 at 3:25 AM Ziyang Xuan
>>> <william.xuanziyang@huawei.com> wrote:
>>>>
>>>> IPv6 packets without NEXTHDR_NONE extension header can make continuous
>>>> __skb_pull() until pskb_may_pull() failed in ipv6_gso_pull_exthdrs().
>>>> That results in a big value of skb_gro_offset(), and after __skb_push()
>>>> in ipv6_gro_receive(), skb->data will less than skb->head, an out of
>>>> bounds memory bug occurs. That will trigger the problem as following:
>>>>
>>>> ==================================================================
>>>> BUG: KASAN: use-after-free in eth_type_trans+0x100/0x260
>>>> ...
>>>> Call trace:
>>>>  dump_backtrace+0xd8/0x130
>>>>  show_stack+0x1c/0x50
>>>>  dump_stack_lvl+0x64/0x7c
>>>>  print_address_description.constprop.0+0xbc/0x2e8
>>>>  print_report+0x100/0x1e4
>>>>  kasan_report+0x80/0x120
>>>>  __asan_load8+0x78/0xa0
>>>>  eth_type_trans+0x100/0x260
>>>
>>> Crash happens from eth_type_trans() , this should happen before
>>> ipv6_gro_receive() ?
>>>
>>> It seems your patch is unrelated.
>>>
>>> Please provide a repro.
>>
>> C repro put in attachment.
> 
> This seems to be a bug in tun device.
> 
> Please take more time to root cause this issue, instead of adding work
> arounds all over the place.

Hi Eric,

Thank you for your suggestion.

I have analyzed the problem more deeply. The unusual IPv6 packet and
the large packet length value (IPv6 payload length greater than 65535)
together cause the problem.

skb->network_header and skb->transport_header are both of type u16.
They overflow during ipv6_gro_receive() processing, which produces
a wrong value for __skb_push(skb, value).

So the problem is a bug in the tun device.

I will combine this with my previous patch "net: tun: limit first seg size to avoid oversized linearization"
and send a fix later.

Thanks.

> 
> Thanks.
> 
>>
>>>
>>>
>>>>  napi_gro_frags+0x164/0x550
>>>>  tun_get_user+0xda4/0x1270
>>>>  tun_chr_write_iter+0x74/0x130
>>>>  do_iter_readv_writev+0x130/0x1ec
>>>>  do_iter_write+0xbc/0x1e0
>>>>  vfs_writev+0x13c/0x26c
>>>>
>>>> Add comparison between skb->data - skb_gro_offset() and skb->head
>>>> and exception handler before __skb_push() to fix the bug.
>>>>
>>>> Fixes: 86911732d399 ("gro: Avoid copying headers of unmerged packets")
>>>> Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
>>>> ---
>>>>  net/ipv6/ip6_offload.c | 4 ++++
>>>>  1 file changed, 4 insertions(+)
>>>>
>>>> diff --git a/net/ipv6/ip6_offload.c b/net/ipv6/ip6_offload.c
>>>> index 3ee345672849..6659ccf25387 100644
>>>> --- a/net/ipv6/ip6_offload.c
>>>> +++ b/net/ipv6/ip6_offload.c
>>>> @@ -237,6 +237,10 @@ INDIRECT_CALLABLE_SCOPE struct sk_buff *ipv6_gro_receive(struct list_head *head,
>>>>                 proto = ipv6_gso_pull_exthdrs(skb, proto);
>>>>                 skb_gro_pull(skb, -skb_transport_offset(skb));
>>>>                 skb_reset_transport_header(skb);
>>>> +               if (unlikely(skb_headroom(skb) < skb_gro_offset(skb))) {
>>>
>>> This makes no sense to me.
>>>
>>> If there is a bug, it should be fixed earlier.
>>
>> Maybe it is good to validate IPv6 packet earlier in ipv6_gro_receive() or more earlier?
>>
>>>
>>>> +                       kfree_skb(skb);
>>>> +                       return ERR_PTR(-EINPROGRESS);
>>>> +               }
>>>>                 __skb_push(skb, skb_gro_offset(skb));
>>>>
>>>>                 ops = rcu_dereference(inet6_offloads[proto]);
>>>> --
>>>> 2.25.1
>>>>
>>> .
>>>
> .
>
Eric Dumazet Oct. 28, 2022, 12:40 p.m. UTC | #5
On Fri, Oct 28, 2022 at 3:11 AM Ziyang Xuan (William)
<william.xuanziyang@huawei.com> wrote:
> Hi Eric,
>
> Thank you for your suggestion.
>
> I have analyzed the problem more deeply. The unusual IPv6 packet and
> the large packet length value (IPv6 payload length greater than 65535)
> together cause the problem.
>
> skb->network_header and skb->transport_header are both of type u16.
> They overflow during ipv6_gro_receive() processing, which produces
> a wrong value for __skb_push(skb, value).
>
> So the problem is a bug in the tun device.
>
> I will combine this with my previous patch "net: tun: limit first seg size to avoid oversized linearization"
> and send a fix later.

SGTM, thanks !
Patch

diff --git a/net/ipv6/ip6_offload.c b/net/ipv6/ip6_offload.c
index 3ee345672849..6659ccf25387 100644
--- a/net/ipv6/ip6_offload.c
+++ b/net/ipv6/ip6_offload.c
@@ -237,6 +237,10 @@  INDIRECT_CALLABLE_SCOPE struct sk_buff *ipv6_gro_receive(struct list_head *head,
 		proto = ipv6_gso_pull_exthdrs(skb, proto);
 		skb_gro_pull(skb, -skb_transport_offset(skb));
 		skb_reset_transport_header(skb);
+		if (unlikely(skb_headroom(skb) < skb_gro_offset(skb))) {
+			kfree_skb(skb);
+			return ERR_PTR(-EINPROGRESS);
+		}
 		__skb_push(skb, skb_gro_offset(skb));
 
 		ops = rcu_dereference(inet6_offloads[proto]);