Message ID | 13aed72.61c7.17853a6a5cd.Coremail.linma@zju.edu.cn (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | BUG: Out of bounds read in hci_le_ext_adv_report_evt() | expand |
Hi, On Sun, Mar 21, 2021 at 12:19 AM 马麟 <linma@zju.edu.cn> wrote: > > Hi there: > > Our team, zjublocksec, found the following problem during fuzzing, which seems undiscovered in previous. > > ==== Basic Information ========================= > > HEAD commit: 1e28eed17697bcf343c6743f0028cc3b5dd88bf0 (tag: v5.12-rc3, master) > Kernel config: refer to attached file (config) > C POC code: refer to attached file (poc.c) > > ==== KASAN Output ========================= > > [ 20.294394] BUG: KASAN: slab-out-of-bounds in hci_le_meta_evt+0x310b/0x3850 > [ 20.300333] Read of size 2 at addr ffff888013805819 by task kworker/u5:0/53 > [ 20.306227] > [ 20.307601] CPU: 0 PID: 53 Comm: kworker/u5:0 Not tainted 5.12.0-rc3+ #5 > [ 20.313304] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 > [ 20.323006] Workqueue: hci0 hci_rx_work > [ 20.326303] Call Trace: > [ 20.328466] dump_stack+0xdd/0x137 > [ 20.331425] ? hci_le_meta_evt+0x310b/0x3850 > [ 20.335099] ? hci_le_meta_evt+0x310b/0x3850 > [ 20.338773] print_address_description.constprop.0+0x18/0x130 > [ 20.343697] ? hci_le_meta_evt+0x310b/0x3850 > [ 20.347383] ? hci_le_meta_evt+0x310b/0x3850 > [ 20.351059] kasan_report.cold+0x7f/0x111 > [ 20.354512] ? hci_le_meta_evt+0x310b/0x3850 > [ 20.358187] hci_le_meta_evt+0x310b/0x3850 > [ 20.361722] ? run_timer_softirq+0x120/0x120 > [ 20.365402] ? queue_work_on+0x69/0xa0 > [ 20.368654] ? del_timer+0xb6/0x100 > [ 20.371673] ? kasan_set_track+0x1c/0x30 > [ 20.375062] ? le_conn_complete_evt+0x16e0/0x16e0 > [ 20.379092] ? skb_release_data+0x519/0x610 > [ 20.382686] ? kfree+0x91/0x270 > [ 20.385413] ? kasan_set_track+0x1c/0x30 > [ 20.388797] ? mutex_lock+0x89/0xd0 > [ 20.391835] ? __mutex_lock_slowpath+0x10/0x10 > [ 20.395651] ? hci_event_packet+0x436/0xa100 > [ 20.399327] ? bt_dbg+0xe1/0x130 > [ 20.402118] hci_event_packet+0x3213/0xa100 > [ 20.405712] ? _raw_write_lock_irqsave+0xd0/0xd0 > [ 20.409672] ? bt_dbg+0xe1/0x130 > [ 20.412489] ? bt_dbg+0xe1/0x130 > [ 20.415304] ? bt_err_ratelimited+0x140/0x140 > [ 20.419059] ? hci_cmd_status_evt+0x46a0/0x46a0 > [ 20.422955] ? bt_dbg+0xe1/0x130 > [ 20.425754] ? bt_err_ratelimited+0x50/0x140 > [ 20.429429] ? __wake_up_common_lock+0xde/0x130 > [ 20.433333] ? __wake_up_common+0x5d0/0x5d0 > [ 20.436926] ? _raw_spin_lock_irqsave+0x7b/0xd0 > [ 20.440844] ? hci_chan_sent+0x23/0x800 > [ 20.444167] ? __sanitizer_cov_trace_switch+0x50/0x90 > [ 20.448504] ? _raw_spin_lock_irqsave+0x7b/0xd0 > [ 20.452396] ? bt_dbg+0xe1/0x130 > [ 20.455205] ? bt_err_ratelimited+0x140/0x140 > [ 20.458961] ? _raw_spin_lock_irqsave+0x7b/0xd0 > [ 20.462847] ? _raw_write_lock_irqsave+0xd0/0xd0 > [ 20.466832] ? copy_fpregs_to_fpstate+0x14f/0x1d0 > [ 20.470904] hci_rx_work+0x2b9/0x8e0 > [ 20.473993] ? strscpy+0xa0/0x2a0 > [ 20.476905] process_one_work+0x747/0xfe0 > [ 20.480392] ? kthread_data+0x4f/0xc0 > [ 20.483561] worker_thread+0x641/0x1190 > [ 20.486883] ? rescuer_thread+0xd00/0xd00 > [ 20.490332] kthread+0x344/0x410 > [ 20.493127] ? kthread_create_worker_on_cpu+0xf0/0xf0 > [ 20.497457] ret_from_fork+0x22/0x30 > [ 20.500563] > [ 20.501919] Allocated by task 223: > [ 20.504882] kasan_save_stack+0x1b/0x40 > [ 20.508212] __kasan_kmalloc+0x7a/0x90 > [ 20.511459] load_elf_phdrs+0x103/0x210 > [ 20.514763] load_elf_binary+0x1dc/0x4dd0 > [ 20.518220] bprm_execve+0x741/0x1460 > [ 20.521401] do_execveat_common+0x621/0x7c0 > [ 20.525013] __x64_sys_execve+0x8f/0xc0 > [ 20.528354] do_syscall_64+0x33/0x40 > [ 20.531465] entry_SYSCALL_64_after_hwframe+0x44/0xae > [ 20.535791] > [ 20.537154] The buggy address belongs to the object at ffff888013805600 > [ 20.537154] which belongs to the cache kmalloc-512 of size 512 > [ 20.547717] The buggy address is located 25 bytes to the right of > [ 20.547717] 512-byte region [ffff888013805600, ffff888013805800) > [ 20.557963] The buggy address belongs to the page: > [ 20.562066] page:00000000ef0b1214 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888013802000 pfn:0x13800 > [ 20.571028] head:00000000ef0b1214 order:3 compound_mapcount:0 compound_pincount:0 > [ 20.577371] flags: 0x100000000010200(slab|head) > [ 20.581264] raw: 0100000000010200 ffff888006441450 ffffea0000473408 ffff888006443940 > [ 20.587833] raw: ffff888013802000 000000000015000c 00000001ffffffff 0000000000000000 > [ 20.594388] page dumped because: kasan: bad access detected > [ 20.599157] > [ 20.600503] Memory state around the buggy address: > [ 20.604598] ffff888013805700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > [ 20.610722] ffff888013805780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > [ 20.616835] >ffff888013805800: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc > [ 20.622963] ^ > [ 20.626401] ffff888013805880: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc > [ 20.632507] ffff888013805900: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc > > ==== Bug Analysis ========================= > > In fact, this out-of-bounds read is quite similar to an old found bug (KASAN: out-of-bounds read in hci_le_direct_adv_report_evt). You can check this link to get useful information: https://groups.google.com/g/syzkaller-bugs/c/Z9-x9udEIxk/m/0NsClcU4BAAJ > > Anyhow, the buggy code for this time is shown below: > > static void hci_le_ext_adv_report_evt(struct hci_dev *hdev, struct sk_buff *skb) > { > u8 num_reports = skb->data[0]; > void *ptr = &skb->data[1]; > > hci_dev_lock(hdev); > > while (num_reports--) { > struct hci_ev_le_ext_adv_report *ev = ptr; > u8 legacy_evt_type; > u16 evt_type; > > evt_type = __le16_to_cpu(ev->evt_type); > legacy_evt_type = ext_evt_type_to_legacy(hdev, evt_type); > if (legacy_evt_type != LE_ADV_INVALID) { > process_adv_report(hdev, legacy_evt_type, &ev->bdaddr, > ev->bdaddr_type, NULL, 0, ev->rssi, > ev->data, ev->length, > !(evt_type & LE_EXT_ADV_LEGACY_PDU)); > } > > ptr += sizeof(*ev) + ev->length; > } > > hci_dev_unlock(hdev); > } > > As you can see, the variable `num_reports` is not being properly checked. The malformed event packet can fake a huge `num_reports` and cause `process_adv_report` to access invalid memory space. Yeah, the internal of this bug is almost equivalent to the already found bug. > > ==== Suggested Patch ========================= > > As this bug is quite similar to that found one, it's recommended to adopt a similar patch here like below (also in the attached file: patch.diff). > > --- a/net/bluetooth/hci_event.c > +++ b/net/bluetooth/hci_event.c > @@ -5685,10 +5685,14 @@ static void hci_le_ext_adv_report_evt(struct hci_dev *hdev, struct sk_buff *skb) > { > u8 num_reports = skb->data[0]; > void *ptr = &skb->data[1]; > + u32 len_processed = 0; > > hci_dev_lock(hdev); > > while (num_reports--) { > + if (len_processed > skb->len) > + break; > + > struct hci_ev_le_ext_adv_report *ev = ptr; > u8 legacy_evt_type; > u16 evt_type; > @@ -5703,6 +5707,7 @@ static void hci_le_ext_adv_report_evt(struct hci_dev *hdev, struct sk_buff *skb) > } > > ptr += sizeof(*ev) + ev->length; > + len_processed += sizeof(*ev) + ev->length; > } > > hci_dev_unlock(hdev); > > The idea here is just to prevent the `ptr` to go over bound of the `skb->len`. After testing, the reproducer code will not work out against this fix. :) Or we do something like https://lore.kernel.org/linux-bluetooth/20201024002251.1389267-1-luiz.dentz@gmail.com/, that said the reason we didn't applied my patches was that the controller would be the one generating invalid data, but it seems you are reproducing with vhci controller which is only used for emulating a controller and requires root privileges so it is unlikely these conditions would happens with hardware itself, in the other hand as there seems to be more and more reports using vhci to emulate broken events it perhaps more productive to introduce proper checks for all events so we don't have to deal with more reports like this in the future. > ==== Others ========================= > > Please let me know if there is any confuses. > Best wishes!
Hi, Den mån 22 mars 2021 kl 00:01 skrev Luiz Augusto von Dentz <luiz.dentz@gmail.com>: > Or we do something like > https://lore.kernel.org/linux-bluetooth/20201024002251.1389267-1-luiz.dentz@gmail.com/, > that said the reason we didn't applied my patches was that the > controller would be the one generating invalid data, but it seems you > are reproducing with vhci controller which is only used for emulating > a controller and requires root privileges so it is unlikely these > conditions would happens with hardware itself, in the other hand as > there seems to be more and more reports using vhci to emulate broken > events it perhaps more productive to introduce proper checks for all > events so we don't have to deal with more reports like this in the > future. Keep in mind that when using the H4 uart protocol without any error correction (as H5 has), it is possible that random bit errors occur on the wire. I wouldn't like my kernel to crash due to this. Bit errors happen all the time on RPi 4 for example at the default baud rate if you just do some heavy stress testing, or use an application that transfers a lot of data over Bluetooth. /Emil
Hi Emil, On Sun, Mar 21, 2021 at 4:23 PM Emil Lenngren <emil.lenngren@gmail.com> wrote: > > Hi, > > Den mån 22 mars 2021 kl 00:01 skrev Luiz Augusto von Dentz > <luiz.dentz@gmail.com>: > > Or we do something like > > https://lore.kernel.org/linux-bluetooth/20201024002251.1389267-1-luiz.dentz@gmail.com/, > > that said the reason we didn't applied my patches was that the > > controller would be the one generating invalid data, but it seems you > > are reproducing with vhci controller which is only used for emulating > > a controller and requires root privileges so it is unlikely these > > conditions would happens with hardware itself, in the other hand as > > there seems to be more and more reports using vhci to emulate broken > > events it perhaps more productive to introduce proper checks for all > > events so we don't have to deal with more reports like this in the > > future. > > Keep in mind that when using the H4 uart protocol without any error > correction (as H5 has), it is possible that random bit errors occur on > the wire. I wouldn't like my kernel to crash due to this. Bit errors > happen all the time on RPi 4 for example at the default baud rate if > you just do some heavy stress testing, or use an application that > transfers a lot of data over Bluetooth. While we can catch some errors like that, and possible avoid crashes, this should be limited to just boundary checks and not actually error correction, that I'm afraid is out of our hands since we can still receive an event that does match the original packet size but meant something else which may break the synchronization of the states between the controller and the host, also perhaps we need to notify this type of error since even if we start discarding the events that can possible cause states to be out of sync and the controller will need to be reset in order to recover.
--- a/net/bluetooth/hci_event.c +++ b/net/bluetooth/hci_event.c @@ -5685,10 +5685,14 @@ static void hci_le_ext_adv_report_evt(struct hci_dev *hdev, struct sk_buff *skb) { u8 num_reports = skb->data[0]; void *ptr = &skb->data[1]; + u32 len_processed = 0; hci_dev_lock(hdev); while (num_reports--) { + if (len_processed > skb->len) + break; + struct hci_ev_le_ext_adv_report *ev = ptr; u8 legacy_evt_type; u16 evt_type; @@ -5703,6 +5707,7 @@ static void hci_le_ext_adv_report_evt(struct hci_dev *hdev, struct sk_buff *skb) } ptr += sizeof(*ev) + ev->length; + len_processed += sizeof(*ev) + ev->length; } hci_dev_unlock(hdev);