Message ID: 20200929183513.380760-1-alex.popov@linux.com (mailing list archive)
Series: Break heap spraying needed for exploiting use-after-free
Hello! I have some performance numbers. Please see below.

On 29.09.2020 21:35, Alexander Popov wrote:
> Hello everyone! Requesting your comments.
>
> This is the second version of the heap quarantine prototype for the Linux
> kernel. I performed a deeper evaluation of its security properties and
> developed new features like quarantine randomization and integration with
> init_on_free. That is fun! See below for more details.
>
>
> Rationale
> =========
>
> Use-after-free vulnerabilities in the Linux kernel are very popular for
> exploitation. There are many examples, some of them:
> https://googleprojectzero.blogspot.com/2018/09/a-cache-invalidation-bug-in-linux.html
> https://googleprojectzero.blogspot.com/2019/11/bad-binder-android-in-wild-exploit.html?m=1
> https://a13xp0p0v.github.io/2020/02/15/CVE-2019-18683.html
>
> Use-after-free exploits usually employ a heap spraying technique.
> Generally, it aims to put controlled bytes at a predetermined memory
> location on the heap.
>
> Heap spraying for exploiting use-after-free in the Linux kernel relies on
> the fact that, on kmalloc(), the slab allocator returns the address of
> memory that was recently freed. So allocating a kernel object of the
> same size and with controlled contents allows overwriting the vulnerable
> freed object.
>
> I've found an easy way to break the heap spraying needed for use-after-free
> exploitation. I extracted the slab freelist quarantine from the KASAN
> functionality and called it CONFIG_SLAB_QUARANTINE. Please see patch 1/6.
>
> If this feature is enabled, freed allocations are stored in the quarantine
> queue, where they wait for actual freeing. So they can't be instantly
> reallocated and overwritten by use-after-free exploits.
>
> N.B. Heap spraying for out-of-bounds exploitation is a different technique;
> the heap quarantine doesn't break it.
>
>
> Security properties
> ===================
>
> For researching the security properties of the heap quarantine I developed
> two lkdtm tests (see patch 5/6).
>
> The first test is called lkdtm_HEAP_SPRAY. It allocates and frees an object
> from a separate kmem_cache and then allocates 400000 similar objects.
> I.e. this test performs the original heap spraying technique for
> use-after-free exploitation.
>
> If CONFIG_SLAB_QUARANTINE is disabled, the freed object is instantly
> reallocated and overwritten:
> # echo HEAP_SPRAY > /sys/kernel/debug/provoke-crash/DIRECT
> lkdtm: Performing direct entry HEAP_SPRAY
> lkdtm: Allocated and freed spray_cache object 000000002b5b3ad4 of size 333
> lkdtm: Original heap spraying: allocate 400000 objects of size 333...
> lkdtm: FAIL: attempt 0: freed object is reallocated
>
> If CONFIG_SLAB_QUARANTINE is enabled, 400000 new allocations don't overwrite
> the freed object:
> # echo HEAP_SPRAY > /sys/kernel/debug/provoke-crash/DIRECT
> lkdtm: Performing direct entry HEAP_SPRAY
> lkdtm: Allocated and freed spray_cache object 000000009909e777 of size 333
> lkdtm: Original heap spraying: allocate 400000 objects of size 333...
> lkdtm: OK: original heap spraying hasn't succeed
>
> That happens because pushing an object through the quarantine requires _both_
> allocating and freeing memory. Objects are released from the quarantine on
> new memory allocations, but only when the quarantine size is over the limit.
> And the quarantine size grows on new memory freeing.
>
> That's why I created the second test called lkdtm_PUSH_THROUGH_QUARANTINE.
> It allocates and frees an object from a separate kmem_cache and then performs
> kmem_cache_alloc()+kmem_cache_free() for that cache 400000 times.
> This test effectively pushes the object through the heap quarantine and
> reallocates it after it returns to the allocator freelist:
> # echo PUSH_THROUGH_QUARANTINE > /sys/kernel/debug/provoke-crash/
> lkdtm: Performing direct entry PUSH_THROUGH_QUARANTINE
> lkdtm: Allocated and freed spray_cache object 000000008fdb15c3 of size 333
> lkdtm: Push through quarantine: allocate and free 400000 objects of size 333...
> lkdtm: Target object is reallocated at attempt 182994
> # echo PUSH_THROUGH_QUARANTINE > /sys/kernel/debug/provoke-crash/
> lkdtm: Performing direct entry PUSH_THROUGH_QUARANTINE
> lkdtm: Allocated and freed spray_cache object 000000004e223cbe of size 333
> lkdtm: Push through quarantine: allocate and free 400000 objects of size 333...
> lkdtm: Target object is reallocated at attempt 186830
> # echo PUSH_THROUGH_QUARANTINE > /sys/kernel/debug/provoke-crash/
> lkdtm: Performing direct entry PUSH_THROUGH_QUARANTINE
> lkdtm: Allocated and freed spray_cache object 000000007663a058 of size 333
> lkdtm: Push through quarantine: allocate and free 400000 objects of size 333...
> lkdtm: Target object is reallocated at attempt 182010
>
> As you can see, the number of allocations needed for overwriting the
> vulnerable object is almost the same each time. That would be good for stable
> use-after-free exploitation and should not be allowed.
> That's why I developed the quarantine randomization (see patch 4/6).
>
> This randomization required only small, hackish changes to the heap quarantine
> mechanism. At first, all quarantine batches are filled with objects. Then,
> during quarantine reduction, I randomly choose and free half of the objects
> from a randomly chosen batch.
> Now the randomized quarantine releases the freed object
> at an unpredictable moment:
> lkdtm: Target object is reallocated at attempt 107884
> lkdtm: Target object is reallocated at attempt 265641
> lkdtm: Target object is reallocated at attempt 100030
> lkdtm: Target object is NOT reallocated in 400000 attempts
> lkdtm: Target object is reallocated at attempt 204731
> lkdtm: Target object is reallocated at attempt 359333
> lkdtm: Target object is reallocated at attempt 289349
> lkdtm: Target object is reallocated at attempt 119893
> lkdtm: Target object is reallocated at attempt 225202
> lkdtm: Target object is reallocated at attempt 87343
>
> However, this randomization alone would not disturb the attacker, because
> the quarantine stores the attacker's data (the payload) in the sprayed objects.
> I.e. the reallocated and overwritten vulnerable object contains the payload
> until the next reallocation (very bad).
>
> Hence heap objects should be erased before going to the heap quarantine.
> Moreover, filling them with zeros gives a chance to detect use-after-free
> accesses to non-zero data while an object stays in the quarantine (nice!).
> That functionality already exists in the kernel; it's called init_on_free.
> I integrated it with CONFIG_SLAB_QUARANTINE in patch 3/6.
>
> During that work I found a bug: in CONFIG_SLAB, init_on_free happens too
> late, and heap objects go to the KASAN quarantine while still dirty. See the
> fix in patch 2/6.
>
> For a deeper understanding of the heap quarantine's inner workings, I attach
> patch 6/6, which contains verbose debugging (not for merge).
> It's very helpful; see the output example:
> quarantine: PUT 508992 to tail batch 123, whole sz 65118872, batch sz 508854
> quarantine: whole sz exceed max by 494552, REDUCE head batch 0 by 415392, leave 396304
> quarantine: data level in batches:
> 0 - 77%
> 1 - 108%
> 2 - 83%
> 3 - 21%
> ...
> 125 - 75%
> 126 - 12%
> 127 - 108%
> quarantine: whole sz exceed max by 79160, REDUCE head batch 12 by 14160, leave 17608
> quarantine: whole sz exceed max by 65000, REDUCE head batch 75 by 218328, leave 195232
> quarantine: PUT 508992 to tail batch 124, whole sz 64979984, batch sz 508854
> ...
>
>
> Changes in v2
> =============
>
> - Added heap quarantine randomization (the patch 4/6).
>
> - Integrated CONFIG_SLAB_QUARANTINE with init_on_free (the patch 3/6).
>
> - Fixed late init_on_free in CONFIG_SLAB (the patch 2/6).
>
> - Added the lkdtm_PUSH_THROUGH_QUARANTINE test.
>
> - Added quarantine verbose debugging (the patch 6/6, not for merge).
>
> - Improved the descriptions according to the feedback from Kees Cook
>   and Matthew Wilcox.
>
> - Made fixes recommended by Kees Cook:
>
>   * Avoided BUG_ON() in kasan_cache_create() by handling the error and
>     reporting with WARN_ON().
>
>   * Created a separate kmem_cache for the new lkdtm tests.
>
>   * Fixed the kasan_track.pid type to pid_t.
>
>
> TODO for the next prototypes
> ============================
>
> 1. Performance evaluation and optimization.
>    I would really appreciate your ideas about performance testing of a
>    kernel with the heap quarantine. The first prototype was tested with
>    hackbench and kernel build timing (which showed very different numbers).
>    Earlier, the developers tested the init_on_free functionality similarly.
>    However, Brad Spengler says on his Twitter that such a testing method
>    is poor.

I've run various tests on real hardware and in virtual machines:
1) network throughput test using iperf
   server: iperf -s -f K
   client: iperf -c 127.0.0.1 -t 60 -f K
2) scheduler stress test
   hackbench -s 4000 -l 500 -g 15 -f 25 -P
3) building the defconfig kernel
   time make -j2

I compared Linux kernel 5.9.0-rc6 with:
 - init_on_free=off,
 - init_on_free=on,
 - CONFIG_SLAB_QUARANTINE=y (which enables init_on_free).

Each test was performed 5 times. I will show the mean values.
If you are interested, I can share all the results and calculate the standard
deviation.

Real hardware, Intel Core i7-6500U CPU
1) Network throughput test with iperf
   init_on_free=off:       5467152.2 KBytes/sec
   init_on_free=on:        3937545 KBytes/sec (-28.0% vs init_on_free=off)
   CONFIG_SLAB_QUARANTINE: 3858848.6 KBytes/sec (-2.0% vs init_on_free=on)
2) Scheduler stress test with hackbench
   init_on_free=off:       8.5364s
   init_on_free=on:        8.9858s (+5.3% vs init_on_free=off)
   CONFIG_SLAB_QUARANTINE: 17.2232s (+91.7% vs init_on_free=on)
3) Building the defconfig kernel
   init_on_free=off:       10m54.475s
   init_on_free=on:        11m5.745s (+1.7% vs init_on_free=off)
   CONFIG_SLAB_QUARANTINE: 11m13.291s (+1.1% vs init_on_free=on)

Virtual machine, QEMU/KVM
1) Network throughput test with iperf
   init_on_free=off:       3554237.4 KBytes/sec
   init_on_free=on:        2828887.4 KBytes/sec (-20.4% vs init_on_free=off)
   CONFIG_SLAB_QUARANTINE: 2587308.2 KBytes/sec (-8.5% vs init_on_free=on)
2) Scheduler stress test with hackbench
   init_on_free=off:       19.3602s
   init_on_free=on:        20.8854s (+7.9% vs init_on_free=off)
   CONFIG_SLAB_QUARANTINE: 30.0746s (+44.0% vs init_on_free=on)

We can see that the results of these tests are quite diverse. Your
interpretation of the results and ideas for other tests are welcome.

N.B. There was NO performance optimization made for this version of the heap
quarantine prototype. The main effort was put into researching its security
properties (I hope for your feedback). Performance optimization will be done
in further steps, if we see that my work is worth doing.

> 2. Complete separation of CONFIG_SLAB_QUARANTINE from KASAN (feedback
>    from Andrey Konovalov).
>
> 3. Adding a kernel boot parameter for enabling/disabling the heap quarantine
>    (feedback from Kees Cook).
>
> 4. Testing the heap quarantine in near-OOM situations (feedback from
>    Pavel Machek).
>
> 5. Does this work somehow help or disturb the integration of
>    Memory Tagging for the Linux kernel?
>
> 6.
> After rebasing the series onto v5.9.0-rc6, the CONFIG_SLAB kernel started to
> show warnings about a few slab caches that have no space for additional
> metadata. It needs more investigation. I believe it affects KASAN bug
> detection abilities as well. Warning example:
> WARNING: CPU: 0 PID: 0 at mm/kasan/slab_quarantine.c:38 kasan_cache_create+0x37/0x50
> Modules linked in:
> CPU: 0 PID: 0 Comm: swapper Not tainted 5.9.0-rc6+ #1
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-2.fc32 04/01/2014
> RIP: 0010:kasan_cache_create+0x37/0x50
> ...
> Call Trace:
>  __kmem_cache_create+0x74/0x250
>  create_boot_cache+0x6d/0x91
>  create_kmalloc_cache+0x57/0x93
>  new_kmalloc_cache+0x39/0x47
>  create_kmalloc_caches+0x33/0xd9
>  start_kernel+0x25b/0x532
>  secondary_startup_64+0xb6/0xc0
>
> Thanks in advance for your feedback.
> Best regards,
> Alexander
>
>
> Alexander Popov (6):
>   mm: Extract SLAB_QUARANTINE from KASAN
>   mm/slab: Perform init_on_free earlier
>   mm: Integrate SLAB_QUARANTINE with init_on_free
>   mm: Implement slab quarantine randomization
>   lkdtm: Add heap quarantine tests
>   mm: Add heap quarantine verbose debugging (not for merge)
>
>  drivers/misc/lkdtm/core.c  |   2 +
>  drivers/misc/lkdtm/heap.c  | 110 +++++++++++++++++++++++++++++++++++++
>  drivers/misc/lkdtm/lkdtm.h |   2 +
>  include/linux/kasan.h      | 107 ++++++++++++++++++++----------------
>  include/linux/slab_def.h   |   2 +-
>  include/linux/slub_def.h   |   2 +-
>  init/Kconfig               |  14 +++++
>  mm/Makefile                |   3 +-
>  mm/kasan/Makefile          |   2 +
>  mm/kasan/kasan.h           |  75 +++++++++++++------------
>  mm/kasan/quarantine.c      | 102 ++++++++++++++++++++++++++++----
>  mm/kasan/slab_quarantine.c | 106 +++++++++++++++++++++++++++++++++++
>  mm/page_alloc.c            |  22 ++++++++
>  mm/slab.c                  |   5 +-
>  mm/slub.c                  |   2 +-
>  15 files changed, 455 insertions(+), 101 deletions(-)
>  create mode 100644 mm/kasan/slab_quarantine.c
>
On Thu, Oct 1, 2020 at 9:43 PM Alexander Popov <alex.popov@linux.com> wrote: > On 29.09.2020 21:35, Alexander Popov wrote: > > This is the second version of the heap quarantine prototype for the Linux > > kernel. I performed a deeper evaluation of its security properties and > > developed new features like quarantine randomization and integration with > > init_on_free. That is fun! See below for more details. > > > > > > Rationale > > ========= > > > > Use-after-free vulnerabilities in the Linux kernel are very popular for > > exploitation. There are many examples, some of them: > > https://googleprojectzero.blogspot.com/2018/09/a-cache-invalidation-bug-in-linux.html I don't think your proposed mitigation would work with much reliability against this bug; the attacker has full control over the timing of the original use and the following use, so an attacker should be able to trigger the kmem_cache_free(), then spam enough new VMAs and delete them to flush out the quarantine, and then do heap spraying as normal, or something like that. Also, note that here, if the reallocation fails, the kernel still wouldn't crash because the dangling object is not accessed further if the address range stored in it doesn't match the fault address. So an attacker could potentially try multiple times, and if the object happens to be on the quarantine the first time, that wouldn't really be a showstopper, you'd just try again. > > https://googleprojectzero.blogspot.com/2019/11/bad-binder-android-in-wild-exploit.html?m=1 I think that here, again, the free() and the dangling pointer use were caused by separate syscalls, meaning the attacker had control over that timing? > > https://a13xp0p0v.github.io/2020/02/15/CVE-2019-18683.html Haven't looked at that one in detail. > > Use-after-free exploits usually employ heap spraying technique. > > Generally it aims to put controlled bytes at a predetermined memory > > location on the heap. Well, not necessarily "predetermined". 
Depending on the circumstances, you don't necessarily need to know which address you're writing to; and you might not even need to overwrite a specific object, but instead just have to overwrite one out of a bunch of objects, no matter which. > > Heap spraying for exploiting use-after-free in the Linux kernel relies on > > the fact that on kmalloc(), the slab allocator returns the address of > > the memory that was recently freed. Yeah; and that behavior is pretty critical for performance. The longer it's been since a newly allocated object was freed, the higher the chance that you'll end up having to go further down the memory cache hierarchy. > > So allocating a kernel object with > > the same size and controlled contents allows overwriting the vulnerable > > freed object. The vmacache exploit you linked to doesn't do that, it frees the object all the way back to the page allocator and then sprays 4MiB of memory from the page allocator. (Because VMAs use their own kmem_cache, and the kmem_cache wasn't merged with any interesting ones, and I saw no good way to exploit the bug by reallocating another VMA over the old VMA back then. Although of course that doesn't mean that there is no such way.) [...] > > Security properties > > =================== > > > > For researching security properties of the heap quarantine I developed 2 lkdtm > > tests (see the patch 5/6). > > > > The first test is called lkdtm_HEAP_SPRAY. It allocates and frees an object > > from a separate kmem_cache and then allocates 400000 similar objects. > > I.e. this test performs an original heap spraying technique for use-after-free > > exploitation. 
> > > > If CONFIG_SLAB_QUARANTINE is disabled, the freed object is instantly > > reallocated and overwritten: > > # echo HEAP_SPRAY > /sys/kernel/debug/provoke-crash/DIRECT > > lkdtm: Performing direct entry HEAP_SPRAY > > lkdtm: Allocated and freed spray_cache object 000000002b5b3ad4 of size 333 > > lkdtm: Original heap spraying: allocate 400000 objects of size 333... > > lkdtm: FAIL: attempt 0: freed object is reallocated > > > > If CONFIG_SLAB_QUARANTINE is enabled, 400000 new allocations don't overwrite > > the freed object: > > # echo HEAP_SPRAY > /sys/kernel/debug/provoke-crash/DIRECT > > lkdtm: Performing direct entry HEAP_SPRAY > > lkdtm: Allocated and freed spray_cache object 000000009909e777 of size 333 > > lkdtm: Original heap spraying: allocate 400000 objects of size 333... > > lkdtm: OK: original heap spraying hasn't succeed > > > > That happens because pushing an object through the quarantine requires _both_ > > allocating and freeing memory. Objects are released from the quarantine on > > new memory allocations, but only when the quarantine size is over the limit. > > And the quarantine size grows on new memory freeing. > > > > That's why I created the second test called lkdtm_PUSH_THROUGH_QUARANTINE. > > It allocates and frees an object from a separate kmem_cache and then performs > > kmem_cache_alloc()+kmem_cache_free() for that cache 400000 times. > > This test effectively pushes the object through the heap quarantine and > > reallocates it after it returns back to the allocator freelist: [...] > > As you can see, the number of the allocations that are needed for overwriting > > the vulnerable object is almost the same. That would be good for stable > > use-after-free exploitation and should not be allowed. > > That's why I developed the quarantine randomization (see the patch 4/6). > > > > This randomization required very small hackish changes of the heap quarantine > > mechanism. At first all quarantine batches are filled by objects. 
Then during > > the quarantine reducing I randomly choose and free 1/2 of objects from a > > randomly chosen batch. Now the randomized quarantine releases the freed object > > at an unpredictable moment: > > lkdtm: Target object is reallocated at attempt 107884 [...] > > lkdtm: Target object is reallocated at attempt 87343 Those numbers are fairly big. At that point you might not even fit into L3 cache anymore, right? You'd often be hitting DRAM for new allocations? And for many slabs, you might end using much more memory for the quarantine than for actual in-use allocations. It seems to me like, for this to stop attacks with a high probability, you'd have to reserve a huge chunk of kernel memory for the quarantines - even if the attacker doesn't know anything about the status of the quarantine (which isn't necessarily the case, depending on whether the attacker can abuse microarchitectural data leakage, or if the attacker can trigger a pure data read through the dangling pointer), they should still be able to win with a probability around quarantine_size/allocated_memory_size if they have a heap spraying primitive without strict limits. > > However, this randomization alone would not disturb the attacker, because > > the quarantine stores the attacker's data (the payload) in the sprayed objects. > > I.e. the reallocated and overwritten vulnerable object contains the payload > > until the next reallocation (very bad). > > > > Hence heap objects should be erased before going to the heap quarantine. > > Moreover, filling them by zeros gives a chance to detect use-after-free > > accesses to non-zero data while an object stays in the quarantine (nice!). > > That functionality already exists in the kernel, it's called init_on_free. > > I integrated it with CONFIG_SLAB_QUARANTINE in the patch 3/6. > > > > During that work I found a bug: in CONFIG_SLAB init_on_free happens too > > late, and heap objects go to the KASAN quarantine being dirty. 
See the fix > > in the patch 2/6. [...] > I've made various tests on real hardware and in virtual machines: > 1) network throughput test using iperf > server: iperf -s -f K > client: iperf -c 127.0.0.1 -t 60 -f K > 2) scheduler stress test > hackbench -s 4000 -l 500 -g 15 -f 25 -P > 3) building the defconfig kernel > time make -j2 > > I compared Linux kernel 5.9.0-rc6 with: > - init_on_free=off, > - init_on_free=on, > - CONFIG_SLAB_QUARANTINE=y (which enables init_on_free). > > Each test was performed 5 times. I will show the mean values. > If you are interested, I can share all the results and calculate standard deviation. > > Real hardware, Intel Core i7-6500U CPU > 1) Network throughput test with iperf > init_on_free=off: 5467152.2 KBytes/sec > init_on_free=on: 3937545 KBytes/sec (-28.0% vs init_on_free=off) > CONFIG_SLAB_QUARANTINE: 3858848.6 KBytes/sec (-2.0% vs init_on_free=on) > 2) Scheduler stress test with hackbench > init_on_free=off: 8.5364s > init_on_free=on: 8.9858s (+5.3% vs init_on_free=off) > CONFIG_SLAB_QUARANTINE: 17.2232s (+91.7% vs init_on_free=on) These numbers seem really high for a mitigation, especially if that performance hit does not really buy you deterministic protection against many bugs. [...] > N.B. There was NO performance optimization made for this version of the heap > quarantine prototype. The main effort was put into researching its security > properties (hope for your feedback). Performance optimization will be done in > further steps, if we see that my work is worth doing. But you are pretty much inherently limited in terms of performance by the effect the quarantine has on the data cache, right? It seems to me like, if you want to make UAF exploitation harder at the heap allocator layer, you could do somewhat more effective things with a probably much smaller performance budget. 
Things like preventing the reallocation of virtual kernel addresses with different types, such that an attacker can only replace a UAF object with another object of the same type. (That is not an idea I like very much either, but I would like it more than this proposal.) (E.g. some browsers implement things along those lines, I believe.)
On Tue, Oct 06, 2020 at 12:56:33AM +0200, Jann Horn wrote: > It seems to me like, if you want to make UAF exploitation harder at > the heap allocator layer, you could do somewhat more effective things > with a probably much smaller performance budget. Things like > preventing the reallocation of virtual kernel addresses with different > types, such that an attacker can only replace a UAF object with > another object of the same type. (That is not an idea I like very much > either, but I would like it more than this proposal.) (E.g. some > browsers implement things along those lines, I believe.) The slab allocator already has that functionality. We call it TYPESAFE_BY_RCU, but if forcing that on by default would enhance security by a measurable amount, it wouldn't be a terribly hard sell ...
On Tue, Oct 6, 2020 at 2:44 AM Matthew Wilcox <willy@infradead.org> wrote: > On Tue, Oct 06, 2020 at 12:56:33AM +0200, Jann Horn wrote: > > It seems to me like, if you want to make UAF exploitation harder at > > the heap allocator layer, you could do somewhat more effective things > > with a probably much smaller performance budget. Things like > > preventing the reallocation of virtual kernel addresses with different > > types, such that an attacker can only replace a UAF object with > > another object of the same type. (That is not an idea I like very much > > either, but I would like it more than this proposal.) (E.g. some > > browsers implement things along those lines, I believe.) > > The slab allocator already has that functionality. We call it > TYPESAFE_BY_RCU, but if forcing that on by default would enhance security > by a measurable amount, it wouldn't be a terribly hard sell ... TYPESAFE_BY_RCU just forces an RCU grace period before the reallocation; I'm thinking of something more drastic, like completely refusing to give back the memory, or using vmalloc for slabs where that's safe (reusing physical but not virtual addresses across types). And, to make it more effective, something like a compiler plugin to isolate kmalloc(sizeof(<type>)) allocations by type beyond just size classes.
On Tue, Oct 06, 2020 at 01:44:14AM +0100, Matthew Wilcox wrote: > On Tue, Oct 06, 2020 at 12:56:33AM +0200, Jann Horn wrote: > > It seems to me like, if you want to make UAF exploitation harder at > > the heap allocator layer, you could do somewhat more effective things > > with a probably much smaller performance budget. Things like > > preventing the reallocation of virtual kernel addresses with different > > types, such that an attacker can only replace a UAF object with > > another object of the same type. (That is not an idea I like very much > > either, but I would like it more than this proposal.) (E.g. some > > browsers implement things along those lines, I believe.) > > The slab allocator already has that functionality. We call it > TYPESAFE_BY_RCU, but if forcing that on by default would enhance security > by a measurable amount, it wouldn't be a terribly hard sell ... Isn't the "easy" version of this already controlled by slab_merge? (i.e. do not share same-sized/flagged kmem_caches between different caches) The large trouble are the kmalloc caches, which don't have types associated with them. Having implicit kmem caches based on the type being allocated there would need some pretty extensive plumbing, I think?
On Tue, Oct 6, 2020 at 4:09 AM Kees Cook <keescook@chromium.org> wrote: > On Tue, Oct 06, 2020 at 01:44:14AM +0100, Matthew Wilcox wrote: > > On Tue, Oct 06, 2020 at 12:56:33AM +0200, Jann Horn wrote: > > > It seems to me like, if you want to make UAF exploitation harder at > > > the heap allocator layer, you could do somewhat more effective things > > > with a probably much smaller performance budget. Things like > > > preventing the reallocation of virtual kernel addresses with different > > > types, such that an attacker can only replace a UAF object with > > > another object of the same type. (That is not an idea I like very much > > > either, but I would like it more than this proposal.) (E.g. some > > > browsers implement things along those lines, I believe.) > > > > The slab allocator already has that functionality. We call it > > TYPESAFE_BY_RCU, but if forcing that on by default would enhance security > > by a measurable amount, it wouldn't be a terribly hard sell ... > > Isn't the "easy" version of this already controlled by slab_merge? (i.e. > do not share same-sized/flagged kmem_caches between different caches) Yes, but slab_merge still normally frees slab pages to the page allocator. > The large trouble are the kmalloc caches, which don't have types > associated with them. Having implicit kmem caches based on the type > being allocated there would need some pretty extensive plumbing, I > think? Well, a bit of plumbing, at least. You'd need to teach the compiler frontend to grab type names from sizeof() and stuff that type information somewhere, e.g. by generating an extra function argument referring to the type, or something like that. Could be as simple as a reference to a bss section variable that encodes the type in the name, and the linker already has the logic to automatically deduplicate those across compilation units - that way, on the compiler side, a pure frontend plugin might do the job?
It will reuse the memory for other things when the whole slab is freed though. Not really realistic to change that without it being backed by virtual memory along with higher-level management of regions to avoid intense fragmentation and metadata waste. It would depend a lot on having much finer-grained slab caches, otherwise it's not going to be much of an alternative to a quarantine feature. Even then, a quarantine feature is still useful, but is less suitable for a mainstream feature due to performance cost. Even a small quarantine has a fairly high performance cost.
On Tue, 6 Oct 2020, Matthew Wilcox wrote:
> On Tue, Oct 06, 2020 at 12:56:33AM +0200, Jann Horn wrote:
> > It seems to me like, if you want to make UAF exploitation harder at
> > the heap allocator layer, you could do somewhat more effective things
> > with a probably much smaller performance budget. Things like
> > preventing the reallocation of virtual kernel addresses with different
> > types, such that an attacker can only replace a UAF object with
> > another object of the same type. (That is not an idea I like very much
> > either, but I would like it more than this proposal.) (E.g. some
> > browsers implement things along those lines, I believe.)
>
> The slab allocator already has that functionality. We call it
> TYPESAFE_BY_RCU, but if forcing that on by default would enhance security
> by a measurable amount, it wouldn't be a terribly hard sell ...

The TYPESAFE functionality switches a lot of debugging off, because it also
allows speculative accesses to an object after it was freed (required for RCU
safety, since the object may be freed within an RCU period in which it is
still accessed). I do not think you would like that.
On Mon, 5 Oct 2020, Kees Cook wrote: > > TYPESAFE_BY_RCU, but if forcing that on by default would enhance security > > by a measurable amount, it wouldn't be a terribly hard sell ... > > Isn't the "easy" version of this already controlled by slab_merge? (i.e. > do not share same-sized/flagged kmem_caches between different caches) Right. > The large trouble are the kmalloc caches, which don't have types > associated with them. Having implicit kmem caches based on the type > being allocated there would need some pretty extensive plumbing, I > think? Actually typifying those accesses may get rid of a lot of kmalloc allocations and could help to ease the management and control of objects. It may be a big task though given the ubiquity of kmalloc and the need to create a massive amount of new slab caches. This is going to reduce the cache hit rate significantly.
On 06.10.2020 01:56, Jann Horn wrote: > On Thu, Oct 1, 2020 at 9:43 PM Alexander Popov <alex.popov@linux.com> wrote: >> On 29.09.2020 21:35, Alexander Popov wrote: >>> This is the second version of the heap quarantine prototype for the Linux >>> kernel. I performed a deeper evaluation of its security properties and >>> developed new features like quarantine randomization and integration with >>> init_on_free. That is fun! See below for more details. >>> >>> >>> Rationale >>> ========= >>> >>> Use-after-free vulnerabilities in the Linux kernel are very popular for >>> exploitation. There are many examples, some of them: >>> https://googleprojectzero.blogspot.com/2018/09/a-cache-invalidation-bug-in-linux.html Hello Jann, thanks for your reply. > I don't think your proposed mitigation would work with much > reliability against this bug; the attacker has full control over the > timing of the original use and the following use, so an attacker > should be able to trigger the kmem_cache_free(), then spam enough new > VMAs and delete them to flush out the quarantine, and then do heap > spraying as normal, or something like that. The randomized quarantine will release the vulnerable object at an unpredictable moment (patch 4/6). So I think the control over the time of the use-after-free access doesn't help attackers, if they don't have an "infinite spray" -- unlimited ability to store controlled data in the kernelspace objects of the needed size without freeing them. "Unlimited", because the quarantine size is 1/32 of whole memory. "Without freeing", because freed objects are erased by init_on_free before going to randomized heap quarantine (patch 3/6). Would you agree? > Also, note that here, if the reallocation fails, the kernel still > wouldn't crash because the dangling object is not accessed further if > the address range stored in it doesn't match the fault address. 
So an > attacker could potentially try multiple times, and if the object > happens to be on the quarantine the first time, that wouldn't really > be a showstopper, you'd just try again. Freed objects are filled with zeros before going to the quarantine (patch 3/6). Would it cause a null pointer dereference on an unsuccessful try? >>> https://googleprojectzero.blogspot.com/2019/11/bad-binder-android-in-wild-exploit.html?m=1 > > I think that here, again, the free() and the dangling pointer use were > caused by separate syscalls, meaning the attacker had control over > that timing? As I wrote above, I think the attacker's control over this timing is required for a successful attack, but it is not enough for bypassing the randomized quarantine. >>> https://a13xp0p0v.github.io/2020/02/15/CVE-2019-18683.html > > Haven't looked at that one in detail. > >>> Use-after-free exploits usually employ heap spraying technique. >>> Generally it aims to put controlled bytes at a predetermined memory >>> location on the heap. > > Well, not necessarily "predetermined". Depending on the circumstances, > you don't necessarily need to know which address you're writing to; > and you might not even need to overwrite a specific object, but > instead just have to overwrite one out of a bunch of objects, no > matter which. Yes, of course, I didn't mean a "predetermined memory address". Maybe "definite memory location" is a better phrase for that. >>> Heap spraying for exploiting use-after-free in the Linux kernel relies on >>> the fact that on kmalloc(), the slab allocator returns the address of >>> the memory that was recently freed. > > Yeah; and that behavior is pretty critical for performance. The longer > it's been since a newly allocated object was freed, the higher the > chance that you'll end up having to go further down the memory cache > hierarchy. Yes. That behaviour is fast, but it is also very convenient for use-after-free exploitation... 
>>> So allocating a kernel object with >>> the same size and controlled contents allows overwriting the vulnerable >>> freed object. > > The vmacache exploit you linked to doesn't do that, it frees the > object all the way back to the page allocator and then sprays 4MiB of > memory from the page allocator. (Because VMAs use their own > kmem_cache, and the kmem_cache wasn't merged with any interesting > ones, and I saw no good way to exploit the bug by reallocating another > VMA over the old VMA back then. Although of course that doesn't mean > that there is no such way.) Sorry, my mistake. Exploit examples with heap spraying that fit my description: - CVE-2017-6074 https://www.openwall.com/lists/oss-security/2017/02/26/2 - CVE-2017-2636 https://a13xp0p0v.github.io/2017/03/24/CVE-2017-2636.html - CVE-2016-8655 https://seclists.org/oss-sec/2016/q4/607 - CVE-2017-15649 https://ssd-disclosure.com/ssd-advisory-linux-kernel-af_packet-use-after-free/ > [...] >>> Security properties >>> =================== >>> >>> For researching security properties of the heap quarantine I developed 2 lkdtm >>> tests (see the patch 5/6). >>> >>> The first test is called lkdtm_HEAP_SPRAY. It allocates and frees an object >>> from a separate kmem_cache and then allocates 400000 similar objects. >>> I.e. this test performs an original heap spraying technique for use-after-free >>> exploitation. >>> >>> If CONFIG_SLAB_QUARANTINE is disabled, the freed object is instantly >>> reallocated and overwritten: >>> # echo HEAP_SPRAY > /sys/kernel/debug/provoke-crash/DIRECT >>> lkdtm: Performing direct entry HEAP_SPRAY >>> lkdtm: Allocated and freed spray_cache object 000000002b5b3ad4 of size 333 >>> lkdtm: Original heap spraying: allocate 400000 objects of size 333... 
>>> lkdtm: FAIL: attempt 0: freed object is reallocated >>> >>> If CONFIG_SLAB_QUARANTINE is enabled, 400000 new allocations don't overwrite >>> the freed object: >>> # echo HEAP_SPRAY > /sys/kernel/debug/provoke-crash/DIRECT >>> lkdtm: Performing direct entry HEAP_SPRAY >>> lkdtm: Allocated and freed spray_cache object 000000009909e777 of size 333 >>> lkdtm: Original heap spraying: allocate 400000 objects of size 333... >>> lkdtm: OK: original heap spraying hasn't succeed >>> >>> That happens because pushing an object through the quarantine requires _both_ >>> allocating and freeing memory. Objects are released from the quarantine on >>> new memory allocations, but only when the quarantine size is over the limit. >>> And the quarantine size grows on new memory freeing. >>> >>> That's why I created the second test called lkdtm_PUSH_THROUGH_QUARANTINE. >>> It allocates and frees an object from a separate kmem_cache and then performs >>> kmem_cache_alloc()+kmem_cache_free() for that cache 400000 times. >>> This test effectively pushes the object through the heap quarantine and >>> reallocates it after it returns back to the allocator freelist: > [...] >>> As you can see, the number of the allocations that are needed for overwriting >>> the vulnerable object is almost the same. That would be good for stable >>> use-after-free exploitation and should not be allowed. >>> That's why I developed the quarantine randomization (see the patch 4/6). >>> >>> This randomization required very small hackish changes of the heap quarantine >>> mechanism. At first all quarantine batches are filled by objects. Then during >>> the quarantine reducing I randomly choose and free 1/2 of objects from a >>> randomly chosen batch. Now the randomized quarantine releases the freed object >>> at an unpredictable moment: >>> lkdtm: Target object is reallocated at attempt 107884 > [...] >>> lkdtm: Target object is reallocated at attempt 87343 > > Those numbers are fairly big. 
At that point you might not even fit > into L3 cache anymore, right? You'd often be hitting DRAM for new > allocations? And for many slabs, you might end up using much more memory > for the quarantine than for actual in-use allocations. Yes. The original quarantine size is (totalram_pages() << PAGE_SHIFT) / QUARANTINE_FRACTION where #define QUARANTINE_FRACTION 32 > It seems to me like, for this to stop attacks with a high probability, > you'd have to reserve a huge chunk of kernel memory for the > quarantines Yes, that's how it works now. > - even if the attacker doesn't know anything about the > status of the quarantine (which isn't necessarily the case, depending > on whether the attacker can abuse microarchitectural data leakage, or > if the attacker can trigger a pure data read through the dangling > pointer), they should still be able to win with a probability around > quarantine_size/allocated_memory_size if they have a heap spraying > primitive without strict limits. I'm not sure about this probability estimate. I will try to calculate it, taking the quarantine parameters into account. >>> However, this randomization alone would not disturb the attacker, because >>> the quarantine stores the attacker's data (the payload) in the sprayed objects. >>> I.e. the reallocated and overwritten vulnerable object contains the payload >>> until the next reallocation (very bad). >>> >>> Hence heap objects should be erased before going to the heap quarantine. >>> Moreover, filling them by zeros gives a chance to detect use-after-free >>> accesses to non-zero data while an object stays in the quarantine (nice!). >>> That functionality already exists in the kernel, it's called init_on_free. >>> I integrated it with CONFIG_SLAB_QUARANTINE in the patch 3/6. >>> >>> During that work I found a bug: in CONFIG_SLAB init_on_free happens too >>> late, and heap objects go to the KASAN quarantine being dirty. See the fix >>> in the patch 2/6. > [...] 
>> I've made various tests on real hardware and in virtual machines: >> 1) network throughput test using iperf >> server: iperf -s -f K >> client: iperf -c 127.0.0.1 -t 60 -f K >> 2) scheduler stress test >> hackbench -s 4000 -l 500 -g 15 -f 25 -P >> 3) building the defconfig kernel >> time make -j2 >> >> I compared Linux kernel 5.9.0-rc6 with: >> - init_on_free=off, >> - init_on_free=on, >> - CONFIG_SLAB_QUARANTINE=y (which enables init_on_free). >> >> Each test was performed 5 times. I will show the mean values. >> If you are interested, I can share all the results and calculate standard deviation. >> >> Real hardware, Intel Core i7-6500U CPU >> 1) Network throughput test with iperf >> init_on_free=off: 5467152.2 KBytes/sec >> init_on_free=on: 3937545 KBytes/sec (-28.0% vs init_on_free=off) >> CONFIG_SLAB_QUARANTINE: 3858848.6 KBytes/sec (-2.0% vs init_on_free=on) >> 2) Scheduler stress test with hackbench >> init_on_free=off: 8.5364s >> init_on_free=on: 8.9858s (+5.3% vs init_on_free=off) >> CONFIG_SLAB_QUARANTINE: 17.2232s (+91.7% vs init_on_free=on) > > These numbers seem really high for a mitigation, especially if that > performance hit does not really buy you deterministic protection > against many bugs. Right, I agree. It's a probabilistic protection, and the probability should be calculated. I'll work on that. > [...] >> N.B. There was NO performance optimization made for this version of the heap >> quarantine prototype. The main effort was put into researching its security >> properties (hope for your feedback). Performance optimization will be done in >> further steps, if we see that my work is worth doing. > > But you are pretty much inherently limited in terms of performance by > the effect the quarantine has on the data cache, right? Yes. However, the quarantine parameters can be adjusted. 
> It seems to me like, if you want to make UAF exploitation harder at > the heap allocator layer, you could do somewhat more effective things > with a probably much smaller performance budget. Things like > preventing the reallocation of virtual kernel addresses with different > types, such that an attacker can only replace a UAF object with > another object of the same type. (That is not an idea I like very much > either, but I would like it more than this proposal.) (E.g. some > browsers implement things along those lines, I believe.) That's interesting, thank you. Best regards, Alexander
On Tue, Oct 6, 2020 at 7:56 PM Alexander Popov <alex.popov@linux.com> wrote: > > On 06.10.2020 01:56, Jann Horn wrote: > > On Thu, Oct 1, 2020 at 9:43 PM Alexander Popov <alex.popov@linux.com> wrote: > >> On 29.09.2020 21:35, Alexander Popov wrote: > >>> This is the second version of the heap quarantine prototype for the Linux > >>> kernel. I performed a deeper evaluation of its security properties and > >>> developed new features like quarantine randomization and integration with > >>> init_on_free. That is fun! See below for more details. > >>> > >>> > >>> Rationale > >>> ========= > >>> > >>> Use-after-free vulnerabilities in the Linux kernel are very popular for > >>> exploitation. There are many examples, some of them: > >>> https://googleprojectzero.blogspot.com/2018/09/a-cache-invalidation-bug-in-linux.html > > Hello Jann, thanks for your reply. > > > I don't think your proposed mitigation would work with much > > reliability against this bug; the attacker has full control over the > > timing of the original use and the following use, so an attacker > > should be able to trigger the kmem_cache_free(), then spam enough new > > VMAs and delete them to flush out the quarantine, and then do heap > > spraying as normal, or something like that. > > The randomized quarantine will release the vulnerable object at an unpredictable > moment (patch 4/6). > > So I think the control over the time of the use-after-free access doesn't help > attackers, if they don't have an "infinite spray" -- unlimited ability to store > controlled data in the kernelspace objects of the needed size without freeing them. > > "Unlimited", because the quarantine size is 1/32 of whole memory. > "Without freeing", because freed objects are erased by init_on_free before going > to randomized heap quarantine (patch 3/6). > > Would you agree? But you have a single quarantine (per CPU) for all objects, right? 
So for a UAF on slab A, the attacker can just spam allocations and deallocations on slab B to almost deterministically flush everything in slab A back to the SLUB freelists? > > Also, note that here, if the reallocation fails, the kernel still > > wouldn't crash because the dangling object is not accessed further if > > the address range stored in it doesn't match the fault address. So an > > attacker could potentially try multiple times, and if the object > > happens to be on the quarantine the first time, that wouldn't really > > be a showstopper, you'd just try again. > > Freed objects are filled by zero before going to quarantine (patch 3/6). > Would it cause a null pointer dereference on unsuccessful try? Not as far as I can tell. [...] > >> N.B. There was NO performance optimization made for this version of the heap > >> quarantine prototype. The main effort was put into researching its security > >> properties (hope for your feedback). Performance optimization will be done in > >> further steps, if we see that my work is worth doing. > > > > But you are pretty much inherently limited in terms of performance by > > the effect the quarantine has on the data cache, right? > > Yes. > However, the quarantine parameters can be adjusted. > > > It seems to me like, if you want to make UAF exploitation harder at > > the heap allocator layer, you could do somewhat more effective things > > with a probably much smaller performance budget. Things like > > preventing the reallocation of virtual kernel addresses with different > > types, such that an attacker can only replace a UAF object with > > another object of the same type. (That is not an idea I like very much > > either, but I would like it more than this proposal.) (E.g. some > > browsers implement things along those lines, I believe.) > > That's interesting, thank you. 
Just as some more context of how I think about this: Preventing memory corruption, outside of stuff like core memory management code, isn't really all *that* hard. There are schemes out there for hardware that reliably protects the integrity of data pointers, and such things. And if people can do that in hardware, we can also emulate that, and we'll get the same protection in software. The hard part is making it reasonably fast. And if you are willing to accept the kind of performance impact that comes with gigantic quarantine queues, there might be more effective things to spend that performance on?
On 06.10.2020 21:37, Jann Horn wrote: > On Tue, Oct 6, 2020 at 7:56 PM Alexander Popov <alex.popov@linux.com> wrote: >> >> On 06.10.2020 01:56, Jann Horn wrote: >>> On Thu, Oct 1, 2020 at 9:43 PM Alexander Popov <alex.popov@linux.com> wrote: >>>> On 29.09.2020 21:35, Alexander Popov wrote: >>>>> This is the second version of the heap quarantine prototype for the Linux >>>>> kernel. I performed a deeper evaluation of its security properties and >>>>> developed new features like quarantine randomization and integration with >>>>> init_on_free. That is fun! See below for more details. >>>>> >>>>> >>>>> Rationale >>>>> ========= >>>>> >>>>> Use-after-free vulnerabilities in the Linux kernel are very popular for >>>>> exploitation. There are many examples, some of them: >>>>> https://googleprojectzero.blogspot.com/2018/09/a-cache-invalidation-bug-in-linux.html >> >> Hello Jann, thanks for your reply. >> >>> I don't think your proposed mitigation would work with much >>> reliability against this bug; the attacker has full control over the >>> timing of the original use and the following use, so an attacker >>> should be able to trigger the kmem_cache_free(), then spam enough new >>> VMAs and delete them to flush out the quarantine, and then do heap >>> spraying as normal, or something like that. >> >> The randomized quarantine will release the vulnerable object at an unpredictable >> moment (patch 4/6). >> >> So I think the control over the time of the use-after-free access doesn't help >> attackers, if they don't have an "infinite spray" -- unlimited ability to store >> controlled data in the kernelspace objects of the needed size without freeing them. >> >> "Unlimited", because the quarantine size is 1/32 of whole memory. >> "Without freeing", because freed objects are erased by init_on_free before going >> to randomized heap quarantine (patch 3/6). >> >> Would you agree? > > But you have a single quarantine (per CPU) for all objects, right? 
So > for a UAF on slab A, the attacker can just spam allocations and > deallocations on slab B to almost deterministically flush everything > in slab A back to the SLUB freelists? Aaaahh! Nice shot Jann, I see. Another slab cache can be used to flush the randomized quarantine, so eventually the vulnerable object returns to the allocator freelist of its own cache, and the original heap spraying technique can be used again. For now I think the idea of a global quarantine for all slab objects is dead. Thank you. Best regards, Alexander