Message ID | 20241206101110.1646108-1-kevin.brodsky@arm.com (mailing list archive) |
---|---|
Series | pkeys-based page table hardening |
On Fri, Dec 6, 2024 at 11:13 AM Kevin Brodsky <kevin.brodsky@arm.com> wrote:
> This is a proposal to leverage protection keys (pkeys) to harden
> critical kernel data, by making it mostly read-only. The series includes
> a simple framework called "kpkeys" to manipulate pkeys for in-kernel use,
> as well as a page table hardening feature based on that framework
> (kpkeys_hardened_pgtables). Both are implemented on arm64 as a proof of
> concept, but they are designed to be compatible with any architecture
> implementing pkeys.
>
> The proposed approach is a typical use of pkeys: the data to protect is
> mapped with a given pkey P, and the pkey register is initially configured
> to grant read-only access to P. Where the protected data needs to be
> written to, the pkey register is temporarily switched to grant write
> access to P on the current CPU.
>
> The key fact this approach relies on is that the target data is
> only written to via a limited and well-defined API. This makes it
> possible to explicitly switch the pkey register where needed, without
> introducing excessively invasive changes, and only for a small amount of
> trusted code.
>
> Page tables were chosen as they are a popular (and critical) target for
> attacks, but there are of course many others - this is only a starting
> point (see section "Further use-cases"). It has become more and more
> common for accesses to such target data to be mediated by a hypervisor
> in vendor kernels; the hope is that kpkeys can provide much of that
> protection in a simpler manner. No benchmarking has been performed at
> this stage, but the runtime overhead should also be lower (though likely
> not negligible).

Yeah, it isn't great that vendor kernels contain such invasive changes...

I guess one difference between this approach and a hypervisor-based approach is that a hypervisor that uses a second layer of page tables can also prevent access through aliasing mappings, while pkeys only prevent access through a specific mapping? (Like if an attacker managed to add a page that is mapped into userspace to a page allocator freelist, allocate this page as a page table, and use the userspace mapping to write into this page table. But I guess whether that is an issue depends on the threat model.)

> # kpkeys_hardened_pgtables
>
> The kpkeys_hardened_pgtables feature uses the interface above to make
> the (kernel and user) page tables read-only by default, enabling write
> access only in helpers such as set_pte(). One complication is that those
> helpers as well as page table allocators are used very early, before
> kpkeys become available. Enabling kpkeys_hardened_pgtables, if and when
> kpkeys become available, is therefore done as follows:
>
> 1. A static key is turned on. This enables a transition to
>    KPKEYS_LVL_PGTABLES in all helpers writing to page tables, and also
>    impacts page table allocators (see step 3).
>
> 2. All pages holding kernel page tables are set to KPKEYS_PKEY_PGTABLES.
>    This ensures they can only be written when running at
>    KPKEYS_LVL_PGTABLES.
>
> 3. Page table allocators set the returned pages to KPKEYS_PKEY_PGTABLES
>    (and the pkey is reset upon freeing). This ensures that all page
>    tables are mapped with that privileged pkey.
>
> # Threat model
>
> The proposed scheme aims at mitigating data-only attacks (e.g.
> use-after-free/cross-cache attacks). In other words, it is assumed that
> control flow is not corrupted, and that the attacker does not achieve
> arbitrary code execution. Nothing prevents the pkey register from being
> set to its most permissive state - the assumption is that the register
> is only modified on legitimate code paths.

Is the threat model that the attacker has already achieved full read/write access to unprotected kernel data and should be stopped from gaining write access to protected data? Or is the threat model that the attacker has achieved some limited corruption, and this series is intended to make it harder to either gain write access to protected data or achieve full read/write access to unprotected data?
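
For readers following along, the scheme in the quoted cover letter can be modelled in plain userspace C. This is only a sketch under stated assumptions: every `sim_*` name is hypothetical and is not the kpkeys API from the series, the "pkey register" is an ordinary array, and a rejected store stands in for the fault real hardware would raise. It walks through the three enabling steps and then shows the limit stated in the threat model, namely that code able to set the register to its permissive state can write protected pages directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SIM_ENTRIES 4

/* Hypothetical pkeys for this simulation (not the identifiers from the series). */
enum sim_pkey { SIM_PKEY_DEFAULT, SIM_PKEY_PGTABLES, SIM_NR_PKEYS };

/* Simulated pkey register of one CPU: true means writes are allowed for that pkey. */
static bool sim_pkey_writable[SIM_NR_PKEYS] = {
    [SIM_PKEY_DEFAULT]  = true,
    [SIM_PKEY_PGTABLES] = false,
};

/* Stand-in for the static key of step 1; off while "booting". */
static bool sim_hardening_enabled;

struct sim_page {
    enum sim_pkey pkey;
    uint64_t entries[SIM_ENTRIES];
};

/* Models the hardware check: the store is rejected unless the register
 * currently grants write access to the page's pkey. */
static bool sim_store(struct sim_page *page, unsigned int idx, uint64_t val)
{
    assert(idx < SIM_ENTRIES);
    if (!sim_pkey_writable[page->pkey])
        return false;               /* would fault on real hardware */
    page->entries[idx] = val;
    return true;
}

/* Trusted helper in the role of set_pte(): grants write access only for
 * the duration of the store, then restores the previous register state. */
static bool sim_set_entry(struct sim_page *page, unsigned int idx, uint64_t val)
{
    bool saved = sim_pkey_writable[SIM_PKEY_PGTABLES];
    bool ok;

    if (sim_hardening_enabled)
        sim_pkey_writable[SIM_PKEY_PGTABLES] = true;
    ok = sim_store(page, idx, val);
    sim_pkey_writable[SIM_PKEY_PGTABLES] = saved;
    return ok;
}

/* Allocator in the role of step 3: tags new page table pages once enabled. */
static void sim_pgtable_alloc(struct sim_page *page)
{
    page->pkey = sim_hardening_enabled ? SIM_PKEY_PGTABLES : SIM_PKEY_DEFAULT;
}

/* Returns 0 if the model behaves as described, a step number otherwise. */
static int sim_demo(void)
{
    struct sim_page early = { .pkey = SIM_PKEY_DEFAULT };
    struct sim_page late = { .pkey = SIM_PKEY_DEFAULT };

    /* Early boot: helpers and allocators already run, nothing is protected. */
    sim_pgtable_alloc(&early);
    if (!sim_set_entry(&early, 0, 0x1000))
        return 1;

    sim_hardening_enabled = true;       /* step 1: flip the "static key" */
    early.pkey = SIM_PKEY_PGTABLES;     /* step 2: retag existing tables */
    sim_pgtable_alloc(&late);           /* step 3: new tables are tagged */

    /* A stray write outside the helper is now rejected. */
    if (sim_store(&early, 1, 0xdead) || sim_store(&late, 0, 0xdead))
        return 2;
    /* The trusted helper still works. */
    if (!sim_set_entry(&late, 0, 0x2000) || late.entries[0] != 0x2000)
        return 3;

    /* Limit of the threat model: code that can set the register to its
     * permissive state bypasses the protection entirely. */
    sim_pkey_writable[SIM_PKEY_PGTABLES] = true;
    if (!sim_store(&late, 1, 0xbeef))
        return 4;
    sim_pkey_writable[SIM_PKEY_PGTABLES] = false;
    return 0;
}
```

Calling `sim_demo()` exercises all four cases. Saving and restoring the register in the helper, rather than unconditionally resetting it to read-only, is a choice made here so that nested calls would behave; the series may handle this differently.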
On 06/12/2024 20:14, Jann Horn wrote:
> On Fri, Dec 6, 2024 at 11:13 AM Kevin Brodsky <kevin.brodsky@arm.com> wrote:
>> [...]
>>
>> Page tables were chosen as they are a popular (and critical) target for
>> attacks, but there are of course many others - this is only a starting
>> point (see section "Further use-cases"). It has become more and more
>> common for accesses to such target data to be mediated by a hypervisor
>> in vendor kernels; the hope is that kpkeys can provide much of that
>> protection in a simpler manner. No benchmarking has been performed at
>> this stage, but the runtime overhead should also be lower (though likely
>> not negligible).
> Yeah, it isn't great that vendor kernels contain such invasive changes...
>
> I guess one difference between this approach and a hypervisor-based
> approach is that a hypervisor that uses a second layer of page tables
> can also prevent access through aliasing mappings, while pkeys only
> prevent access through a specific mapping? (Like if an attacker
> managed to add a page that is mapped into userspace to a page
> allocator freelist, allocate this page as a page table, and use the
> userspace mapping to write into this page table. But I guess whether
> that is an issue depends on the threat model.)

Yes, that's correct. If an attacker is able to modify page tables then kpkeys are easily defeated. (kpkeys_hardened_pgtables does mitigate precisely that, though.) On the topic of aliases, it's worth noting that this isn't an issue with page table pages (only the linear mapping is used), but if we wanted to assign a pkey to vmalloc areas we'd also have to amend the linear mapping.

>> [...]
>>
>> # Threat model
>>
>> The proposed scheme aims at mitigating data-only attacks (e.g.
>> use-after-free/cross-cache attacks). In other words, it is assumed that
>> control flow is not corrupted, and that the attacker does not achieve
>> arbitrary code execution. Nothing prevents the pkey register from being
>> set to its most permissive state - the assumption is that the register
>> is only modified on legitimate code paths.
> Is the threat model that the attacker has already achieved full
> read/write access to unprotected kernel data and should be stopped
> from gaining write access to protected data? Or is the threat model
> that the attacker has achieved some limited corruption, and this
> series is intended to make it harder to either gain write access to
> protected data or achieve full read/write access to unprotected data?

The assumption is that the attacker has acquired a write primitive that could potentially allow corrupting any kernel data. The objective is to make it harder to exploit that primitive by making critical data immune to it. Nothing stops the attacker from turning to another (unprotected) target, but this is no different from hypervisor-based protection - the hope is that removing the low-hanging fruit makes it too difficult to build a complete exploit chain.

- Kevin
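
The aliasing point discussed in this thread comes down to the pkey being a property of a mapping rather than of the underlying page. The sketch below is a userspace simulation under that assumption; all `sim_*` names are hypothetical, and a rejected store stands in for a hardware fault. It shows a write being refused through the protected mapping, succeeding through an alias that still carries the default pkey, and being refused again once the alias is amended as well.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical pkeys for this simulation (not the identifiers from the series). */
enum sim_pkey { SIM_PKEY_DEFAULT, SIM_PKEY_PGTABLES, SIM_NR_PKEYS };

/* Simulated pkey register: true means writes are allowed for that pkey. */
static const bool sim_pkey_writable[SIM_NR_PKEYS] = {
    [SIM_PKEY_DEFAULT]  = true,
    [SIM_PKEY_PGTABLES] = false,
};

/* One physical page, reduced to a single word. */
struct sim_frame {
    uint64_t data;
};

/* One virtual mapping of a frame; the pkey belongs to the mapping. */
struct sim_mapping {
    struct sim_frame *frame;
    enum sim_pkey pkey;
};

/* The check only ever sees the pkey of the mapping used for the access. */
static bool sim_store_via(const struct sim_mapping *m, uint64_t val)
{
    if (!sim_pkey_writable[m->pkey])
        return false;               /* would fault on real hardware */
    m->frame->data = val;
    return true;
}

/* Returns 0 if the model behaves as described, a step number otherwise. */
static int sim_alias_demo(void)
{
    struct sim_frame frame = { .data = 0 };
    struct sim_mapping protected_map = { &frame, SIM_PKEY_PGTABLES };
    struct sim_mapping alias = { &frame, SIM_PKEY_DEFAULT };

    /* Write through the protected mapping: rejected. */
    if (sim_store_via(&protected_map, 1))
        return 1;
    /* Same frame through the untagged alias: the protection is bypassed. */
    if (!sim_store_via(&alias, 2) || frame.data != 2)
        return 2;
    /* Amending the alias closes the hole. */
    alias.pkey = SIM_PKEY_PGTABLES;
    if (sim_store_via(&alias, 3) || frame.data != 2)
        return 3;
    return 0;
}
```

In this model a second layer of page tables would correspond to placing the check on `sim_frame` instead of `sim_mapping`, which is why it is unaffected by aliases.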