Message ID | 20220921020140.3240092-9-mhal@rbox.co (mailing list archive) |
---|---|
State | New, archived |
Series | KVM: x86: gfn_to_pfn_cache cleanups and a fix |
On Wed, Sep 21, 2022, Michal Luczaj wrote:
> There's a race between kvm_xen_set_evtchn_fast() and kvm_gpc_activate()
> resulting in a near-NULL pointer write.
>
> 1. Deactivate shinfo cache:
>
> kvm_xen_hvm_set_attr
> case KVM_XEN_ATTR_TYPE_SHARED_INFO
>  kvm_gpc_deactivate
>   kvm_gpc_unmap
>    gpc->valid = false
>    gpc->khva = NULL
>   gpc->active = false
>
> Result: active = false, valid = false
>
> 2. Cause cache refresh:
>
> kvm_arch_vm_ioctl
> case KVM_XEN_HVM_EVTCHN_SEND
>  kvm_xen_hvm_evtchn_send
>   kvm_xen_set_evtchn
>    kvm_xen_set_evtchn_fast
>     kvm_gpc_check
>      return -EWOULDBLOCK because !gpc->valid
>    kvm_xen_set_evtchn_fast
>     return -EWOULDBLOCK
>   kvm_gpc_refresh
>    hva_to_pfn_retry
>     gpc->valid = true
>     gpc->khva = not NULL
>
> Result: active = false, valid = true

This is the real bug.  KVM should not successfully refresh an inactive cache.
It's not just the potential for a NULL pointer deref: the cache also isn't on
the list of active caches, i.e. won't get mmu_notifier events, and so KVM could
end up with a use-after-free of userspace memory.

KVM_XEN_HVM_EVTCHN_SEND does check that the per-vCPU cache is active, but does
so outside of gpc->lock.

Minus your race condition analysis, which I'll insert into the changelog
(assuming this works), I believe the proper fix is to check "active" during
check and refresh.  Oof, and there are ordering bugs too.  Compile-tested
patch below.

If this fixes things on your end (I'll properly test tomorrow too), I'll post a
v2 of the entire series.  There are some cleanups that can be done on top, e.g.
I think we should drop kvm_gpc_unmap() entirely until there's actually a user,
because it's not at all obvious that it's (a) necessary and (b) has desirable
behavior.

Note, the below patch applies after patch 1 of this series.  I don't know if
anyone will actually want to backport the fix, but it's not too hard to keep
the backport dependency to just patch 1.

--
From: Sean Christopherson <seanjc@google.com>
Date: Mon, 10 Oct 2022 13:06:13 -0700
Subject: [PATCH] KVM: Reject attempts to consume or refresh inactive
 gfn_to_pfn_cache

Reject kvm_gpc_check() and kvm_gpc_refresh() if the cache is inactive.  Not
checking the active flag during refresh is particularly egregious, as KVM can
end up with a valid, inactive cache, which can lead to a variety of
use-after-free bugs, e.g. consuming a NULL kernel pointer or missing an
mmu_notifier invalidation due to the cache not being on the list of gfns to
invalidate.

Note, "active" needs to be set if and only if the cache is on the list of
caches, i.e. is reachable via mmu_notifier events.  If a relevant mmu_notifier
event occurs while the cache is "active" but not on the list, KVM will not
acquire the cache's lock and so will not serialize the mmu_notifier event with
active users and/or kvm_gpc_refresh().

A race between KVM_XEN_ATTR_TYPE_SHARED_INFO and KVM_XEN_HVM_EVTCHN_SEND can
be exploited to trigger the bug.

<will insert your awesome race analysis>

Reported-by: Michal Luczaj <mhal@rbox.co>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 virt/kvm/pfncache.c | 36 ++++++++++++++++++++++++++++++------
 1 file changed, 30 insertions(+), 6 deletions(-)

diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
index b32ed4a7c900..dfc72aa88d71 100644
--- a/virt/kvm/pfncache.c
+++ b/virt/kvm/pfncache.c
@@ -81,6 +81,9 @@ bool kvm_gfn_to_pfn_cache_check(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
 {
 	struct kvm_memslots *slots = kvm_memslots(kvm);
 
+	if (!gpc->active)
+		return false;
+
 	if ((gpa & ~PAGE_MASK) + len > PAGE_SIZE)
 		return false;
 
@@ -240,8 +243,9 @@ int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
 {
 	struct kvm_memslots *slots = kvm_memslots(kvm);
 	unsigned long page_offset = gpa & ~PAGE_MASK;
-	kvm_pfn_t old_pfn, new_pfn;
+	bool unmap_old = false;
 	unsigned long old_uhva;
+	kvm_pfn_t old_pfn;
 	void *old_khva;
 	int ret = 0;
 
@@ -261,6 +265,9 @@ int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
 
 	write_lock_irq(&gpc->lock);
 
+	if (!gpc->active)
+		goto out_unlock;
+
 	old_pfn = gpc->pfn;
 	old_khva = gpc->khva - offset_in_page(gpc->khva);
 	old_uhva = gpc->uhva;
@@ -305,14 +312,15 @@ int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
 		gpc->khva = NULL;
 	}
 
-	/* Snapshot the new pfn before dropping the lock! */
-	new_pfn = gpc->pfn;
+	/* Detect a pfn change before dropping the lock! */
+	unmap_old = (old_pfn != gpc->pfn);
 
+out_unlock:
 	write_unlock_irq(&gpc->lock);
 
 	mutex_unlock(&gpc->refresh_lock);
 
-	if (old_pfn != new_pfn)
+	if (unmap_old)
 		gpc_unmap_khva(kvm, old_pfn, old_khva);
 
 	return ret;
@@ -368,11 +376,19 @@ int kvm_gpc_activate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
 		gpc->vcpu = vcpu;
 		gpc->usage = usage;
 		gpc->valid = false;
-		gpc->active = true;
 
 		spin_lock(&kvm->gpc_lock);
 		list_add(&gpc->list, &kvm->gpc_list);
 		spin_unlock(&kvm->gpc_lock);
+
+		/*
+		 * Activate the cache after adding it to the list, a concurrent
+		 * refresh must not establish a mapping until the cache is
+		 * reachable by mmu_notifier events.
+		 */
+		write_lock_irq(&gpc->lock);
+		gpc->active = true;
+		write_unlock_irq(&gpc->lock);
 	}
 	return kvm_gfn_to_pfn_cache_refresh(kvm, gpc, gpa, len);
 }
@@ -381,12 +397,20 @@ EXPORT_SYMBOL_GPL(kvm_gpc_activate);
 void kvm_gpc_deactivate(struct kvm *kvm, struct gfn_to_pfn_cache *gpc)
 {
 	if (gpc->active) {
+		/*
+		 * Deactivate the cache before removing it from the list, KVM
+		 * must stall mmu_notifier events until all users go away, i.e.
+		 * until gpc->lock is dropped and refresh is guaranteed to fail.
+		 */
+		write_lock_irq(&gpc->lock);
+		gpc->active = false;
+		write_unlock_irq(&gpc->lock);
+
 		spin_lock(&kvm->gpc_lock);
 		list_del(&gpc->list);
 		spin_unlock(&kvm->gpc_lock);
 
 		kvm_gfn_to_pfn_cache_unmap(kvm, gpc);
-		gpc->active = false;
 	}
 }
 EXPORT_SYMBOL_GPL(kvm_gpc_deactivate);

base-commit: 09e5b3d617d28e3011253370f827151cc6cba6ad
--
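For readers not steeped in the pfncache internals, the reason the list/"active"
ordering matters is that the mmu_notifier side only ever walks kvm->gpc_list.
Below is a rough, simplified sketch of that walk, paraphrasing
gfn_to_pfn_cache_invalidate_start() in virt/kvm/pfncache.c; it is not the
literal upstream code, and vCPU kicking, may_block handling, etc. are elided.

/*
 * Simplified paraphrase of the invalidation walk in virt/kvm/pfncache.c.
 * Only caches that are on kvm->gpc_list are ever visited here.
 */
void gfn_to_pfn_cache_invalidate_start(struct kvm *kvm, unsigned long start,
				       unsigned long end, bool may_block)
{
	struct gfn_to_pfn_cache *gpc;

	spin_lock(&kvm->gpc_lock);
	list_for_each_entry(gpc, &kvm->gpc_list, list) {
		write_lock_irq(&gpc->lock);

		/* Zap only caches whose userspace mapping overlaps the range. */
		if (gpc->valid && gpc->uhva >= start && gpc->uhva < end)
			gpc->valid = false;

		write_unlock_irq(&gpc->lock);
	}
	spin_unlock(&kvm->gpc_lock);
}

A cache that is mapped and marked valid but not yet on the list is invisible to
this walk, which is exactly the use-after-free window the patch above closes by
refusing to refresh until the cache is both listed and active.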
On Mon, Oct 10, 2022, Sean Christopherson wrote:
> On Wed, Sep 21, 2022, Michal Luczaj wrote:
>
> If this fixes things on your end (I'll properly test tomorrow too), I'll post a
> v2 of the entire series.  There are some cleanups that can be done on top, e.g.
> I think we should drop kvm_gpc_unmap() entirely until there's actually a user,
> because it's not at all obvious that it's (a) necessary and (b) has desirable
> behavior.

Sorry for the delay, I initially missed that you included a selftest for the
race in the original RFC.  The kernel is no longer exploding, but the test is
intermittently soft hanging waiting for the "IRQ".  I'll debug and hopefully
post tomorrow.
On Thu, Oct 13, 2022, Sean Christopherson wrote:
> On Mon, Oct 10, 2022, Sean Christopherson wrote:
> > On Wed, Sep 21, 2022, Michal Luczaj wrote:
> > If this fixes things on your end (I'll properly test tomorrow too), I'll post a
> > v2 of the entire series.  There are some cleanups that can be done on top, e.g.
> > I think we should drop kvm_gpc_unmap() entirely until there's actually a user,
> > because it's not at all obvious that it's (a) necessary and (b) has desirable
> > behavior.
>
> Sorry for the delay, I initially missed that you included a selftest for the race
> in the original RFC.  The kernel is no longer exploding, but the test is intermittently
> soft hanging waiting for the "IRQ".  I'll debug and hopefully post tomorrow.

Ended up being a test bug (technically).  KVM drops the timer IRQ if the shared
info page is invalid.  As annoying as that is, there isn't really a better
option, and invalidating a shared page while vCPUs are running really is a VMM
bug.

To fix, I added an intermediate stage in the test that re-arms the timer if the
IRQ doesn't arrive in a reasonable amount of time.

Patches incoming...
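The shape of that workaround, as a rough sketch only: the guest spins on the
existing guest_saw_irq flag for a bounded number of iterations, and if the IRQ
never shows up it syncs back to the host so the host can re-arm the timer.
TIMER_REARM_STAGE, the helper name, and the iteration bound below are made-up
placeholders, not the actual v2 selftest code; guest_saw_irq and GUEST_SYNC()
are the test's existing primitives.

/* Illustrative sketch of a "wait, then ask the host to re-arm" loop. */
#define TIMER_REARM_STAGE	28	/* hypothetical stage number */

static void guest_wait_for_timer_irq(void)
{
	unsigned long i;

	while (!guest_saw_irq) {
		for (i = 0; i < 100000000ul && !guest_saw_irq; i++)
			__asm__ __volatile__("pause");

		/* No IRQ in a "reasonable amount of time"; have the host re-arm. */
		if (!guest_saw_irq)
			GUEST_SYNC(TIMER_REARM_STAGE);
	}
	guest_saw_irq = false;
}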
On 10/13/22 22:28, Sean Christopherson wrote:
> On Thu, Oct 13, 2022, Sean Christopherson wrote:
>> On Mon, Oct 10, 2022, Sean Christopherson wrote:
>>> On Wed, Sep 21, 2022, Michal Luczaj wrote:
>>> If this fixes things on your end (I'll properly test tomorrow too), I'll post a
>>> v2 of the entire series.  There are some cleanups that can be done on top, e.g.
>>> I think we should drop kvm_gpc_unmap() entirely until there's actually a user,
>>> because it's not at all obvious that it's (a) necessary and (b) has desirable
>>> behavior.
>>
>> Sorry for the delay, I initially missed that you included a selftest for the race
>> in the original RFC.  The kernel is no longer exploding, but the test is intermittently
>> soft hanging waiting for the "IRQ".  I'll debug and hopefully post tomorrow.
>
> Ended up being a test bug (technically).  KVM drops the timer IRQ if the shared
> info page is invalid.  As annoying as that is, there isn't really a better
> option, and invalidating a shared page while vCPUs are running really is a VMM
> bug.
>
> To fix, I added an intermediate stage in the test that re-arms the timer if the
> IRQ doesn't arrive in a reasonable amount of time.
>
> Patches incoming...

Sorry for the late reply, I was away.
Thank you for the whole v2 series.  And I'm glad you've found my testcase
useful, even if a bit buggy ;)

Speaking about SCHEDOP_poll, are XEN vmcalls considered trusted?  I've noticed
that kvm_xen_schedop_poll() fetches guest-provided sched_poll.ports without
checking if the values are sane.  Then, in wait_pending_event(), there's
test_bit(ports[i], pending_bits) which (for some high ports[i] values) results
in KASAN complaining about "use-after-free":

[   36.463417] ==================================================================
[   36.463564] BUG: KASAN: use-after-free in kvm_xen_hypercall+0xf39/0x1110 [kvm]
[   36.463962] Read of size 8 at addr ffff88815b87b800 by task xen_shinfo_test/956
[   36.464149] CPU: 1 PID: 956 Comm: xen_shinfo_test Not tainted 6.1.0-rc1-kasan+ #49
[   36.464252] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.0-3-3 04/01/2014
[   36.464259] Call Trace:
[   36.464259]  <TASK>
[   36.464259]  dump_stack_lvl+0x5b/0x73
[   36.464259]  print_report+0x17f/0x477
[   36.464259]  ? __virt_addr_valid+0xd5/0x150
[   36.464259]  ? kvm_xen_hypercall+0xf39/0x1110 [kvm]
[   36.464259]  ? kvm_xen_hypercall+0xf39/0x1110 [kvm]
[   36.464259]  kasan_report+0xbb/0xf0
[   36.464259]  ? kvm_xen_hypercall+0xf39/0x1110 [kvm]
[   36.464259]  kasan_check_range+0x136/0x1b0
[   36.464259]  kvm_xen_hypercall+0xf39/0x1110 [kvm]
[   36.464259]  ? kvm_xen_set_evtchn.part.0+0x190/0x190 [kvm]
[   36.464259]  ? get_kvmclock+0x86/0x360 [kvm]
[   36.464259]  ? pvclock_clocksource_read+0x13a/0x190
[   36.464259]  kvm_emulate_hypercall+0x1d7/0x860 [kvm]
[   36.464259]  ? get_kvmclock+0x151/0x360 [kvm]
[   36.464259]  ? kvm_fast_pio+0x260/0x260 [kvm]
[   36.464259]  ? kvm_post_set_cr4+0xf0/0xf0 [kvm]
[   36.464259]  ? lock_release+0x9c/0x430
[   36.464259]  ? rcu_qs+0x2b/0xb0
[   36.464259]  ? rcu_note_context_switch+0x18e/0x9b0
[   36.464259]  ? rcu_read_lock_sched_held+0x10/0x70
[   36.464259]  ? lock_acquire+0xb1/0x3d0
[   36.464259]  ? vmx_vmexit+0x6c/0x19d [kvm_intel]
[   36.464259]  ? vmx_vmexit+0x8d/0x19d [kvm_intel]
[   36.464259]  ? rcu_read_lock_sched_held+0x10/0x70
[   36.464259]  ? lock_acquire+0xb1/0x3d0
[   36.464259]  ? lock_downgrade+0x380/0x380
[   36.464259]  ? vmx_vcpu_run+0x5bf/0x1260 [kvm_intel]
[   36.464259]  vmx_handle_exit+0x295/0xa50 [kvm_intel]
[   36.464259]  vcpu_enter_guest.constprop.0+0x1436/0x1ed0 [kvm]
[   36.464259]  ? kvm_check_and_inject_events+0x800/0x800 [kvm]
[   36.464259]  ? lock_downgrade+0x380/0x380
[   36.464259]  ? __blkcg_punt_bio_submit+0xd0/0xd0
[   36.464259]  ? kvm_arch_vcpu_ioctl_run+0xa46/0xf70 [kvm]
[   36.464259]  ? unlock_page_memcg+0x1e0/0x1e0
[   36.464259]  ? __local_bh_enable_ip+0x8f/0x100
[   36.464259]  ? trace_hardirqs_on+0x2d/0xf0
[   36.464259]  ? fpu_swap_kvm_fpstate+0xbd/0x1c0
[   36.464259]  ? kvm_arch_vcpu_ioctl_run+0x418/0xf70 [kvm]
[   36.464259]  kvm_arch_vcpu_ioctl_run+0x418/0xf70 [kvm]
[   36.464259]  kvm_vcpu_ioctl+0x332/0x8f0 [kvm]
[   36.464259]  ? kvm_clear_dirty_log_protect+0x430/0x430 [kvm]
[   36.464259]  ? do_vfs_ioctl+0x951/0xbf0
[   36.464259]  ? vfs_fileattr_set+0x480/0x480
[   36.464259]  ? kernel_write+0x360/0x360
[   36.464259]  ? selinux_inode_getsecctx+0x50/0x50
[   36.464259]  ? ioctl_has_perm.constprop.0.isra.0+0x133/0x200
[   36.464259]  ? selinux_inode_getsecctx+0x50/0x50
[   36.464259]  ? ksys_write+0xc4/0x140
[   36.464259]  ? __ia32_sys_read+0x40/0x40
[   36.464259]  ? lockdep_hardirqs_on_prepare+0xe/0x220
[   36.464259]  __x64_sys_ioctl+0xb8/0xf0
[   36.464259]  do_syscall_64+0x55/0x80
[   36.464259]  ? lockdep_hardirqs_on_prepare+0xe/0x220
[   36.464259]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
[   36.464259] RIP: 0033:0x7f81e303ec6b
[   36.464259] Code: 73 01 c3 48 8b 0d b5 b1 1b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 85 b1 1b 00 f7 d8 64 89 01 48
[   36.464259] RSP: 002b:00007fff72009028 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[   36.464259] RAX: ffffffffffffffda RBX: 00007f81e33936c0 RCX: 00007f81e303ec6b
[   36.464259] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000007
[   36.464259] RBP: 00000000008457b0 R08: 0000000000000000 R09: 00000000004029f6
[   36.464259] R10: 00007f81e31b838b R11: 0000000000000246 R12: 00007f81e33a4000
[   36.464259] R13: 00007f81e33a2000 R14: 0000000000000000 R15: 00007f81e33a3020
[   36.464259]  </TASK>
[   36.464259] The buggy address belongs to the physical page:
[   36.464259] page:00000000d9b176a3 refcount:0 mapcount:-128 mapping:0000000000000000 index:0x1 pfn:0x15b87b
[   36.464259] flags: 0x17ffffc0000000(node=0|zone=2|lastcpupid=0x1fffff)
[   36.464259] raw: 0017ffffc0000000 ffffea0005504e48 ffffea00056cf688 0000000000000000
[   36.464259] raw: 0000000000000001 0000000000000000 00000000ffffff7f 0000000000000000
[   36.464259] page dumped because: kasan: bad access detected
[   36.464259] Memory state around the buggy address:
[   36.464259]  ffff88815b87b700: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[   36.464259]  ffff88815b87b780: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[   36.464259] >ffff88815b87b800: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[   36.464259]                    ^
[   36.464259]  ffff88815b87b880: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[   36.464259]  ffff88815b87b900: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[   36.464259] ==================================================================

I can't reproduce it under non-KASAN build, I'm not sure what's going on.
Anyway, here's the testcase, applies after your selftests patches in
https://lore.kernel.org/kvm/20221013211234.1318131-1-seanjc@google.com/

---
 .../selftests/kvm/x86_64/xen_shinfo_test.c | 38 +++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c b/tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c
index 2a5727188c8d..402e3d7b86b0 100644
--- a/tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c
+++ b/tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c
@@ -375,6 +375,29 @@ static void guest_code(void)
 	guest_saw_irq = false;
 
 	GUEST_SYNC(24);
+	/* Terminate racer thread */
+
+	GUEST_SYNC(25);
+	/* Test SCHEDOP_poll out-of-bounds read */
+
+	p = (struct sched_poll) {
+		.ports = ports,
+		.nr_ports = 1,
+		.timeout = 1
+	};
+
+	ports[0] = (PAGE_SIZE*2) << 3;
+	for (i = 0; i < 0x1000; i++) {
+		asm volatile("vmcall"
+			     : "=a" (rax)
+			     : "a" (__HYPERVISOR_sched_op),
+			       "D" (SCHEDOP_poll),
+			       "S" (&p)
+			     : "memory");
+		ports[0] += PAGE_SIZE << 3;
+	}
+
+	GUEST_SYNC(26);
 }
 
 static int cmp_timespec(struct timespec *a, struct timespec *b)
@@ -925,6 +948,21 @@ int main(int argc, char *argv[])
 
 			ret = pthread_join(thread, 0);
 			TEST_ASSERT(ret == 0, "pthread_join() failed: %s", strerror(ret));
+
+			/* shinfo juggling done; reset to a valid GFN. */
+			struct kvm_xen_hvm_attr ha = {
+				.type = KVM_XEN_ATTR_TYPE_SHARED_INFO,
+				.u.shared_info.gfn = SHINFO_REGION_GPA / PAGE_SIZE,
+			};
+			vm_ioctl(vm, KVM_XEN_HVM_SET_ATTR, &ha);
+			break;
+
+		case 25:
+			if (verbose)
+				printf("Testing SCHEDOP_poll out-of-bounds read\n");
+			break;
+
+		case 26:
 			goto done;
 
 		case 0x20:
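For readers wondering why those port values are out of bounds: a SCHEDOP_poll
port is a bit index into shinfo->evtchn_pending, so test_bit() reads the byte
at offset port/8 from the start of the bitmap.  The standalone program below is
just an illustration of the test's arithmetic (assuming the usual 4 KiB page),
not part of the patch.

#include <stdio.h>

#define PAGE_SIZE 4096ul

int main(void)
{
	/* First probed port, as chosen by the test above. */
	unsigned long port = (PAGE_SIZE * 2) << 3;

	/* test_bit(port, pending_bits) touches the byte at offset port / 8. */
	printf("port %lu -> byte offset %lu, well past the %lu-byte shinfo page\n",
	       port, port / 8, PAGE_SIZE);

	/* Each loop iteration pushes the probe one further page past shinfo. */
	printf("stride per iteration: %lu bits = %lu bytes\n",
	       PAGE_SIZE << 3, PAGE_SIZE);
	return 0;
}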
On Thu, Oct 20, 2022, Michal Luczaj wrote:
> On 10/13/22 22:28, Sean Christopherson wrote:
> > On Thu, Oct 13, 2022, Sean Christopherson wrote:
> >> On Mon, Oct 10, 2022, Sean Christopherson wrote:
> >>> On Wed, Sep 21, 2022, Michal Luczaj wrote:
> >>> If this fixes things on your end (I'll properly test tomorrow too), I'll post a
> >>> v2 of the entire series.  There are some cleanups that can be done on top, e.g.
> >>> I think we should drop kvm_gpc_unmap() entirely until there's actually a user,
> >>> because it's not at all obvious that it's (a) necessary and (b) has desirable
> >>> behavior.
> >>
> >> Sorry for the delay, I initially missed that you included a selftest for the race
> >> in the original RFC.  The kernel is no longer exploding, but the test is intermittently
> >> soft hanging waiting for the "IRQ".  I'll debug and hopefully post tomorrow.
> >
> > Ended up being a test bug (technically).  KVM drops the timer IRQ if the shared
> > info page is invalid.  As annoying as that is, there isn't really a better
> > option, and invalidating a shared page while vCPUs are running really is a VMM
> > bug.
> >
> > To fix, I added an intermediate stage in the test that re-arms the timer if the
> > IRQ doesn't arrive in a reasonable amount of time.
> >
> > Patches incoming...
>
> Sorry for the late reply, I was away.
> Thank you for the whole v2 series.  And I'm glad you've found my testcase
> useful, even if a bit buggy ;)
>
> Speaking about SCHEDOP_poll, are XEN vmcalls considered trusted?

I highly doubt they are trusted.

> I've noticed that kvm_xen_schedop_poll() fetches guest-provided
> sched_poll.ports without checking if the values are sane.  Then, in
> wait_pending_event(), there's test_bit(ports[i], pending_bits) which
> (for some high ports[i] values) results in KASAN complaining about
> "use-after-free":
>
> [   36.463417] ==================================================================
> [   36.463564] BUG: KASAN: use-after-free in kvm_xen_hypercall+0xf39/0x1110 [kvm]

...

> I can't reproduce it under non-KASAN build, I'm not sure what's going on.

KASAN is rightly complaining because, as you already pointed out, the high ports[i]
value will touch memory well beyond the shinfo->evtchn_pending array.  Non-KASAN
builds don't have visible failures because the rogue access is only a read, and
the result of the test_bit() only affects whether or not KVM temporarily stalls
the vCPU.  In other words, KVM is leaking host state to the guest, but there is
no memory corruption and no functional impact on the guest.

I think this would be the way to fix this particular mess?
diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
index 93c628d3e3a9..5d09a47db732 100644
--- a/arch/x86/kvm/xen.c
+++ b/arch/x86/kvm/xen.c
@@ -961,7 +961,9 @@ static bool wait_pending_event(struct kvm_vcpu *vcpu, int nr_ports,
 	struct kvm *kvm = vcpu->kvm;
 	struct gfn_to_pfn_cache *gpc = &kvm->arch.xen.shinfo_cache;
 	unsigned long *pending_bits;
+	unsigned long nr_bits;
 	unsigned long flags;
+	evtchn_port_t port;
 	bool ret = true;
 	int idx, i;
 
@@ -974,13 +976,19 @@ static bool wait_pending_event(struct kvm_vcpu *vcpu, int nr_ports,
 	if (IS_ENABLED(CONFIG_64BIT) && kvm->arch.xen.long_mode) {
 		struct shared_info *shinfo = gpc->khva;
 		pending_bits = (unsigned long *)&shinfo->evtchn_pending;
+		nr_bits = sizeof(shinfo->evtchn_pending) * BITS_PER_BYTE;
 	} else {
 		struct compat_shared_info *shinfo = gpc->khva;
 		pending_bits = (unsigned long *)&shinfo->evtchn_pending;
+		nr_bits = sizeof(shinfo->evtchn_pending) * BITS_PER_BYTE;
 	}
 
 	for (i = 0; i < nr_ports; i++) {
-		if (test_bit(ports[i], pending_bits)) {
+		port = ports[i];
+		if (port >= nr_bits)
+			continue;
+
+		if (test_bit(array_index_nospec(port, nr_bits), pending_bits)) {
 			ret = true;
 			break;
 		}
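To put numbers on nr_bits (my reading of the Xen ABI layouts, not something
stated in the patch): the 64-bit shared_info carries sizeof(unsigned long) * 8
unsigned-long entries of evtchn_pending and the compat layout carries
sizeof(unsigned int) * 8 u32 entries, so the clamp ends up allowing ports
0..4095 and 0..1023 respectively.  A standalone sketch of the same computation:

#include <stdio.h>

/*
 * Trimmed-down mirrors of the two shared_info layouts; the array sizes
 * follow the Xen ABI convention and are my assumption here, not copied
 * from the patch.
 */
struct shinfo_64 {
	unsigned long evtchn_pending[sizeof(unsigned long) * 8];
};

struct shinfo_compat {
	unsigned int evtchn_pending[sizeof(unsigned int) * 8];
};

#define BITS_PER_BYTE 8

int main(void)
{
	struct shinfo_64 s64;
	struct shinfo_compat s32;

	/* Same formula the fix uses: sizeof(the array) * BITS_PER_BYTE. */
	printf("64-bit guest: nr_bits = %zu\n",
	       sizeof(s64.evtchn_pending) * BITS_PER_BYTE);
	printf("compat guest: nr_bits = %zu\n",
	       sizeof(s32.evtchn_pending) * BITS_PER_BYTE);
	return 0;
}

Anything at or above nr_bits is simply skipped, and in-bounds indices go
through array_index_nospec() so a mispredicted bounds check cannot be used to
read past the bitmap speculatively either.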
On 10/20/22 18:58, Sean Christopherson wrote:
> On Thu, Oct 20, 2022, Michal Luczaj wrote:
>> Speaking about SCHEDOP_poll, are XEN vmcalls considered trusted?
>
> I highly doubt they are trusted.

Does it mean a CPL3 guest can vmcall SCHEDOP_poll?  If so, is the use of
kvm_mmu_gva_to_gpa_system() justified?

>> I've noticed that kvm_xen_schedop_poll() fetches guest-provided
>> sched_poll.ports without checking if the values are sane.  Then, in
>> wait_pending_event(), there's test_bit(ports[i], pending_bits) which
>> (for some high ports[i] values) results in KASAN complaining about
>> "use-after-free":
>>
>> [   36.463417] ==================================================================
>> [   36.463564] BUG: KASAN: use-after-free in kvm_xen_hypercall+0xf39/0x1110 [kvm]
>
> ...
>
>> I can't reproduce it under non-KASAN build, I'm not sure what's going on.
>
> KASAN is rightly complaining because, as you already pointed out, the high ports[i]
> value will touch memory well beyond the shinfo->evtchn_pending array.  Non-KASAN
> builds don't have visible failures because the rogue access is only a read, and
> the result of the test_bit() only affects whether or not KVM temporarily stalls
> the vCPU.  In other words, KVM is leaking host state to the guest, but there is
> no memory corruption and no functional impact on the guest.

OK, so such vCPU stall-or-not is a side channel leaking host memory bit by bit,
right?  I'm trying to understand what is actually being leaked here.  Is it the
user space memory of the process that is using KVM on the host?

> I think this would be the way to fix this particular mess?
>
> diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
> index 93c628d3e3a9..5d09a47db732 100644
> --- a/arch/x86/kvm/xen.c
> +++ b/arch/x86/kvm/xen.c
> @@ -961,7 +961,9 @@ static bool wait_pending_event(struct kvm_vcpu *vcpu, int nr_ports,
>  	struct kvm *kvm = vcpu->kvm;
>  	struct gfn_to_pfn_cache *gpc = &kvm->arch.xen.shinfo_cache;
>  	unsigned long *pending_bits;
> +	unsigned long nr_bits;
>  	unsigned long flags;
> +	evtchn_port_t port;
>  	bool ret = true;
>  	int idx, i;
>
> @@ -974,13 +976,19 @@ static bool wait_pending_event(struct kvm_vcpu *vcpu, int nr_ports,
>  	if (IS_ENABLED(CONFIG_64BIT) && kvm->arch.xen.long_mode) {
>  		struct shared_info *shinfo = gpc->khva;
>  		pending_bits = (unsigned long *)&shinfo->evtchn_pending;
> +		nr_bits = sizeof(shinfo->evtchn_pending) * BITS_PER_BYTE;
>  	} else {
>  		struct compat_shared_info *shinfo = gpc->khva;
>  		pending_bits = (unsigned long *)&shinfo->evtchn_pending;
> +		nr_bits = sizeof(shinfo->evtchn_pending) * BITS_PER_BYTE;
>  	}
>
>  	for (i = 0; i < nr_ports; i++) {
> -		if (test_bit(ports[i], pending_bits)) {
> +		port = ports[i];
> +		if (port >= nr_bits)
> +			continue;
> +
> +		if (test_bit(array_index_nospec(port, nr_bits), pending_bits)) {
>  			ret = true;
>  			break;
>  		}

Great, that looks good and passes the test.

Thanks,
Michal
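To make the "stall-or-not" oracle concrete: each probe answers "is the bit at
shinfo + port/8 set?" for whatever host memory happens to lie past the shinfo
mapping, observable purely through how long the hypercall takes.  The guest-side
sketch below is illustrative only, reusing the sched_poll/vmcall pattern from
the testcase above; rdtsc(), POLL_TIMEOUT and STALL_THRESHOLD are placeholders,
and this is not a claim about a practical exploit.

/*
 * Illustrative sketch: infer one out-of-bounds bit from whether
 * SCHEDOP_poll returns immediately (bit was 1) or stalls until the
 * timeout (bit was 0).
 */
static bool probe_bit(uint32_t port)
{
	struct sched_poll p = {
		.ports = &port,
		.nr_ports = 1,
		.timeout = POLL_TIMEOUT,	/* placeholder */
	};
	unsigned long rax;
	uint64_t t0, t1;

	t0 = rdtsc();
	asm volatile("vmcall"
		     : "=a" (rax)
		     : "a" (__HYPERVISOR_sched_op),
		       "D" (SCHEDOP_poll),
		       "S" (&p)
		     : "memory");
	t1 = rdtsc();

	/* A quick return means wait_pending_event() saw the bit set. */
	return (t1 - t0) < STALL_THRESHOLD;
}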
diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
index 45b9b96c0ea3..e987669c3506 100644
--- a/virt/kvm/pfncache.c
+++ b/virt/kvm/pfncache.c
@@ -364,11 +364,13 @@ EXPORT_SYMBOL_GPL(kvm_gpc_init);
 int kvm_gpc_activate(struct gfn_to_pfn_cache *gpc, gpa_t gpa)
 {
 	if (!gpc->active) {
+		write_lock_irq(&gpc->lock);
 		gpc->khva = NULL;
 		gpc->pfn = KVM_PFN_ERR_FAULT;
 		gpc->uhva = KVM_HVA_ERR_BAD;
 		gpc->valid = false;
 		gpc->active = true;
+		write_unlock_irq(&gpc->lock);
 
 		spin_lock(&gpc->kvm->gpc_lock);
 		list_add(&gpc->list, &gpc->kvm->gpc_list);
There's a race between kvm_xen_set_evtchn_fast() and kvm_gpc_activate()
resulting in a near-NULL pointer write.

1. Deactivate shinfo cache:

kvm_xen_hvm_set_attr
case KVM_XEN_ATTR_TYPE_SHARED_INFO
 kvm_gpc_deactivate
  kvm_gpc_unmap
   gpc->valid = false
   gpc->khva = NULL
  gpc->active = false

Result: active = false, valid = false

2. Cause cache refresh:

kvm_arch_vm_ioctl
case KVM_XEN_HVM_EVTCHN_SEND
 kvm_xen_hvm_evtchn_send
  kvm_xen_set_evtchn
   kvm_xen_set_evtchn_fast
    kvm_gpc_check
     return -EWOULDBLOCK because !gpc->valid
   kvm_xen_set_evtchn_fast
    return -EWOULDBLOCK
  kvm_gpc_refresh
   hva_to_pfn_retry
    gpc->valid = true
    gpc->khva = not NULL

Result: active = false, valid = true

3. Race ioctl KVM_XEN_HVM_EVTCHN_SEND against ioctl
   KVM_XEN_ATTR_TYPE_SHARED_INFO:

kvm_arch_vm_ioctl
case KVM_XEN_HVM_EVTCHN_SEND
 kvm_xen_hvm_evtchn_send
  kvm_xen_set_evtchn
   kvm_xen_set_evtchn_fast
    read_lock gpc->lock
                                        kvm_xen_hvm_set_attr
                                        case KVM_XEN_ATTR_TYPE_SHARED_INFO
                                         mutex_lock kvm->lock
                                         kvm_xen_shared_info_init
                                          kvm_gpc_activate
                                           gpc->khva = NULL
    kvm_gpc_check
    [ Check passes because gpc->valid is
      still true, even though gpc->khva
      is already NULL. ]
    shinfo = gpc->khva
    pending_bits = shinfo->evtchn_pending
    CRASH: test_and_set_bit(..., pending_bits)

Protect the writes to the cache's properties in kvm_gpc_activate() by taking
gpc->lock for write.

Signed-off-by: Michal Luczaj <mhal@rbox.co>
---
Attaching more details:

[   86.127703] BUG: kernel NULL pointer dereference, address: 0000000000000800
[   86.127751] #PF: supervisor write access in kernel mode
[   86.127778] #PF: error_code(0x0002) - not-present page
[   86.127801] PGD 105792067 P4D 105792067 PUD 105793067 PMD 0
[   86.127826] Oops: 0002 [#1] PREEMPT SMP NOPTI
[   86.127850] CPU: 0 PID: 945 Comm: xen_shinfo_test Not tainted 6.0.0-rc5-test+ #31
[   86.127874] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.0-3-3 04/01/2014
[   86.127898] RIP: 0010:kvm_xen_set_evtchn_fast (./arch/x86/include/asm/bitops.h:138 ./include/asm-generic/bitops/instrumented-atomic.h:72 arch/x86/kvm/xen.c:1370) kvm
[   86.127960] Code: 5a 84 c0 0f 84 01 01 00 00 80 bb b4 9f 00 00 00 8b 45 00 48 8b 93 c8 a0 00 00 75 4d 41 89 c0 48 8d b2 80 08 00 00 41 c1 e8 05 <f0> 48 0f ab 82 00 08 00 00 0f 82 19 01 00 00 8b 45 00 48 0f a3 06
All code
========
   0:	5a                   	pop    %rdx
   1:	84 c0                	test   %al,%al
   3:	0f 84 01 01 00 00    	je     0x10a
   9:	80 bb b4 9f 00 00 00 	cmpb   $0x0,0x9fb4(%rbx)
  10:	8b 45 00             	mov    0x0(%rbp),%eax
  13:	48 8b 93 c8 a0 00 00 	mov    0xa0c8(%rbx),%rdx
  1a:	75 4d                	jne    0x69
  1c:	41 89 c0             	mov    %eax,%r8d
  1f:	48 8d b2 80 08 00 00 	lea    0x880(%rdx),%rsi
  26:	41 c1 e8 05          	shr    $0x5,%r8d
  2a:*	f0 48 0f ab 82 00 08 	lock bts %rax,0x800(%rdx)	<-- trapping instruction
  31:	00 00
  33:	0f 82 19 01 00 00    	jb     0x152
  39:	8b 45 00             	mov    0x0(%rbp),%eax
  3c:	48 0f a3 06          	bt     %rax,(%rsi)

Code starting with the faulting instruction
===========================================
   0:	f0 48 0f ab 82 00 08 	lock bts %rax,0x800(%rdx)
   7:	00 00
   9:	0f 82 19 01 00 00    	jb     0x128
   f:	8b 45 00             	mov    0x0(%rbp),%eax
  12:	48 0f a3 06          	bt     %rax,(%rsi)
[   86.127982] RSP: 0018:ffffc90001367c50 EFLAGS: 00010046
[   86.128001] RAX: 0000000000000001 RBX: ffffc90001369000 RCX: 0000000000000001
[   86.128021] RDX: 0000000000000000 RSI: 0000000000000a00 RDI: ffffffff82886a66
[   86.128040] RBP: ffffc90001367ca8 R08: 0000000000000000 R09: 000000006cc00c97
[   86.128060] R10: ffff88810c150000 R11: 0000000076cc00c9 R12: 0000000000000001
[   86.128079] R13: ffffc90001372ff8 R14: ffff8881045c0000 R15: ffffc90001373830
[   86.128098] FS:  00007f71d6111740(0000) GS:ffff888237c00000(0000) knlGS:0000000000000000
[   86.128118] CS:
0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 86.128138] CR2: 0000000000000800 CR3: 0000000104774006 CR4: 0000000000772ef0 [ 86.128158] PKRU: 55555554 [ 86.128177] Call Trace: [ 86.128196] <TASK> [ 86.128215] kvm_xen_hvm_evtchn_send (arch/x86/kvm/xen.c:1432 arch/x86/kvm/xen.c:1562) kvm [ 86.128256] kvm_arch_vm_ioctl (arch/x86/kvm/x86.c:6883) kvm [ 86.128294] ? __lock_acquire (kernel/locking/lockdep.c:4553 kernel/locking/lockdep.c:5007) [ 86.128315] ? __lock_acquire (kernel/locking/lockdep.c:4553 kernel/locking/lockdep.c:5007) [ 86.128335] ? kvm_vm_ioctl (arch/x86/kvm/../../../virt/kvm/kvm_main.c:4814) kvm [ 86.128368] kvm_vm_ioctl (arch/x86/kvm/../../../virt/kvm/kvm_main.c:4814) kvm [ 86.128401] ? lock_is_held_type (kernel/locking/lockdep.c:466 kernel/locking/lockdep.c:5710) [ 86.128422] ? lock_release (kernel/locking/lockdep.c:466 kernel/locking/lockdep.c:5688) [ 86.128442] __x64_sys_ioctl (fs/ioctl.c:51 fs/ioctl.c:870 fs/ioctl.c:856 fs/ioctl.c:856) [ 86.128462] do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80) [ 86.128482] ? do_syscall_64 (arch/x86/entry/common.c:87) [ 86.128501] ? do_syscall_64 (arch/x86/entry/common.c:87) [ 86.128520] ? lockdep_hardirqs_on (kernel/locking/lockdep.c:4383) [ 86.128539] ? do_syscall_64 (arch/x86/entry/common.c:87) [ 86.128558] ? do_syscall_64 (arch/x86/entry/common.c:87) [ 86.128577] ? do_syscall_64 (arch/x86/entry/common.c:87) [ 86.128596] ? do_syscall_64 (arch/x86/entry/common.c:87) [ 86.128615] ? do_syscall_64 (arch/x86/entry/common.c:87) [ 86.128634] ? do_syscall_64 (arch/x86/entry/common.c:87) [ 86.128653] ? lockdep_hardirqs_on (kernel/locking/lockdep.c:4383) [ 86.128673] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120) [ 86.128692] RIP: 0033:0x7f71d6152c6b [ 86.128712] Code: 73 01 c3 48 8b 0d b5 b1 1b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 85 b1 1b 00 f7 d8 64 89 01 48 All code ======== 0: 73 01 jae 0x3 2: c3 ret 3: 48 8b 0d b5 b1 1b 00 mov 0x1bb1b5(%rip),%rcx # 0x1bb1bf a: f7 d8 neg %eax c: 64 89 01 mov %eax,%fs:(%rcx) f: 48 83 c8 ff or $0xffffffffffffffff,%rax 13: c3 ret 14: 66 2e 0f 1f 84 00 00 cs nopw 0x0(%rax,%rax,1) 1b: 00 00 00 1e: 90 nop 1f: f3 0f 1e fa endbr64 23: b8 10 00 00 00 mov $0x10,%eax 28: 0f 05 syscall 2a:* 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax <-- trapping instruction 30: 73 01 jae 0x33 32: c3 ret 33: 48 8b 0d 85 b1 1b 00 mov 0x1bb185(%rip),%rcx # 0x1bb1bf 3a: f7 d8 neg %eax 3c: 64 89 01 mov %eax,%fs:(%rcx) 3f: 48 rex.W Code starting with the faulting instruction =========================================== 0: 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax 6: 73 01 jae 0x9 8: c3 ret 9: 48 8b 0d 85 b1 1b 00 mov 0x1bb185(%rip),%rcx # 0x1bb195 10: f7 d8 neg %eax 12: 64 89 01 mov %eax,%fs:(%rcx) 15: 48 rex.W [ 86.128735] RSP: 002b:00007fff7c716c58 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ 86.128755] RAX: ffffffffffffffda RBX: 00007f71d61116c0 RCX: 00007f71d6152c6b [ 86.128775] RDX: 00007fff7c716f00 RSI: 00000000400caed0 RDI: 0000000000000004 [ 86.128794] RBP: 0000000001f2b2a0 R08: 0000000000417343 R09: 00007f71d62cc341 [ 86.128814] R10: 00007fff7c74c258 R11: 0000000000000246 R12: 00007f71d6329000 [ 86.128833] R13: 00000000632870b9 R14: 0000000000000001 R15: 00007f71d632a020 [ 86.128854] </TASK> [ 86.128873] Modules linked in: kvm_intel 9p fscache netfs nft_objref nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 
nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink qrtr sunrpc intel_rapl_msr intel_rapl_common kvm irqbypass rapl pcspkr 9pnet_virtio i2c_piix4 9pnet drm zram ip_tables crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel virtio_console serio_raw ata_generic virtio_blk pata_acpi qemu_fw_cfg ipmi_devintf ipmi_msghandler fuse [ 86.128893] Unloaded tainted modules: kvm_intel():1 [last unloaded: kvm_intel] [ 86.128944] CR2: 0000000000000800 [ 86.128962] ---[ end trace 0000000000000000 ]--- [ 86.128982] RIP: 0010:kvm_xen_set_evtchn_fast (./arch/x86/include/asm/bitops.h:138 ./include/asm-generic/bitops/instrumented-atomic.h:72 arch/x86/kvm/xen.c:1370) kvm [ 86.129022] Code: 5a 84 c0 0f 84 01 01 00 00 80 bb b4 9f 00 00 00 8b 45 00 48 8b 93 c8 a0 00 00 75 4d 41 89 c0 48 8d b2 80 08 00 00 41 c1 e8 05 <f0> 48 0f ab 82 00 08 00 00 0f 82 19 01 00 00 8b 45 00 48 0f a3 06 All code ======== 0: 5a pop %rdx 1: 84 c0 test %al,%al 3: 0f 84 01 01 00 00 je 0x10a 9: 80 bb b4 9f 00 00 00 cmpb $0x0,0x9fb4(%rbx) 10: 8b 45 00 mov 0x0(%rbp),%eax 13: 48 8b 93 c8 a0 00 00 mov 0xa0c8(%rbx),%rdx 1a: 75 4d jne 0x69 1c: 41 89 c0 mov %eax,%r8d 1f: 48 8d b2 80 08 00 00 lea 0x880(%rdx),%rsi 26: 41 c1 e8 05 shr $0x5,%r8d 2a:* f0 48 0f ab 82 00 08 lock bts %rax,0x800(%rdx) <-- trapping instruction 31: 00 00 33: 0f 82 19 01 00 00 jb 0x152 39: 8b 45 00 mov 0x0(%rbp),%eax 3c: 48 0f a3 06 bt %rax,(%rsi) Code starting with the faulting instruction =========================================== 0: f0 48 0f ab 82 00 08 lock bts %rax,0x800(%rdx) 7: 00 00 9: 0f 82 19 01 00 00 jb 0x128 f: 8b 45 00 mov 0x0(%rbp),%eax 12: 48 0f a3 06 bt %rax,(%rsi) [ 86.129044] RSP: 0018:ffffc90001367c50 EFLAGS: 00010046 [ 86.129064] RAX: 0000000000000001 RBX: ffffc90001369000 RCX: 0000000000000001 [ 86.129083] RDX: 0000000000000000 RSI: 0000000000000a00 RDI: ffffffff82886a66 [ 86.129103] RBP: ffffc90001367ca8 R08: 0000000000000000 R09: 000000006cc00c97 [ 86.129122] R10: ffff88810c150000 R11: 0000000076cc00c9 R12: 0000000000000001 [ 86.129142] R13: ffffc90001372ff8 R14: ffff8881045c0000 R15: ffffc90001373830 [ 86.129161] FS: 00007f71d6111740(0000) GS:ffff888237c00000(0000) knlGS:0000000000000000 [ 86.129181] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 86.129201] CR2: 0000000000000800 CR3: 0000000104774006 CR4: 0000000000772ef0 [ 86.129221] PKRU: 55555554 [ 86.129240] note: xen_shinfo_test[945] exited with preempt_count 1 [ 151.131754] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks: [ 151.131793] rcu: 1-...0: (13 ticks this GP) idle=785c/1/0x4000000000000000 softirq=4758/4760 fqs=16038 [ 151.131816] (detected by 2, t=65002 jiffies, g=6449, q=1299 ncpus=4) [ 151.131837] Sending NMI from CPU 2 to CPUs 1: [ 151.131862] NMI backtrace for cpu 1 [ 151.131864] CPU: 1 PID: 949 Comm: xen_shinfo_test Tainted: G D 6.0.0-rc5-test+ #31 [ 151.131866] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.0-3-3 04/01/2014 [ 151.131866] RIP: 0010:queued_write_lock_slowpath (kernel/locking/qrwlock.c:85) [ 151.131870] Code: ff 90 48 89 df 5b 5d e9 86 fd ff ff f0 81 0b 00 01 00 00 ba ff 00 00 00 b9 00 01 00 00 8b 03 3d 00 01 00 00 74 0b f3 90 8b 03 <3d> 00 01 00 00 75 f5 89 c8 f0 0f b1 13 74 c0 eb e2 89 c6 48 89 ef All code ======== 0: ff 90 48 89 df 5b call *0x5bdf8948(%rax) 6: 5d pop %rbp 7: e9 86 fd ff ff jmp 0xfffffffffffffd92 c: f0 81 0b 00 01 00 00 lock orl $0x100,(%rbx) 13: ba ff 00 00 00 mov $0xff,%edx 18: b9 00 01 00 00 mov $0x100,%ecx 1d: 8b 03 mov 
(%rbx),%eax 1f: 3d 00 01 00 00 cmp $0x100,%eax 24: 74 0b je 0x31 26: f3 90 pause 28: 8b 03 mov (%rbx),%eax 2a:* 3d 00 01 00 00 cmp $0x100,%eax <-- trapping instruction 2f: 75 f5 jne 0x26 31: 89 c8 mov %ecx,%eax 33: f0 0f b1 13 lock cmpxchg %edx,(%rbx) 37: 74 c0 je 0xfffffffffffffff9 39: eb e2 jmp 0x1d 3b: 89 c6 mov %eax,%esi 3d: 48 89 ef mov %rbp,%rdi Code starting with the faulting instruction =========================================== 0: 3d 00 01 00 00 cmp $0x100,%eax 5: 75 f5 jne 0xfffffffffffffffc 7: 89 c8 mov %ecx,%eax 9: f0 0f b1 13 lock cmpxchg %edx,(%rbx) d: 74 c0 je 0xffffffffffffffcf f: eb e2 jmp 0xfffffffffffffff3 11: 89 c6 mov %eax,%esi 13: 48 89 ef mov %rbp,%rdi [ 151.131871] RSP: 0018:ffffc900011f7b98 EFLAGS: 00000006 [ 151.131872] RAX: 0000000000000300 RBX: ffffc90001372ff8 RCX: 0000000000000100 [ 151.131872] RDX: 00000000000000ff RSI: ffffffff827ea0a3 RDI: ffffffff82886a66 [ 151.131873] RBP: ffffc90001372ffc R08: ffff888107864160 R09: 00000000777cbcf6 [ 151.131873] R10: ffff888107863300 R11: 000000006777cbcf R12: ffffc90001372ff8 [ 151.131874] R13: ffffc90001369000 R14: ffffc90001372fb8 R15: ffffc90001369170 [ 151.131874] FS: 00007f71d5f09640(0000) GS:ffff888237c80000(0000) knlGS:0000000000000000 [ 151.131875] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 151.131876] CR2: 00007f71d5f08ef8 CR3: 0000000104774002 CR4: 0000000000772ee0 [ 151.131877] PKRU: 55555554 [ 151.131878] Call Trace: [ 151.131879] <TASK> [ 151.131880] do_raw_write_lock (./include/asm-generic/qrwlock.h:101 kernel/locking/spinlock_debug.c:210) [ 151.131883] kvm_gpc_refresh (arch/x86/kvm/../../../virt/kvm/pfncache.c:262) kvm [ 151.131907] ? kvm_gpc_activate (./include/linux/spinlock.h:390 arch/x86/kvm/../../../virt/kvm/pfncache.c:375) kvm [ 151.131925] kvm_xen_hvm_set_attr (arch/x86/kvm/xen.c:51 arch/x86/kvm/xen.c:464) kvm [ 151.131950] kvm_arch_vm_ioctl (arch/x86/kvm/x86.c:6883) kvm [ 151.131971] ? __lock_acquire (kernel/locking/lockdep.c:4553 kernel/locking/lockdep.c:5007) [ 151.131973] kvm_vm_ioctl (arch/x86/kvm/../../../virt/kvm/kvm_main.c:4814) kvm [ 151.131991] ? lock_is_held_type (kernel/locking/lockdep.c:466 kernel/locking/lockdep.c:5710) [ 151.131993] ? lock_release (kernel/locking/lockdep.c:466 kernel/locking/lockdep.c:5688) [ 151.131995] __x64_sys_ioctl (fs/ioctl.c:51 fs/ioctl.c:870 fs/ioctl.c:856 fs/ioctl.c:856) [ 151.131997] do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80) [ 151.131998] ? lock_is_held_type (kernel/locking/lockdep.c:466 kernel/locking/lockdep.c:5710) [ 151.132000] ? do_syscall_64 (arch/x86/entry/common.c:87) [ 151.132000] ? lockdep_hardirqs_on (kernel/locking/lockdep.c:4383) [ 151.132002] ? do_syscall_64 (arch/x86/entry/common.c:87) [ 151.132003] ? do_syscall_64 (arch/x86/entry/common.c:87) [ 151.132003] ? lockdep_hardirqs_on (kernel/locking/lockdep.c:4383) [ 151.132004] ? do_syscall_64 (arch/x86/entry/common.c:87) [ 151.132005] ? do_syscall_64 (arch/x86/entry/common.c:87) [ 151.132006] ? do_syscall_64 (arch/x86/entry/common.c:87) [ 151.132007] ? 
lockdep_hardirqs_on (kernel/locking/lockdep.c:4383) [ 151.132008] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120) [ 151.132009] RIP: 0033:0x7f71d6152c6b [ 151.132011] Code: 73 01 c3 48 8b 0d b5 b1 1b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 85 b1 1b 00 f7 d8 64 89 01 48 All code ======== 0: 73 01 jae 0x3 2: c3 ret 3: 48 8b 0d b5 b1 1b 00 mov 0x1bb1b5(%rip),%rcx # 0x1bb1bf a: f7 d8 neg %eax c: 64 89 01 mov %eax,%fs:(%rcx) f: 48 83 c8 ff or $0xffffffffffffffff,%rax 13: c3 ret 14: 66 2e 0f 1f 84 00 00 cs nopw 0x0(%rax,%rax,1) 1b: 00 00 00 1e: 90 nop 1f: f3 0f 1e fa endbr64 23: b8 10 00 00 00 mov $0x10,%eax 28: 0f 05 syscall 2a:* 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax <-- trapping instruction 30: 73 01 jae 0x33 32: c3 ret 33: 48 8b 0d 85 b1 1b 00 mov 0x1bb185(%rip),%rcx # 0x1bb1bf 3a: f7 d8 neg %eax 3c: 64 89 01 mov %eax,%fs:(%rcx) 3f: 48 rex.W Code starting with the faulting instruction =========================================== 0: 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax 6: 73 01 jae 0x9 8: c3 ret 9: 48 8b 0d 85 b1 1b 00 mov 0x1bb185(%rip),%rcx # 0x1bb195 10: f7 d8 neg %eax 12: 64 89 01 mov %eax,%fs:(%rcx) 15: 48 rex.W [ 151.132011] RSP: 002b:00007f71d5f08da8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ 151.132012] RAX: ffffffffffffffda RBX: 0000000001f2b2a0 RCX: 00007f71d6152c6b [ 151.132013] RDX: 00007f71d5f08db0 RSI: 000000004048aec9 RDI: 0000000000000004 [ 151.132013] RBP: 0000000000000000 R08: 0000000000000000 R09: 00007fff7c716b3f [ 151.132013] R10: 00007f71d611c948 R11: 0000000000000246 R12: 00007f71d5f09640 [ 151.132014] R13: 0000000000000004 R14: 00007f71d61b3550 R15: 0000000000000000 [ 151.132016] </TASK> virt/kvm/pfncache.c | 2 ++ 1 file changed, 2 insertions(+)