| Message ID | 1457729240-3846-1-git-send-email-dmatlack@google.com (mailing list archive) |
| --- | --- |
| State | New, archived |
On 03/12/2016 04:47 AM, David Matlack wrote:
> I have not been able to trigger this bug on Linux 4.3, and suspect
> it is due to this commit from Linux 4.2:
>
> 653f52c kvm,x86: load guest FPU context more eagerly
>
> With this commit, as long as the host is using eagerfpu, the guest's
> fpu is always loaded just before the guest's xcr0 (vcpu->fpu_active
> is always 1 in the following snippet):
>
> 6569         if (vcpu->fpu_active)
> 6570                 kvm_load_guest_fpu(vcpu);
> 6571         kvm_load_guest_xcr0(vcpu);
>
> When the guest's fpu is loaded, irq_fpu_usable() returns false.

Er, i did not see that commit introduced this change.

> We've included our workaround for this bug, which applies to Linux 3.11.
> It does not apply cleanly to HEAD since the fpu subsystem was refactored
> in Linux 4.2. While the latest kernel does not look vulnerable, we may
> want to apply a fix to the vulnerable stable kernels.

Is the latest kvm safe if we use !eager fpu? Under this case,
kvm_load_guest_fpu() is not called for every single VM-enter, that means
kernel will use guest's xcr0 to save/restore XSAVE area.

Maybe a simpler fix is just calling __kernel_fpu_begin() when the CPU
switches to vCPU and reverts it when the vCPU is scheduled out or
returns to userspace.
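A rough sketch of the alternative Xiao floats above: keep a kernel-FPU section open for the whole window in which the guest's xcr0 is live, so irq_fpu_usable() reports false in any interrupt taken there and interrupt handlers fall back to their non-FPU paths. This is illustrative only, not the posted proposal; the helper name is made up, and the suggested sched-in/sched-out hooks are collapsed into a single enter/exit window for brevity.

/*
 * Illustrative sketch only -- not an actual patch.  Claim the kernel FPU
 * for the span during which the guest's xcr0 is loaded, so that
 * irq_fpu_usable() returns false in interrupts taken on this CPU and
 * kernel_fpu_begin() can never XSAVE/XRSTOR under the guest's xcr0.
 */
static void run_vcpu_with_guest_xcr0(struct kvm_vcpu *vcpu)
{
        __kernel_fpu_begin();           /* irq_fpu_usable() now false here */

        kvm_load_guest_xcr0(vcpu);
        /* ... VM-enter, run the guest, VM-exit ... */
        kvm_put_guest_xcr0(vcpu);       /* restore the host's xcr0 */

        __kernel_fpu_end();
}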
On Mon, Mar 14, 2016 at 12:46 AM, Xiao Guangrong
<guangrong.xiao@linux.intel.com> wrote:
>
> On 03/12/2016 04:47 AM, David Matlack wrote:
>
>> I have not been able to trigger this bug on Linux 4.3, and suspect
>> it is due to this commit from Linux 4.2:
>>
>> 653f52c kvm,x86: load guest FPU context more eagerly
>>
>> With this commit, as long as the host is using eagerfpu, the guest's
>> fpu is always loaded just before the guest's xcr0 (vcpu->fpu_active
>> is always 1 in the following snippet):
>>
>> 6569         if (vcpu->fpu_active)
>> 6570                 kvm_load_guest_fpu(vcpu);
>> 6571         kvm_load_guest_xcr0(vcpu);
>>
>> When the guest's fpu is loaded, irq_fpu_usable() returns false.
>
> Er, i did not see that commit introduced this change.
>
>> We've included our workaround for this bug, which applies to Linux 3.11.
>> It does not apply cleanly to HEAD since the fpu subsystem was refactored
>> in Linux 4.2. While the latest kernel does not look vulnerable, we may
>> want to apply a fix to the vulnerable stable kernels.
>
> Is the latest kvm safe if we use !eager fpu?

Yes I believe so. When !eagerfpu, interrupted_kernel_fpu_idle()
returns "!current->thread.fpu.fpregs_active && (read_cr0() &
X86_CR0_TS)". This should ensure the interrupt handler never does
XSAVE/XRSTOR with the guest's xcr0.

> Under this case, kvm_load_guest_fpu() is not called for every single
> VM-enter, that means kernel will use guest's xcr0 to save/restore
> XSAVE area.
>
> Maybe a simpler fix is just calling __kernel_fpu_begin() when the CPU
> switches to vCPU and reverts it when the vCPU is scheduled out or
> returns to userspace.
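For readers following along, the check David quotes lives in interrupted_kernel_fpu_idle() in arch/x86/kernel/fpu/core.c. Roughly, as of the v4.3-era code (paraphrased from memory rather than quoted verbatim, so verify against the exact tree):

static bool interrupted_kernel_fpu_idle(void)
{
        if (kernel_fpu_disabled())
                return false;

        if (use_eager_fpu())
                return true;

        /* !eagerfpu: only safe if the interrupted context owns no FPU state */
        return !current->thread.fpu.fpregs_active && (read_cr0() & X86_CR0_TS);
}

bool irq_fpu_usable(void)
{
        return !in_interrupt() ||
                interrupted_user_mode() ||
                interrupted_kernel_fpu_idle();
}

The kernel_fpu_disabled() check at the top is presumably what makes irq_fpu_usable() return false once the guest FPU has been loaded, per David's observation earlier in the thread.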
On 03/16/2016 03:01 AM, David Matlack wrote:
> On Mon, Mar 14, 2016 at 12:46 AM, Xiao Guangrong
> <guangrong.xiao@linux.intel.com> wrote:
>>
>> On 03/12/2016 04:47 AM, David Matlack wrote:
>>
>>> I have not been able to trigger this bug on Linux 4.3, and suspect
>>> it is due to this commit from Linux 4.2:
>>>
>>> 653f52c kvm,x86: load guest FPU context more eagerly
>>>
>>> With this commit, as long as the host is using eagerfpu, the guest's
>>> fpu is always loaded just before the guest's xcr0 (vcpu->fpu_active
>>> is always 1 in the following snippet):
>>>
>>> 6569         if (vcpu->fpu_active)
>>> 6570                 kvm_load_guest_fpu(vcpu);
>>> 6571         kvm_load_guest_xcr0(vcpu);
>>>
>>> When the guest's fpu is loaded, irq_fpu_usable() returns false.
>>
>> Er, i did not see that commit introduced this change.
>>
>>> We've included our workaround for this bug, which applies to Linux 3.11.
>>> It does not apply cleanly to HEAD since the fpu subsystem was refactored
>>> in Linux 4.2. While the latest kernel does not look vulnerable, we may
>>> want to apply a fix to the vulnerable stable kernels.
>>
>> Is the latest kvm safe if we use !eager fpu?
>
> Yes I believe so. When !eagerfpu, interrupted_kernel_fpu_idle()
> returns "!current->thread.fpu.fpregs_active && (read_cr0() &
> X86_CR0_TS)". This should ensure the interrupt handler never does
> XSAVE/XRSTOR with the guest's xcr0.

interrupted_kernel_fpu_idle() returns true if KVM-based hypervisor (e.g.
QEMU) is not using fpu. That can not stop handler using fpu.
On Tue, Mar 15, 2016 at 8:43 PM, Xiao Guangrong
<guangrong.xiao@linux.intel.com> wrote:
>
> On 03/16/2016 03:01 AM, David Matlack wrote:
>>
>> On Mon, Mar 14, 2016 at 12:46 AM, Xiao Guangrong
>> <guangrong.xiao@linux.intel.com> wrote:
>>>
>>> On 03/12/2016 04:47 AM, David Matlack wrote:
>>>
>>> [...]
>>>
>>> Is the latest kvm safe if we use !eager fpu?
>>
>> Yes I believe so. When !eagerfpu, interrupted_kernel_fpu_idle()
>> returns "!current->thread.fpu.fpregs_active && (read_cr0() &
>> X86_CR0_TS)". This should ensure the interrupt handler never does
>> XSAVE/XRSTOR with the guest's xcr0.
>
> interrupted_kernel_fpu_idle() returns true if KVM-based hypervisor (e.g.
> QEMU) is not using fpu. That can not stop handler using fpu.

Why is it safe to rely on interrupted_kernel_fpu_idle? That function
is for interrupts, but is there any reason that KVM can't be preempted
(or explicitly schedule) with XCR0 having some funny value?

--Andy
On Tue, Mar 15, 2016 at 8:43 PM, Xiao Guangrong
<guangrong.xiao@linux.intel.com> wrote:
>
> On 03/16/2016 03:01 AM, David Matlack wrote:
>>
>> On Mon, Mar 14, 2016 at 12:46 AM, Xiao Guangrong
>> <guangrong.xiao@linux.intel.com> wrote:
>>>
>>> On 03/12/2016 04:47 AM, David Matlack wrote:
>>>
>>> [...]
>>>
>>> Is the latest kvm safe if we use !eager fpu?
>>
>> Yes I believe so. When !eagerfpu, interrupted_kernel_fpu_idle()
>> returns "!current->thread.fpu.fpregs_active && (read_cr0() &
>> X86_CR0_TS)". This should ensure the interrupt handler never does
>> XSAVE/XRSTOR with the guest's xcr0.
>
> interrupted_kernel_fpu_idle() returns true if KVM-based hypervisor (e.g.
> QEMU) is not using fpu. That can not stop handler using fpu.

You are correct, the interrupt handler can still use the fpu. But
kernel_fpu_{begin,end} will not execute XSAVE / XRSTOR.
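The behaviour David is relying on is visible in __kernel_fpu_begin(): the interrupted context's registers are only saved when that context actually has live FPU state. Roughly, again paraphrased from the v4.3-era arch/x86/kernel/fpu/core.c rather than quoted, so the details may differ slightly per tree:

void __kernel_fpu_begin(void)
{
        struct fpu *fpu = &current->thread.fpu;

        kernel_fpu_disable();

        if (fpu->fpregs_active) {
                /* Interrupted context owns live registers: save them (XSAVE/FXSAVE). */
                copy_fpregs_to_fpstate(fpu);
        } else {
                /* Nothing live to save: just make the FPU usable, no XSAVE at all. */
                this_cpu_write(fpu_fpregs_owner_ctx, NULL);
                __fpregs_activate_hw();
        }
}

In the !eagerfpu case being discussed, irq_fpu_usable() only allows this path when the interrupted task has no live FPU registers and CR0.TS is set, so the else branch runs and the guest's xcr0 never feeds an XSAVE or XRSTOR.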
On Tue, Mar 15, 2016 at 8:48 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>
> Why is it safe to rely on interrupted_kernel_fpu_idle? That function
> is for interrupts, but is there any reason that KVM can't be preempted
> (or explicitly schedule) with XCR0 having some funny value?

KVM restores the host's xcr0 in the sched-out preempt notifier and
prior to returning to userspace.
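The restore David mentions is done by kvm_put_guest_xcr0() in arch/x86/kvm/x86.c. Roughly (paraphrased; the exact call sites should be checked against the tree in question):

static void kvm_put_guest_xcr0(struct kvm_vcpu *vcpu)
{
        if (vcpu->guest_xcr0_loaded) {
                /* Switch back to the host's xcr0 if the guest's differs. */
                if (vcpu->arch.xcr0 != host_xcr0)
                        xsetbv(XCR_XFEATURE_ENABLED_MASK, host_xcr0);
                vcpu->guest_xcr0_loaded = 0;
        }
}

Per David's description, this runs from the vCPU put path reached by the sched-out preempt notifier and again before the KVM_RUN ioctl returns to userspace, so the guest's xcr0 is never left in place across a context switch or a return to QEMU.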
diff --git a/irq_fpu_stress.c b/irq_fpu_stress.c
new file mode 100644
index 0000000..faa6ba3
--- /dev/null
+++ b/irq_fpu_stress.c
@@ -0,0 +1,95 @@
+/*
+ * For the duration of time this module is loaded, this module fires
+ * IPIs at all CPUs and tries to use the FPU on that CPU in irq
+ * context.
+ */
+#include <linux/futex.h>
+#include <linux/module.h>
+#include <linux/moduleparam.h>
+#include <linux/kprobes.h>
+#include <linux/signal.h>
+#include <linux/debugfs.h>
+#include <linux/fs.h>
+#include <linux/hardirq.h>
+#include <linux/workqueue.h>
+
+#include <asm/uaccess.h>
+#include <asm/bug.h>
+#include <asm/fpu/api.h>
+
+MODULE_LICENSE("GPL");
+
+#define MODNAME "irq_fpu_stress"
+#undef pr_fmt
+#define pr_fmt(fmt) MODNAME": "fmt
+
+struct workqueue_struct *work_queue;
+struct work_struct work;
+
+struct {
+        atomic_t irq_fpu_usable;
+        atomic_t irq_fpu_unusable;
+        unsigned long num_tests;
+} stats;
+
+bool done;
+
+static void test_irq_fpu(void *info)
+{
+        BUG_ON(!in_interrupt());
+
+        if (irq_fpu_usable()) {
+                atomic_inc(&stats.irq_fpu_usable);
+
+                kernel_fpu_begin();
+                kernel_fpu_end();
+        } else {
+                atomic_inc(&stats.irq_fpu_unusable);
+        }
+}
+
+static void do_work(struct work_struct *w)
+{
+        pr_info("starting test\n");
+
+        stats.num_tests = 0;
+        atomic_set(&stats.irq_fpu_usable, 0);
+        atomic_set(&stats.irq_fpu_unusable, 0);
+
+        while (!ACCESS_ONCE(done)) {
+                preempt_disable();
+                smp_call_function_many(
+                        cpu_online_mask, test_irq_fpu, NULL, 1 /* wait */);
+                preempt_enable();
+
+                stats.num_tests++;
+
+                if (need_resched())
+                        schedule();
+        }
+
+        pr_info("finished test\n");
+}
+
+int init_module(void)
+{
+        work_queue = create_singlethread_workqueue(MODNAME);
+
+        INIT_WORK(&work, do_work);
+        queue_work(work_queue, &work);
+
+        return 0;
+}
+
+void cleanup_module(void)
+{
+        ACCESS_ONCE(done) = true;
+
+        flush_workqueue(work_queue);
+        destroy_workqueue(work_queue);
+
+        pr_info("num_tests %lu, irq_fpu_usable %d, irq_fpu_unusable %d\n",
+                stats.num_tests,
+                atomic_read(&stats.irq_fpu_usable),
+                atomic_read(&stats.irq_fpu_unusable));
+}
--- 8< ---
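One assumed way to run this reproducer (the patch itself does not spell this out): build it out of tree against the target kernel with a one-line "obj-m += irq_fpu_stress.o" Kbuild file, insmod it on a host running a KVM guest that does XSAVE-heavy work, let it fire IPIs for a while, then rmmod it and read the num_tests / irq_fpu_usable / irq_fpu_unusable counters it prints to dmesg from cleanup_module().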