From patchwork Tue Sep 17 08:16:24 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wanpeng Li X-Patchwork-Id: 11148181 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id BCA62912 for ; Tue, 17 Sep 2019 08:16:38 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 93B0720678 for ; Tue, 17 Sep 2019 08:16:38 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="XMfUu4yb" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730223AbfIQIQe (ORCPT ); Tue, 17 Sep 2019 04:16:34 -0400 Received: from mail-pl1-f195.google.com ([209.85.214.195]:40142 "EHLO mail-pl1-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725798AbfIQIQe (ORCPT ); Tue, 17 Sep 2019 04:16:34 -0400 Received: by mail-pl1-f195.google.com with SMTP id d22so1178444pll.7; Tue, 17 Sep 2019 01:16:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id; bh=lZpOV5nkYn4l+QDXD4wMM0QQ6pVGsg20H5ZmBWn5rFY=; b=XMfUu4ybyUbjaSQkc+G9xwv+FR7+SYPvEL9Au1fqvVxr5h+D75ZPor60PYddm3kRFN Nl17VEv+RnGNOjmQGn2murgOaEzjIqEAPfvyQIwJLfb7UKfl75QEET7wnkAsxhXl5reZ 0dcBq4BSZBzBOm5xAZkv4cPm+qZBbc3G2f0zm/g4NCObOOIZTXCeHOaTXEkdY2twPdKb gagOsiBm0jYDskJ59NLZDdbeq/ZzDedazC90wFT+oFEHxo8V0cj3zwm7jD4GYXChgCB3 flHvSawC/l/Qv9DvXaRm0Prx4z/qvrA5CZwTO3amxCGo/fekS2MpplfH1kyQAo2HL0g+ YeHg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id; bh=lZpOV5nkYn4l+QDXD4wMM0QQ6pVGsg20H5ZmBWn5rFY=; b=ccy85uHNQ/ZSrKqkJKMCGBj22fUjnqHUiT7Rz9SEqDtN3EZJEpBfBM58dmNMkp8YKF a10ATPiRQ5hU6juW6pAqBjTdNrtEzTysagWwd6eL3pySN/Qh9Bf8ADoFWMvcpAYhsH2E gEmBt4bpCq4ymVJH5vt5PBoQwRboHMBu6/XrZ99OAB8oxlN1okPvh+9jp8WJPkxIy9PA svjUmAFrVyinz6eKKMbX53BB1lu81ubimKP78BCTLSaRBHs4tO5IaENkvRiDDDoB2Srw pJrSokpGuTnIkRq3oYweePdwTLoUoyBzDpBItXGos0cR2F4VO9UvjXdhFNuPH8kjCVkG zY8Q== X-Gm-Message-State: APjAAAXARhfHbawggQW+8foBaVNlf/5ZYu3GntSipnffpR0QU+66+jbR IhX7xaOzKxWEFhgm7SHrnXQY577p X-Google-Smtp-Source: APXvYqzNMalAy4NbEJMNfObcCqA9c0UoX5pztAHwrAbtwPMPP4mPFTKD26DkYq65IQVyqtWa82QGOQ== X-Received: by 2002:a17:902:8546:: with SMTP id d6mr2429121plo.170.1568708193274; Tue, 17 Sep 2019 01:16:33 -0700 (PDT) Received: from localhost.localdomain ([203.205.141.123]) by smtp.googlemail.com with ESMTPSA id j10sm1924142pfh.137.2019.09.17.01.16.29 (version=TLS1_2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Tue, 17 Sep 2019 01:16:32 -0700 (PDT) From: Wanpeng Li X-Google-Original-From: Wanpeng Li To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: Paolo Bonzini , =?utf-8?b?UmFkaW0gS3LEjW3DocWZ?= , Sean Christopherson , Vitaly Kuznetsov , Wanpeng Li , Jim Mattson , Joerg Roedel , stable@vger.kernel.org Subject: [PATCH 1/3] KVM: Fix coalesced mmio ring buffer out-of-bounds access Date: Tue, 17 Sep 2019 16:16:24 +0800 Message-Id: <1568708186-20260-1-git-send-email-wanpengli@tencent.com> X-Mailer: git-send-email 2.7.4 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Wanpeng Li Reported by syzkaller: #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page PGD 403c01067 P4D 403c01067 PUD 0 Oops: 0002 [#1] SMP PTI CPU: 1 PID: 12564 Comm: a.out Tainted: G OE 5.3.0-rc4+ #4 RIP: 0010:coalesced_mmio_write+0xcc/0x130 [kvm] Call Trace: __kvm_io_bus_write+0x91/0xe0 [kvm] kvm_io_bus_write+0x79/0xf0 [kvm] write_mmio+0xae/0x170 [kvm] emulator_read_write_onepage+0x252/0x430 [kvm] emulator_read_write+0xcd/0x180 [kvm] emulator_write_emulated+0x15/0x20 [kvm] segmented_write+0x59/0x80 [kvm] writeback+0x113/0x250 [kvm] x86_emulate_insn+0x78c/0xd80 [kvm] x86_emulate_instruction+0x386/0x7c0 [kvm] kvm_mmu_page_fault+0xf9/0x9e0 [kvm] handle_ept_violation+0x10a/0x220 [kvm_intel] vmx_handle_exit+0xbe/0x6b0 [kvm_intel] vcpu_enter_guest+0x4dc/0x18d0 [kvm] kvm_arch_vcpu_ioctl_run+0x407/0x660 [kvm] kvm_vcpu_ioctl+0x3ad/0x690 [kvm] do_vfs_ioctl+0xa2/0x690 ksys_ioctl+0x6d/0x80 __x64_sys_ioctl+0x1a/0x20 do_syscall_64+0x74/0x720 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0010:coalesced_mmio_write+0xcc/0x130 [kvm] Both the coalesced_mmio ring buffer indexs ring->first and ring->last are bigger than KVM_COALESCED_MMIO_MAX from the testcase, array out-of-bounds access triggers by ring->coalesced_mmio[ring->last].phys_addr = addr; assignment. This patch fixes it by mod indexs by KVM_COALESCED_MMIO_MAX. syzkaller source: https://syzkaller.appspot.com/x/repro.c?x=134b2826a00000 Reported-by: syzbot+983c866c3dd6efa3662a@syzkaller.appspotmail.com Cc: stable@vger.kernel.org Signed-off-by: Wanpeng Li --- virt/kvm/coalesced_mmio.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/virt/kvm/coalesced_mmio.c b/virt/kvm/coalesced_mmio.c index 5294abb..cff1ec9 100644 --- a/virt/kvm/coalesced_mmio.c +++ b/virt/kvm/coalesced_mmio.c @@ -73,6 +73,8 @@ static int coalesced_mmio_write(struct kvm_vcpu *vcpu, spin_lock(&dev->kvm->ring_lock); + ring->first = ring->first % KVM_COALESCED_MMIO_MAX; + ring->last = ring->last % KVM_COALESCED_MMIO_MAX; if (!coalesced_mmio_has_room(dev)) { spin_unlock(&dev->kvm->ring_lock); return -EOPNOTSUPP; From patchwork Tue Sep 17 08:16:25 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wanpeng Li X-Patchwork-Id: 11148183 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E3F1B1599 for ; Tue, 17 Sep 2019 08:16:41 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id B9A1621670 for ; Tue, 17 Sep 2019 08:16:41 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="lUYe/HN8" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732420AbfIQIQi (ORCPT ); Tue, 17 Sep 2019 04:16:38 -0400 Received: from mail-pf1-f196.google.com ([209.85.210.196]:38983 "EHLO mail-pf1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725798AbfIQIQh (ORCPT ); Tue, 17 Sep 2019 04:16:37 -0400 Received: by mail-pf1-f196.google.com with SMTP id i1so1689712pfa.6; Tue, 17 Sep 2019 01:16:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=flYRSMt8pIOAP3TA8EAVhdOF1+uRuM0MoI3dzcIAOic=; b=lUYe/HN8N3La1GQJ554woX+0srWbLV+S94trrY+6GwRi+jUkdvybgbHGxOeCa7k2Op NBOEUxvOVzshmU394xTkfz5PCGSmBdGZaw4OUeMorK6sfdN/p9YfRla/dZ/eommEb50S XFoZnqEbH84ddJC/KndIgnGvscspqD0VCEquGKxTNWfxEUEUQ/YGK8nzW9yC4Wehp6jC O9u0Zhoy1PMRAswOo0GXMeRi/Wbu5pobkKPwmZrG0IXQ9eivymKXRo99Joz4fjSavyAC VZqRKTGZdX9ruHK2RmPi6uoKrUiD3NI+vVqhdntCGqaXMnG+iT99OP6IlReYyNtNFi36 BJ5A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=flYRSMt8pIOAP3TA8EAVhdOF1+uRuM0MoI3dzcIAOic=; b=rj8o9wOgJDq4EiSln2IK2jmPw+yf4aVL3MvNObhpZh+R805p7btuwp8A/ja+r4N0L7 4aC7W5FxCQd7hQx0FHezNCltd32Ky+dSUAFTHxGcUO3rb+LE+FHpnk0LVWWnXrTblPUw I37eZiYOQlfnVQE8z1KDTyR2kszbh5aEpBvYQh+8H9bc2Wsx1O0UPh7f5zeJ8WZEXr8N AQqAzyJHvahVLXh7eyKYTZuhfisqNcB423A3A6n/fr8K1g7cBOf7WujEmr+AS4+zt3zR ZYdAFrV2bcCz16GMugjt9IePbxP9RgBKytHd0s/ybbKcElQb4VMpHZ0WxthTk9UtcPlS gO+g== X-Gm-Message-State: APjAAAVRGPk6nQkajipNo4tkbwmpAHNVdia1Bbu6ysEr5N9hmIqupcKC llKn4E71mHlgIHuADqdihi7CzOad X-Google-Smtp-Source: APXvYqz3ie3yVwYKkCr/8naCaiieW3a+kAzHMAbDb6B2dHhP8we0iuGsW39wpjQ+/OgbzMnP9231Hg== X-Received: by 2002:a63:d301:: with SMTP id b1mr2127747pgg.379.1568708196244; Tue, 17 Sep 2019 01:16:36 -0700 (PDT) Received: from localhost.localdomain ([203.205.141.123]) by smtp.googlemail.com with ESMTPSA id j10sm1924142pfh.137.2019.09.17.01.16.33 (version=TLS1_2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Tue, 17 Sep 2019 01:16:35 -0700 (PDT) From: Wanpeng Li X-Google-Original-From: Wanpeng Li To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: Paolo Bonzini , =?utf-8?b?UmFkaW0gS3LEjW3DocWZ?= , Sean Christopherson , Vitaly Kuznetsov , Wanpeng Li , Jim Mattson , Joerg Roedel , stable@vger.kernel.org Subject: [PATCH v2 2/3] KVM: X86: Fix userspace set broken combinations of CPUID and CR4 Date: Tue, 17 Sep 2019 16:16:25 +0800 Message-Id: <1568708186-20260-2-git-send-email-wanpengli@tencent.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1568708186-20260-1-git-send-email-wanpengli@tencent.com> References: <1568708186-20260-1-git-send-email-wanpengli@tencent.com> Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Wanpeng Li Reported by syzkaller: WARNING: CPU: 0 PID: 6544 at /home/kernel/data/kvm/arch/x86/kvm//vmx/vmx.c:4689 handle_desc+0x37/0x40 [kvm_intel] CPU: 0 PID: 6544 Comm: a.out Tainted: G OE 5.3.0-rc4+ #4 RIP: 0010:handle_desc+0x37/0x40 [kvm_intel] Call Trace: vmx_handle_exit+0xbe/0x6b0 [kvm_intel] vcpu_enter_guest+0x4dc/0x18d0 [kvm] kvm_arch_vcpu_ioctl_run+0x407/0x660 [kvm] kvm_vcpu_ioctl+0x3ad/0x690 [kvm] do_vfs_ioctl+0xa2/0x690 ksys_ioctl+0x6d/0x80 __x64_sys_ioctl+0x1a/0x20 do_syscall_64+0x74/0x720 entry_SYSCALL_64_after_hwframe+0x49/0xbe When CR4.UMIP is set, guest should have UMIP cpuid flag. Current kvm set_sregs function doesn't have such check when userspace inputs sregs values. SECONDARY_EXEC_DESC is enabled on writes to CR4.UMIP in vmx_set_cr4 though guest doesn't have UMIP cpuid flag. The testcast triggers handle_desc warning when executing ltr instruction since guest architectural CR4 doesn't set UMIP. This patch fixes it by adding valid CR4 and CPUID combination checking in __set_sregs. syzkaller source: https://syzkaller.appspot.com/x/repro.c?x=138efb99600000 Reported-by: syzbot+0f1819555fbdce992df9@syzkaller.appspotmail.com Cc: stable@vger.kernel.org Signed-off-by: Wanpeng Li --- arch/x86/kvm/x86.c | 39 ++++++++++++++++++++++++--------------- 1 file changed, 24 insertions(+), 15 deletions(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index f7cfd8e..cafb4d4 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -884,34 +884,42 @@ int kvm_set_xcr(struct kvm_vcpu *vcpu, u32 index, u64 xcr) } EXPORT_SYMBOL_GPL(kvm_set_xcr); -int kvm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4) +static int kvm_valid_cr4(struct kvm_vcpu *vcpu, unsigned long cr4) { - unsigned long old_cr4 = kvm_read_cr4(vcpu); - unsigned long pdptr_bits = X86_CR4_PGE | X86_CR4_PSE | X86_CR4_PAE | - X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_PKE; - - if (cr4 & CR4_RESERVED_BITS) - return 1; - if (!guest_cpuid_has(vcpu, X86_FEATURE_XSAVE) && (cr4 & X86_CR4_OSXSAVE)) - return 1; + return -EINVAL; if (!guest_cpuid_has(vcpu, X86_FEATURE_SMEP) && (cr4 & X86_CR4_SMEP)) - return 1; + return -EINVAL; if (!guest_cpuid_has(vcpu, X86_FEATURE_SMAP) && (cr4 & X86_CR4_SMAP)) - return 1; + return -EINVAL; if (!guest_cpuid_has(vcpu, X86_FEATURE_FSGSBASE) && (cr4 & X86_CR4_FSGSBASE)) - return 1; + return -EINVAL; if (!guest_cpuid_has(vcpu, X86_FEATURE_PKU) && (cr4 & X86_CR4_PKE)) - return 1; + return -EINVAL; if (!guest_cpuid_has(vcpu, X86_FEATURE_LA57) && (cr4 & X86_CR4_LA57)) - return 1; + return -EINVAL; if (!guest_cpuid_has(vcpu, X86_FEATURE_UMIP) && (cr4 & X86_CR4_UMIP)) + return -EINVAL; + + return 0; +} + +int kvm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4) +{ + unsigned long old_cr4 = kvm_read_cr4(vcpu); + unsigned long pdptr_bits = X86_CR4_PGE | X86_CR4_PSE | X86_CR4_PAE | + X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_PKE; + + if (cr4 & CR4_RESERVED_BITS) + return 1; + + if (kvm_valid_cr4(vcpu, cr4)) return 1; if (is_long_mode(vcpu)) { @@ -8675,7 +8683,8 @@ static int __set_sregs(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs) struct desc_ptr dt; int ret = -EINVAL; - if (kvm_valid_sregs(vcpu, sregs)) + if (kvm_valid_sregs(vcpu, sregs) || + kvm_valid_cr4(vcpu, sregs->cr4)) goto out; apic_base_msr.data = sregs->apic_base; From patchwork Tue Sep 17 08:16:26 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wanpeng Li X-Patchwork-Id: 11148185 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 727611747 for ; Tue, 17 Sep 2019 08:16:42 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 46E462189D for ; Tue, 17 Sep 2019 08:16:42 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="aY+iFmHY" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2387401AbfIQIQl (ORCPT ); Tue, 17 Sep 2019 04:16:41 -0400 Received: from mail-pl1-f193.google.com ([209.85.214.193]:44219 "EHLO mail-pl1-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725798AbfIQIQk (ORCPT ); Tue, 17 Sep 2019 04:16:40 -0400 Received: by mail-pl1-f193.google.com with SMTP id k24so347723pll.11; Tue, 17 Sep 2019 01:16:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=mgukzCtnclCW/hH0uFMjnqXIijt7Mvy5DsHXeaPWECE=; b=aY+iFmHYIF6t7XIv33dW0IGxufDOC4JFNAtk94pzw3JQrwBFs9vJdyD6pJjIBE2LKR O1EecKgPXResbqetQCjLqch20H+MWmr+FYQHtg1NAC9/jOi6902463MXTIqk822fwbHd xbYLoDlLJdKCiFb3Zndd60LWYkFgf8IfTWkqxPR05TDyExYQJ2wxTncnHEZDsZz301oY z8SlDDsZFLBSEMQIYHj8Zlos6PYzC6UcDlAbYC8tzSt23STldWChc27x5yZAJPL9fbpI RK82f14lTEwSfz1jhcSv7I+uJUMB9xVc4P3ApKhnOp0d7MKrPpFpPfkLrnmn8MkTGaNL ABAg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=mgukzCtnclCW/hH0uFMjnqXIijt7Mvy5DsHXeaPWECE=; b=kyDojjPeZVmDH1HWh2r/PZ6eKjy/6RGA+wdyGvApCTquGivRFB0L5/1lLHGz44REH0 8ignW2OI/IlpXn053kTFVtV6ewhpVl1OfoYJ9iFL8YuzBwbIWQ88tQx0PIUgEOAxRqoC xfcVfy1dj5yaHhzgQfjlVmCgsr81aZ3KGRaTzy90kEMawIOFUvBGZFRJFDPAIOdfJDGX cSFaupI2RNRPJhxUTNTK/j0d3bugsyPg+3Pq8HTaAio5zIIoRvjVjEus+XKQBVQLTygT Jw1WJ0GSsRkn7dwzEhcOma2vhh2muiPViguv4eFbfHx+SKEapbBZGDHoWkkrSxha+qSA dfjg== X-Gm-Message-State: APjAAAVqpJg1hJlMmw4ic6+FpHEEStyKp28+tuehGlcp8VxOTDoxUczS w/4bciDWePZORX6yGjPtiK4oW1Ti X-Google-Smtp-Source: APXvYqzGqOXY+sIUFWupOkrIA0xrZ4XOZ746IgCkW0V7c3zoO9XbpqMSLA+PA/hpRbm2meAV2xArtw== X-Received: by 2002:a17:902:9a92:: with SMTP id w18mr2345703plp.255.1568708199199; Tue, 17 Sep 2019 01:16:39 -0700 (PDT) Received: from localhost.localdomain ([203.205.141.123]) by smtp.googlemail.com with ESMTPSA id j10sm1924142pfh.137.2019.09.17.01.16.36 (version=TLS1_2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Tue, 17 Sep 2019 01:16:38 -0700 (PDT) From: Wanpeng Li X-Google-Original-From: Wanpeng Li To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: Paolo Bonzini , =?utf-8?b?UmFkaW0gS3LEjW3DocWZ?= , Sean Christopherson , Vitaly Kuznetsov , Wanpeng Li , Jim Mattson , Joerg Roedel Subject: [PATCH v5 3/3] KVM: LAPIC: Tune lapic_timer_advance_ns smoothly Date: Tue, 17 Sep 2019 16:16:26 +0800 Message-Id: <1568708186-20260-3-git-send-email-wanpengli@tencent.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1568708186-20260-1-git-send-email-wanpengli@tencent.com> References: <1568708186-20260-1-git-send-email-wanpengli@tencent.com> Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Wanpeng Li Filter out drastic fluctuation and random fluctuation, remove timer_advance_adjust_done altogether, the adjustment would be continuous. Signed-off-by: Wanpeng Li --- arch/x86/kvm/lapic.c | 28 ++++++++++++++-------------- arch/x86/kvm/lapic.h | 1 - 2 files changed, 14 insertions(+), 15 deletions(-) diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c index dbbe478..323bdca 100644 --- a/arch/x86/kvm/lapic.c +++ b/arch/x86/kvm/lapic.c @@ -65,7 +65,9 @@ #define APIC_BROADCAST 0xFF #define X2APIC_BROADCAST 0xFFFFFFFFul -#define LAPIC_TIMER_ADVANCE_ADJUST_DONE 100 +static bool dynamically_adjust_timer_advance __read_mostly; +#define LAPIC_TIMER_ADVANCE_ADJUST_MIN 100 +#define LAPIC_TIMER_ADVANCE_ADJUST_MAX 5000 #define LAPIC_TIMER_ADVANCE_ADJUST_INIT 1000 /* step-by-step approximation to mitigate fluctuation */ #define LAPIC_TIMER_ADVANCE_ADJUST_STEP 8 @@ -1485,26 +1487,25 @@ static inline void adjust_lapic_timer_advance(struct kvm_vcpu *vcpu, u32 timer_advance_ns = apic->lapic_timer.timer_advance_ns; u64 ns; + /* Do not adjust for tiny fluctuations or large random spikes. */ + if (abs(advance_expire_delta) > LAPIC_TIMER_ADVANCE_ADJUST_MAX || + abs(advance_expire_delta) < LAPIC_TIMER_ADVANCE_ADJUST_MIN) + return; + /* too early */ if (advance_expire_delta < 0) { ns = -advance_expire_delta * 1000000ULL; do_div(ns, vcpu->arch.virtual_tsc_khz); - timer_advance_ns -= min((u32)ns, - timer_advance_ns / LAPIC_TIMER_ADVANCE_ADJUST_STEP); + timer_advance_ns -= ns/LAPIC_TIMER_ADVANCE_ADJUST_STEP; } else { /* too late */ ns = advance_expire_delta * 1000000ULL; do_div(ns, vcpu->arch.virtual_tsc_khz); - timer_advance_ns += min((u32)ns, - timer_advance_ns / LAPIC_TIMER_ADVANCE_ADJUST_STEP); + timer_advance_ns += ns/LAPIC_TIMER_ADVANCE_ADJUST_STEP; } - if (abs(advance_expire_delta) < LAPIC_TIMER_ADVANCE_ADJUST_DONE) - apic->lapic_timer.timer_advance_adjust_done = true; - if (unlikely(timer_advance_ns > 5000)) { + if (unlikely(timer_advance_ns > LAPIC_TIMER_ADVANCE_ADJUST_MAX)) timer_advance_ns = LAPIC_TIMER_ADVANCE_ADJUST_INIT; - apic->lapic_timer.timer_advance_adjust_done = false; - } apic->lapic_timer.timer_advance_ns = timer_advance_ns; } @@ -1524,7 +1525,7 @@ static void __kvm_wait_lapic_expire(struct kvm_vcpu *vcpu) if (guest_tsc < tsc_deadline) __wait_lapic_expire(vcpu, tsc_deadline - guest_tsc); - if (unlikely(!apic->lapic_timer.timer_advance_adjust_done)) + if (dynamically_adjust_timer_advance) adjust_lapic_timer_advance(vcpu, apic->lapic_timer.advance_expire_delta); } @@ -2302,13 +2303,12 @@ int kvm_create_lapic(struct kvm_vcpu *vcpu, int timer_advance_ns) apic->lapic_timer.timer.function = apic_timer_fn; if (timer_advance_ns == -1) { apic->lapic_timer.timer_advance_ns = LAPIC_TIMER_ADVANCE_ADJUST_INIT; - apic->lapic_timer.timer_advance_adjust_done = false; + dynamically_adjust_timer_advance = true; } else { apic->lapic_timer.timer_advance_ns = timer_advance_ns; - apic->lapic_timer.timer_advance_adjust_done = true; + dynamically_adjust_timer_advance = false; } - /* * APIC is created enabled. This will prevent kvm_lapic_set_base from * thinking that APIC state has changed. diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h index 50053d2..2aad7e2 100644 --- a/arch/x86/kvm/lapic.h +++ b/arch/x86/kvm/lapic.h @@ -35,7 +35,6 @@ struct kvm_timer { s64 advance_expire_delta; atomic_t pending; /* accumulated triggered timers */ bool hv_timer_in_use; - bool timer_advance_adjust_done; }; struct kvm_lapic {