From patchwork Fri Nov 30 08:07:52 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: "Zhang, Yi"
X-Patchwork-Id: 10705915
From: Zhang Yi
To: pbonzini@redhat.com, mdontu@bitdefender.com, ncitu@bitdefender.com
Cc: rkrcmar@redhat.com, linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Zhang Yi
Subject: [RFC PATCH V2 01/11] Documentation: Add EPT Sub-Page Protection documentation.
Date: Fri, 30 Nov 2018 16:07:52 +0800
X-Mailer: git-send-email 2.7.4
Sender: kvm-owner@vger.kernel.org
List-ID: X-Mailing-List: kvm@vger.kernel.org

Signed-off-by: Zhang Yi
---
 Documentation/virtual/kvm/spp_design_kvm.txt | 275 +++++++++++++++++++++++++++
 1 file changed, 275 insertions(+)
 create mode 100644 Documentation/virtual/kvm/spp_design_kvm.txt

diff --git a/Documentation/virtual/kvm/spp_design_kvm.txt b/Documentation/virtual/kvm/spp_design_kvm.txt
new file mode 100644
index 0000000..8dc4530
--- /dev/null
+++ b/Documentation/virtual/kvm/spp_design_kvm.txt
@@ -0,0 +1,275 @@
+DRAFT: EPT-Based Sub-Page Protection (SPP) Design Doc for KVM
+=============================================================
+
+1. Overview
+
+EPT-based Sub-Page Protection (SPP) is a capability that allows Virtual
+Machine Monitors to specify write protection for guest physical memory
+at a sub-page (128 byte) granularity.  When this capability is used, the
+CPU enforces write-access permissions for sub-page regions of 4KB pages
+as specified by the VMM.
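To make the granularity concrete: a 4KB page holds 32 sub-page regions of 128 bytes each, so a 32-bit bitmap per page is enough to describe its write permissions. The sketch below is an illustration only, not part of this patch; the helper names are made up here.

```c
#include <stdint.h>

#define SPP_SUBPAGE_SIZE     128u
#define SPP_SUBPAGES_PER_4K  (4096u / SPP_SUBPAGE_SIZE)   /* 32 regions */

/* Index of the 128-byte sub-page region that a guest-physical address
 * falls into, within its 4KB page. */
static inline unsigned int spp_subpage_index(uint64_t gpa)
{
	return (unsigned int)((gpa & 0xFFFu) / SPP_SUBPAGE_SIZE);
}

/* Mark region 'i' of a page writable in a 32-bit write-access bitmap. */
static inline uint32_t spp_allow_write(uint32_t access_map, unsigned int i)
{
	return access_map | (1u << i);
}
```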
+2. Operation of SPP
+
+A Sub-Page Protection Table (SPPT) is introduced to manage sub-page
+write access.
+
+The SPPT is active when the "sub-page write protection" VM-execution
+control is 1.  The SPPT is looked up by guest physical address to derive
+a 64-bit "sub-page permission" value containing sub-page write
+permissions.  The mapping from guest-physical addresses to sub-page
+region permissions is determined by a set of SPPT paging structures.
+
+When the "sub-page write protection" VM-execution control is 1, the SPPT
+is used to look up the write permission bits for the 128-byte sub-page
+regions contained in the 4KB guest physical page.  EPT specifies the
+4KB-page-level privileges that software is allowed when accessing the
+guest physical address, whereas the SPPT defines the write permissions
+for software at the 128-byte granularity regions within a 4KB page.
+Write accesses prevented due to sub-page permissions looked up via the
+SPPT are reported as EPT violation VM exits.  Similar to EPT, a logical
+processor uses the SPPT to look up sub-page region write permissions for
+guest-physical addresses only when those addresses are used to access
+memory.
+______________________________________________________________________________
+
+How SPP hardware works:
+______________________________________________________________________________
+
+Guest write access --> GPA --> Walk EPT --> EPT leaf entry -┐
+┌-----------------------------------------------------------┘
+└-> if VMexec_control.spp && ept_leaf_entry.spp_bit (bit 61)
+    |
+    └-> <false> --> EPT legacy behavior
+    |
+    |
+    └-> <true>  --> if ept_leaf_entry.writable
+                    |
+                    └-> <true>  --> Ignore SPP
+                    |
+                    └-> <false> --> GPA --> Walk SPP 4-level table--┐
+                                                                    |
+┌------------<----------get-the-SPPT-pointer-from-VMCS-field---<----┘
+|
+Walk SPP L4E table
+|
+└┐--> entry misconfiguration ------------>----------┐<----------------┐
+ |                                                  |                 |
+else                                                |                 |
+ |                                                  |                 |
+ |   ┌------------------SPP VMexit<-----------------┘                 |
+ |   |                                                                |
+ |   └-> exit_qualification & sppt_misconfig --> sppt misconfig       |
+ |   |                                                                |
+ |   └-> exit_qualification & sppt_miss --> sppt miss                 |
+ └--┐                                                                 |
+    |                                                                 |
+walk SPPT L3E--┐--> if-entry-misconfiguration------------>------------┘
+               |                                                      |
+              else                                                    |
+               |                                                      |
+               walk SPPT L2E --┐--> if-entry-misconfiguration--->-----┘
+                               |                                      |
+                              else                                    |
+                               |                                      |
+                               walk SPPT L1E --┐-> if-entry-misconfiguration->┘
+                                               |
+                                              else
+                                               |
+                                               └-> if sub-page writable
+                                                   └-> allow, write access
+                                                   └-> disallow, EPT violation
+______________________________________________________________________________
+
+3. Interfaces
+
+* Feature enabling
+
+Add "spp=on" to the KVM module parameters to enable the SPP feature; the
+default is off.
+
+* Get/Set sub-page write access permission
+
+New KVM ioctls:
+
+`KVM_SUBPAGES_GET_ACCESS`:
+Get the sub-page write-access bitmaps corresponding to a given range of
+contiguous GFNs.
+
+`KVM_SUBPAGES_SET_ACCESS`:
+Set the sub-page write-access bitmaps corresponding to a given range of
+contiguous GFNs.
+
+```c
+/* for KVM_SUBPAGES_GET_ACCESS and KVM_SUBPAGES_SET_ACCESS */
+struct kvm_subpage_info {
+	__u64 gfn;
+	__u64 npages;      /* number of 4K pages */
+	__u64 *access_map; /* sub-page write-access bitmap array */
+};
+
+#define KVM_SUBPAGES_GET_ACCESS   _IOR(KVMIO, 0x49, struct kvm_subpage_info)
+#define KVM_SUBPAGES_SET_ACCESS   _IOW(KVMIO, 0x4a, struct kvm_subpage_info)
+```
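To make the ioctl usage concrete, here is a minimal userspace sketch. It is not part of this patch set: it assumes a VM fd obtained via KVM_CREATE_VM, reuses the struct layout and ioctl numbers defined just above, and omits error handling. Note that later patches in this series switch to a struct kvm_subpage with an inline access_map array.

```c
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/ioctl.h>

/* Userspace copy of the structure defined in this document. */
struct kvm_subpage_info {
	uint64_t gfn;          /* first guest frame number */
	uint64_t npages;       /* number of 4K pages */
	uint64_t *access_map;  /* one bitmap per page, bit i = region i writable */
};

#define KVMIO 0xAE
#define KVM_SUBPAGES_SET_ACCESS _IOW(KVMIO, 0x4a, struct kvm_subpage_info)

/* Leave only the first 128-byte region of one guest page writable. */
static int spp_protect_page(int vm_fd, uint64_t gfn)
{
	uint64_t map = 0x1;
	struct kvm_subpage_info info = {
		.gfn = gfn,
		.npages = 1,
		.access_map = &map,
	};

	return ioctl(vm_fd, KVM_SUBPAGES_SET_ACCESS, &info);
}
```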
+4. SPPT initialization
+
+* SPPT root page allocation
+
+  The SPPT is referenced via a 64-bit control field called the "sub-page
+  protection table pointer" (SPPTP, encoding 0x2030), which contains a
+  4KB-aligned physical address.
+
+  The SPPT is a 4-level table, like EPT.  So, as for EPT, when KVM loads
+  the MMU we allocate a root page for the SPPT L4 table.
+
+* EPT leaf entry SPP bit
+
+  The SPP bit is 0 by default, which disables sub-page protection for
+  that page.
+
+5. Set/Get the sub-page access bitmap for a range of guest physical pages
+
+* To utilize the SPP feature, the system administrator sets a sub-page
+  write-access bitmap via the SPP KVM ioctl `KVM_SUBPAGES_SET_ACCESS`,
+  which prepares the following things:
+
+  (1) Get the corresponding EPT leaf entry via the guest physical address.
+  (2) If it maps a 4K page frame, set bit 61 to enable sub-page
+      protection on this page.
+  (3) Set up the SPP page structures, whose entry formats are listed below.
+
+  Format of the SPPT L4E, L3E, L2E:
+  | Bit    | Contents                                                                        |
+  | :----- | :------------------------------------------------------------------------------ |
+  | 0      | Valid entry when set; indicates whether the entry is present                     |
+  | 11:1   | Reserved (0)                                                                      |
+  | N-1:12 | Physical address of the 4KB-aligned SPPT L(X-1) table referenced by this entry    |
+  | 51:N   | Reserved (0)                                                                      |
+  | 63:52  | Reserved (0)                                                                      |
+  Note: N is the physical address width supported by the processor; X is the page level.
+
+  Format of the SPPT L1E:
+  | Bit   | Contents                                                |
+  | :---- | :------------------------------------------------------ |
+  | 0+2i  | Write permission for the i-th 128-byte sub-page region  |
+  | 1+2i  | Reserved (0)                                            |
+  Note: `0 <= i <= 31`
+
+  (4) Update the sub-page info in the memory slot structure.
+
+* Sub-page write access bitmap setting pseudo-code:
+
+```c
+static int kvm_mmu_set_subpages(struct kvm_vcpu *vcpu,
+				struct kvm_subpage_info *spp_info)
+{
+	gfn_t *gfns = spp_info->gfns;
+	u64 *access_map = spp_info->access_map;
+
+	sanity_check();
+
+	/* SPP works only when the page is unwritable */
+	if (set_ept_leaf_level_unwritable(gfn) == success)
+		if (kvm_mmu_setup_spp_structure(gfn) == success)
+			set_subpage_slot_info(access_map);
+}
+```
+
+Userspace can read the sub-page info back via the SPP KVM ioctl
+`KVM_SUBPAGES_GET_ACCESS`, which fetches it from the memory slot
+structure corresponding to the specified GPA.
+
+* Sub-page get subpage info pseudo-code:
+
+```c
+static int kvm_mmu_get_subpages(struct kvm_vcpu *vcpu,
+				struct kvm_subpage_info *spp_info)
+{
+	gfn_t *gfns = spp_info->gfns;
+
+	sanity_check(gfn);
+	spp_info = get_subpage_slot_info(gfn);
+}
+```
+
+6. SPPT-induced VM exits
+
+* SPP VM exits
+
+Accesses using guest physical addresses may cause VM exits due to an
+SPPT Misconfiguration or an SPPT Miss.
+
+An SPPT Misconfiguration VM exit occurs when, in the course of
+translating a guest physical address, the logical processor encounters a
+leaf EPT paging-structure entry mapping a 4KB page with SPP enabled, and
+during the SPPT lookup an SPPT paging-structure entry contains an
+unsupported value.
+
+An SPPT Miss VM exit occurs when, during the SPPT lookup, there is no
+SPPT misconfiguration but an SPPT paging-structure entry at some level
+is not present.
+
+NOTE: SPPT misconfigurations and SPPT misses can occur only due to an
+attempt to write memory with a guest physical address.
+
+* EPT violation VM exits due to SPPT
+
+Memory write accesses disallowed by the sub-page protection permissions
+specified in the SPPT are reported via EPT violation VM exits.
+
+7. SPPT-induced VM exit handling
+
+```c
+#define EXIT_REASON_SPP 66
+
+static int (*const kvm_vmx_exit_handlers[])(struct kvm_vcpu *vcpu) = {
+	...
+	[EXIT_REASON_SPP] = handle_spp,
+	...
+};
+```
+
+New exit qualification for SPPT-induced VM exits:
+
+| Bit   | Contents                                                           |
+| :---- | :----------------------------------------------------------------- |
+| 10:0  | Reserved (0)                                                        |
+| 11    | SPPT VM exit type.  Set for SPPT Miss, cleared for SPPT Misconfig.  |
+| 12    | NMI unblocking due to IRET                                          |
+| 63:13 | Reserved (0)                                                        |
+
+In addition to the exit qualification, the Guest Linear Address and
+Guest Physical Address fields will be reported.
+
+* SPPT miss and misconfiguration
+
+Allocate a page for the SPPT entry and set the entry correctly.
+
+SPP VM exit handler pseudo-code:
+```c
+static int handle_spp(struct kvm_vcpu *vcpu)
+{
+	exit_qualification = vmcs_readl(EXIT_QUALIFICATION);
+	if (exit_qualification & SPP_EXIT_TYPE_BIT) {
+		/* SPPT Miss */
+		/* We don't set SPP write access for the corresponding
+		 * GPA; leave it unwritable, so there is no need to
+		 * construct the SPP table here. */
+	} else {
+		/* SPPT Misconfig */
+		vcpu->run->exit_reason = KVM_EXIT_UNKNOWN;
+		vcpu->run->hw.hardware_exit_reason = EXIT_REASON_SPP;
+	}
+	return 0;
+}
+```
+
+* EPT violation VM exits due to SPPT
+
+While hardware walks the SPP page table, if the sub-page region write
+permission bit is set, the write is allowed; otherwise the write is
+disallowed and results in an EPT violation.
+
+We need to detect this case in the EPT violation handler, trigger a
+user-space exit, and return the write-protected address (GPA) to user
+space (QEMU).
From patchwork Fri Nov 30 08:08:06 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Zhang, Yi"
X-Patchwork-Id: 10705917
From: Zhang Yi
To: pbonzini@redhat.com, mdontu@bitdefender.com, ncitu@bitdefender.com
Cc: rkrcmar@redhat.com, linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Zhang Yi
Subject: [RFC PATCH V2 02/11] x86/cpufeature: Add Intel Sub-Page Protection to CPU features
Date: Fri, 30 Nov 2018 16:08:06 +0800
Message-Id:
<0b6a870f960c656bc73bf0e0b9c678ee7f52998e.1543481993.git.yi.z.zhang@linux.intel.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: References: Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Adds reporting SPP capability from VMX Procbased MSR, according to the definition of hardware spec, bit 32 is the control of the SPP capability. Defined X86_FEATURE_SPP under intel X86 VT-x CPU features. Defined the X86_VMX_FEATURE_PROC_CTLS2_SPP in intel VMX MSR indicated features, And enable SPP capability by this MSR. Signed-off-by: Zhang Yi Signed-off-by: He Chen --- arch/x86/include/asm/cpufeatures.h | 1 + arch/x86/kernel/cpu/intel.c | 4 ++++ 2 files changed, 5 insertions(+) diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index 28c4a50..e22567e 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -228,6 +228,7 @@ #define X86_FEATURE_FLEXPRIORITY ( 8*32+ 2) /* Intel FlexPriority */ #define X86_FEATURE_EPT ( 8*32+ 3) /* Intel Extended Page Table */ #define X86_FEATURE_VPID ( 8*32+ 4) /* Intel Virtual Processor ID */ +#define X86_FEATURE_SPP ( 8*32+ 5) /* Intel EPT-based Sub-Page Write Protection */ #define X86_FEATURE_VMMCALL ( 8*32+15) /* Prefer VMMCALL to VMCALL */ #define X86_FEATURE_XENPV ( 8*32+16) /* "" Xen paravirtual guest */ diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c index fc3c07f..b55156c 100644 --- a/arch/x86/kernel/cpu/intel.c +++ b/arch/x86/kernel/cpu/intel.c @@ -476,6 +476,7 @@ static void detect_vmx_virtcap(struct cpuinfo_x86 *c) #define X86_VMX_FEATURE_PROC_CTLS2_EPT 0x00000002 #define X86_VMX_FEATURE_PROC_CTLS2_VPID 0x00000020 #define x86_VMX_FEATURE_EPT_CAP_AD 0x00200000 +#define X86_VMX_FEATURE_PROC_CTLS2_SPP 0x00800000 u32 vmx_msr_low, vmx_msr_high, msr_ctl, msr_ctl2; u32 msr_vpid_cap, msr_ept_cap; @@ -486,6 +487,7 @@ static void detect_vmx_virtcap(struct cpuinfo_x86 *c) clear_cpu_cap(c, X86_FEATURE_EPT); clear_cpu_cap(c, X86_FEATURE_VPID); clear_cpu_cap(c, X86_FEATURE_EPT_AD); + clear_cpu_cap(c, X86_FEATURE_SPP); rdmsr(MSR_IA32_VMX_PROCBASED_CTLS, vmx_msr_low, vmx_msr_high); msr_ctl = vmx_msr_high | vmx_msr_low; @@ -509,6 +511,8 @@ static void detect_vmx_virtcap(struct cpuinfo_x86 *c) } if (msr_ctl2 & X86_VMX_FEATURE_PROC_CTLS2_VPID) set_cpu_cap(c, X86_FEATURE_VPID); + if (msr_ctl2 & X86_VMX_FEATURE_PROC_CTLS2_SPP) + set_cpu_cap(c, X86_FEATURE_SPP); } } From patchwork Fri Nov 30 08:08:15 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Zhang, Yi" X-Patchwork-Id: 10705919 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id DF7111057 for ; Fri, 30 Nov 2018 08:08:39 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D2B582EDCE for ; Fri, 30 Nov 2018 08:08:39 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id C66532EE36; Fri, 30 Nov 2018 08:08:39 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with 
ESMTP id B6E372EDCE for ; Fri, 30 Nov 2018 08:08:38 +0000 (UTC)
From: Zhang Yi
To: pbonzini@redhat.com, mdontu@bitdefender.com, ncitu@bitdefender.com
Cc: rkrcmar@redhat.com, linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Zhang Yi
Subject: [RFC PATCH V2 03/11] KVM: VMX: Add VMX SPP feature flags and VM-Execution Controls.
Date: Fri, 30 Nov 2018 16:08:15 +0800
Message-Id: <43d18efb9e731f45de01a3a40d0d325c44e5a201.1543481993.git.yi.z.zhang@linux.intel.com>
X-Mailer: git-send-email 2.7.4
Sender: kvm-owner@vger.kernel.org
List-ID: X-Mailing-List: kvm@vger.kernel.org

Add a new secondary processor-based VM-execution control bit defined as
"sub-page write permission".  As in the VMX Procbased capability MSR,
bit 23 is the enable bit for SPP.

Also introduce an enable_ept_spp module parameter to turn SPP on or off;
the default is off while the enabling work is still in progress.

SPP is now active when the "Sub-page Write Protection" bit in the
Secondary VM-Execution Controls is set and the module parameter is
enabled with "spp=on".
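The control bit referred to above is bit 23 of the secondary processor-based VM-execution controls, and its allowed-1 setting is reported in the high 32 bits of the IA32_VMX_PROCBASED_CTLS2 capability MSR (0x48B). A minimal, self-contained sketch of that check, simplified from the real setup_vmcs_config() flow and given only as an illustration:

```c
#include <stdint.h>
#include <stdbool.h>

#define SECONDARY_EXEC_ENABLE_SPP 0x00800000u   /* control bit 23 */

/* Given the raw IA32_VMX_PROCBASED_CTLS2 value, report whether the CPU
 * allows the SPP control to be set to 1.  The allowed-1 settings live in
 * the high 32 bits of the capability MSR. */
static bool vmx_allows_spp(uint64_t procbased_ctls2)
{
	uint32_t allowed_1 = (uint32_t)(procbased_ctls2 >> 32);

	return (allowed_1 & SECONDARY_EXEC_ENABLE_SPP) != 0;
}
```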
Signed-off-by: Zhang Yi Signed-off-by: He Chen --- arch/x86/include/asm/vmx.h | 1 + arch/x86/kvm/vmx.c | 15 +++++++++++++++ 2 files changed, 16 insertions(+) diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h index ade0f15..2aa088f 100644 --- a/arch/x86/include/asm/vmx.h +++ b/arch/x86/include/asm/vmx.h @@ -78,6 +78,7 @@ #define SECONDARY_EXEC_RDSEED_EXITING 0x00010000 #define SECONDARY_EXEC_ENABLE_PML 0x00020000 #define SECONDARY_EXEC_XSAVES 0x00100000 +#define SECONDARY_EXEC_ENABLE_SPP 0x00800000 #define SECONDARY_EXEC_TSC_SCALING 0x02000000 #define PIN_BASED_EXT_INTR_MASK 0x00000001 diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 4555077..f76d3fb 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -92,6 +92,9 @@ module_param_named(unrestricted_guest, static bool __read_mostly enable_ept_ad_bits = 1; module_param_named(eptad, enable_ept_ad_bits, bool, S_IRUGO); +static bool __read_mostly enable_ept_spp; +module_param_named(spp, enable_ept_spp, bool, S_IRUGO); + static bool __read_mostly emulate_invalid_guest_state = true; module_param(emulate_invalid_guest_state, bool, S_IRUGO); @@ -1941,6 +1944,11 @@ static inline bool cpu_has_vmx_pml(void) return vmcs_config.cpu_based_2nd_exec_ctrl & SECONDARY_EXEC_ENABLE_PML; } +static inline bool cpu_has_vmx_ept_spp(void) +{ + return vmcs_config.cpu_based_2nd_exec_ctrl & SECONDARY_EXEC_ENABLE_SPP; +} + static inline bool cpu_has_vmx_tsc_scaling(void) { return vmcs_config.cpu_based_2nd_exec_ctrl & @@ -4583,6 +4591,7 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf) SECONDARY_EXEC_RDSEED_EXITING | SECONDARY_EXEC_RDRAND_EXITING | SECONDARY_EXEC_ENABLE_PML | + SECONDARY_EXEC_ENABLE_SPP | SECONDARY_EXEC_TSC_SCALING | SECONDARY_EXEC_ENABLE_VMFUNC | SECONDARY_EXEC_ENCLS_EXITING; @@ -6486,6 +6495,9 @@ static void vmx_compute_secondary_exec_control(struct vcpu_vmx *vmx) if (!enable_pml) exec_control &= ~SECONDARY_EXEC_ENABLE_PML; + if (!enable_ept_spp) + exec_control &= ~SECONDARY_EXEC_ENABLE_SPP; + if (vmx_xsaves_supported()) { /* Exposing XSAVES only when XSAVE is exposed */ bool xsaves_enabled = @@ -7927,6 +7939,9 @@ static __init int hardware_setup(void) if (!cpu_has_vmx_unrestricted_guest() || !enable_ept) enable_unrestricted_guest = 0; + if (!cpu_has_vmx_ept_spp() || !enable_ept) + enable_ept_spp = 0; + if (!cpu_has_vmx_flexpriority()) flexpriority_enabled = 0; From patchwork Fri Nov 30 08:08:23 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Zhang, Yi" X-Patchwork-Id: 10705921 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5BF251057 for ; Fri, 30 Nov 2018 08:08:46 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 4EA552EFF3 for ; Fri, 30 Nov 2018 08:08:46 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 4044F2EFFC; Fri, 30 Nov 2018 08:08:46 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B3E8D2EFF3 for ; Fri, 30 Nov 2018 08:08:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) 
by vger.kernel.org via listexpand id S1727194AbeK3TRI (ORCPT ); Fri, 30 Nov 2018 14:17:08 -0500
From: Zhang Yi
To: pbonzini@redhat.com, mdontu@bitdefender.com, ncitu@bitdefender.com
Cc: rkrcmar@redhat.com, linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Zhang Yi
Subject: [RFC PATCH V2 04/11] KVM: VMX: Introduce the SPPTP and SPP page table.
Date: Fri, 30 Nov 2018 16:08:23 +0800
X-Mailer: git-send-email 2.7.4
Sender: kvm-owner@vger.kernel.org
List-ID: X-Mailing-List: kvm@vger.kernel.org

The SPPT has a 4-level paging structure that is similar to EPT, except
for the L1E format.

The sub-page permission table is referenced via a 64-bit control field
called the Sub-Page Permission Table Pointer (SPPTP), which contains a
4KB-aligned physical address.  The index and encoding for this VMCS
field is defined as 0x2030 at this time.

The format of the SPPTP is shown below:

| Bit    | Contents                                            |
| :----- | :-------------------------------------------------- |
| 11:0   | Reserved (0)                                        |
| N-1:12 | Physical address of the 4KB-aligned SPPT L4 table   |
| 51:N   | Reserved (0)                                        |
| 63:52  | Reserved (0)                                        |

Note: N is the physical address width supported by the processor.

This patch introduces the SPP paging structures; their root page is
created at KVM MMU page initialization and freed when the MMU pages are
freed.  As with the EPT page table, we initialize the SPPT and write the
SPPT pointer into the VMCS field.  We also add an MMU page role bit,
spp, to distinguish an SPP page from an EPT page.
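As an orientation aid, here is a hedged sketch (not code from this patch) that restates the table walk described above. It assumes the SPPT is indexed like EPT, 9 GPA bits per level above the page offset, and that the L1 entry is the 64-bit sub-page permission vector in which bit 2*i carries the write permission of the i-th 128-byte region, as defined in the design document of patch 01.

```c
#include <stdint.h>

/* SPPT index of a guest-physical address at a given level (4..1). */
static inline unsigned int sppt_index(uint64_t gpa, int level)
{
	return (unsigned int)((gpa >> (12 + 9 * (level - 1))) & 0x1FF);
}

/* In the SPPT L1 entry, bit 2*i is the write permission of the i-th
 * 128-byte sub-page region of the 4KB page (bits 2*i+1 are reserved). */
static inline int sppt_l1e_allows_write(uint64_t l1e, uint64_t gpa)
{
	unsigned int region = (unsigned int)((gpa & 0xFFFu) / 128u);  /* 0..31 */

	return (int)((l1e >> (2u * region)) & 1u);
}
```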
Signed-off-by: Zhang Yi Signed-off-by: He Chen --- arch/x86/include/asm/kvm_host.h | 4 +++- arch/x86/kvm/mmu.c | 33 ++++++++++++++++++++++++++++++++- 2 files changed, 35 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 55e51ff..46312b9 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -270,7 +270,8 @@ union kvm_mmu_page_role { unsigned smap_andnot_wp:1; unsigned ad_disabled:1; unsigned guest_mode:1; - unsigned :6; + unsigned spp:1; + unsigned reserved:5; /* * This is left at the top of the word so that @@ -397,6 +398,7 @@ struct kvm_mmu { void (*update_pte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp, u64 *spte, const void *pte); hpa_t root_hpa; + hpa_t sppt_root; union kvm_mmu_role mmu_role; u8 root_level; u8 shadow_root_level; diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index cf5f572..d1f1fe1 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -2366,6 +2366,29 @@ static void clear_sp_write_flooding_count(u64 *spte) __clear_sp_write_flooding_count(sp); } +static struct kvm_mmu_page *kvm_mmu_get_spp_page(struct kvm_vcpu *vcpu, + gfn_t gfn, + unsigned int level) + +{ + struct kvm_mmu_page *sp; + union kvm_mmu_page_role role; + + role = vcpu->arch.mmu->mmu_role.base; + role.level = level; + role.direct = true; + role.spp = true; + + sp = kvm_mmu_alloc_page(vcpu, true); + sp->gfn = gfn; + sp->role = role; + hlist_add_head(&sp->hash_link, + &vcpu->kvm->arch.mmu_page_hash + [kvm_page_table_hashfn(gfn)]); + clear_page(sp->spt); + return sp; +} + static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu, gfn_t gfn, gva_t gaddr, @@ -3509,6 +3532,9 @@ void kvm_mmu_free_roots(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu, (mmu->root_level >= PT64_ROOT_4LEVEL || mmu->direct_map)) { mmu_free_root_page(vcpu->kvm, &mmu->root_hpa, &invalid_list); + mmu_free_root_page(vcpu->kvm, &mmu->sppt_root, + &invalid_list); + } else { for (i = 0; i < 4; ++i) if (mmu->pae_root[i] != 0) @@ -3538,7 +3564,7 @@ static int mmu_check_root(struct kvm_vcpu *vcpu, gfn_t root_gfn) static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu) { - struct kvm_mmu_page *sp; + struct kvm_mmu_page *sp, *spp_sp; unsigned i; if (vcpu->arch.mmu->shadow_root_level >= PT64_ROOT_4LEVEL) { @@ -3549,9 +3575,13 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu) } sp = kvm_mmu_get_page(vcpu, 0, 0, vcpu->arch.mmu->shadow_root_level, 1, ACC_ALL); + spp_sp = kvm_mmu_get_spp_page(vcpu, 0, + vcpu->arch.mmu->shadow_root_level); ++sp->root_count; + ++spp_sp->root_count; spin_unlock(&vcpu->kvm->mmu_lock); vcpu->arch.mmu->root_hpa = __pa(sp->spt); + vcpu->arch.mmu->sppt_root = __pa(spp_sp->spt); } else if (vcpu->arch.mmu->shadow_root_level == PT32E_ROOT_LEVEL) { for (i = 0; i < 4; ++i) { hpa_t root = vcpu->arch.mmu->pae_root[i]; @@ -4986,6 +5016,7 @@ void kvm_init_mmu(struct kvm_vcpu *vcpu, bool reset_roots) uint i; vcpu->arch.mmu->root_hpa = INVALID_PAGE; + vcpu->arch.mmu->sppt_root = INVALID_PAGE; for (i = 0; i < KVM_MMU_NUM_PREV_ROOTS; i++) vcpu->arch.mmu->prev_roots[i] = KVM_MMU_ROOT_INFO_INVALID; From patchwork Fri Nov 30 08:08:32 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Zhang, Yi" X-Patchwork-Id: 10705923 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id DA98714BD for ; Fri, 30 Nov 2018 08:08:56 +0000 (UTC) Received: 
from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id CBCED2EFF3 for ; Fri, 30 Nov 2018 08:08:56 +0000 (UTC)
From: Zhang Yi
To: pbonzini@redhat.com, mdontu@bitdefender.com, ncitu@bitdefender.com
Cc: rkrcmar@redhat.com, linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Zhang Yi
Subject: [RFC PATCH V2 05/11] KVM: VMX: Write the SPPTP to the VMCS area.
Date: Fri, 30 Nov 2018 16:08:32 +0800
Message-Id: <11d0e5ff60c3e0cdac90872217299735e3ca8234.1543481993.git.yi.z.zhang@linux.intel.com>
X-Mailer: git-send-email 2.7.4
Sender: kvm-owner@vger.kernel.org
List-ID: X-Mailing-List: kvm@vger.kernel.org

As with the EPT page table, we initialize the SPPT and write the SPPT
pointer (SPPTP) into the corresponding VMCS field.
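For reference, a minimal sketch of forming the SPPTP value that this patch writes into the VMCS. The real patch uses a simple construct_spptp() helper that only masks with PAGE_MASK; the maxphyaddr handling below is an illustrative assumption based on the reserved-bit layout given in patch 04.

```c
#include <stdint.h>

/* The SPPTP VMCS field (encoding 0x2030) carries the 4KB-aligned physical
 * address of the SPPT L4 root page; bits 11:0 and bits at or above the
 * CPU's physical-address width N are reserved and must be zero
 * (N is at most 52 on current hardware). */
static uint64_t make_spptp(uint64_t sppt_root_hpa, unsigned int maxphyaddr)
{
	uint64_t mask = ((1ULL << maxphyaddr) - 1) & ~0xFFFULL;

	return sppt_root_hpa & mask;
}
```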
Signed-off-by: Zhang Yi --- arch/x86/include/asm/vmx.h | 2 ++ arch/x86/kvm/vmx.c | 17 +++++++++++++++++ 2 files changed, 19 insertions(+) diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h index 2aa088f..bd4ec8a 100644 --- a/arch/x86/include/asm/vmx.h +++ b/arch/x86/include/asm/vmx.h @@ -217,6 +217,8 @@ enum vmcs_field { XSS_EXIT_BITMAP_HIGH = 0x0000202D, ENCLS_EXITING_BITMAP = 0x0000202E, ENCLS_EXITING_BITMAP_HIGH = 0x0000202F, + SPPT_POINTER = 0x00002030, + SPPT_POINTER_HIGH = 0x00002031, TSC_MULTIPLIER = 0x00002032, TSC_MULTIPLIER_HIGH = 0x00002033, GUEST_PHYSICAL_ADDRESS = 0x00002400, diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index f76d3fb..e96b4c7 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -600,6 +600,7 @@ struct __packed vmcs12 { u16 host_gs_selector; u16 host_tr_selector; u16 guest_pml_index; + u64 sppt_pointer; }; /* @@ -1158,6 +1159,7 @@ static const unsigned short vmcs_field_to_offset_table[] = { FIELD64(VMREAD_BITMAP, vmread_bitmap), FIELD64(VMWRITE_BITMAP, vmwrite_bitmap), FIELD64(XSS_EXIT_BITMAP, xss_exit_bitmap), + FIELD64(SPPT_POINTER, sppt_pointer), FIELD64(GUEST_PHYSICAL_ADDRESS, guest_physical_address), FIELD64(VMCS_LINK_POINTER, vmcs_link_pointer), FIELD64(GUEST_IA32_DEBUGCTL, guest_ia32_debugctl), @@ -5348,11 +5350,17 @@ static u64 construct_eptp(struct kvm_vcpu *vcpu, unsigned long root_hpa) return eptp; } +static inline u64 construct_spptp(unsigned long root_hpa) +{ + return root_hpa & PAGE_MASK; +} + static void vmx_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3) { struct kvm *kvm = vcpu->kvm; unsigned long guest_cr3; u64 eptp; + u64 spptp; guest_cr3 = cr3; if (enable_ept) { @@ -5375,6 +5383,13 @@ static void vmx_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3) ept_load_pdptrs(vcpu); } + if (vcpu->arch.mmu->sppt_root != INVALID_PAGE && + enable_ept_spp) { + spptp = construct_spptp(vcpu->arch.mmu->sppt_root); + vmcs_write64(SPPT_POINTER, spptp); + vmx_flush_tlb(vcpu, true); + } + vmcs_writel(GUEST_CR3, guest_cr3); } @@ -10505,6 +10520,8 @@ static void dump_vmcs(void) pr_err("PostedIntrVec = 0x%02x\n", vmcs_read16(POSTED_INTR_NV)); if ((secondary_exec_control & SECONDARY_EXEC_ENABLE_EPT)) pr_err("EPT pointer = 0x%016llx\n", vmcs_read64(EPT_POINTER)); + if ((secondary_exec_control & SECONDARY_EXEC_ENABLE_SPP)) + pr_err("SPPT pointer = 0x%016llx\n", vmcs_read64(SPPT_POINTER)); n = vmcs_read32(CR3_TARGET_COUNT); for (i = 0; i + 1 < n; i += 4) pr_err("CR3 target%u=%016lx target%u=%016lx\n", From patchwork Fri Nov 30 08:08:40 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Zhang, Yi" X-Patchwork-Id: 10705925 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6B54214BD for ; Fri, 30 Nov 2018 08:09:01 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5CC852EFF3 for ; Fri, 30 Nov 2018 08:09:01 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 512502EFFC; Fri, 30 Nov 2018 08:09:01 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) 
with ESMTP id C36722EFF3 for ; Fri, 30 Nov 2018 08:09:00 +0000 (UTC)
From: Zhang Yi
To: pbonzini@redhat.com, mdontu@bitdefender.com, ncitu@bitdefender.com
Cc: rkrcmar@redhat.com, linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Zhang Yi
Subject: [RFC PATCH V2 06/11] KVM: VMX: Introduce SPP-induced VM exits and their handling.
Date: Fri, 30 Nov 2018 16:08:40 +0800
X-Mailer: git-send-email 2.7.4
Sender: kvm-owner@vger.kernel.org
List-ID: X-Mailing-List: kvm@vger.kernel.org

Accesses using guest-physical addresses may cause SPP-induced VM exits
due to an SPPT misconfiguration or an SPPT miss.  The basic VM exit
reason code reported for SPP-induced VM exits is 66.

An SPPT misconfiguration VM exit occurs when, in the course of
translating a guest-physical address, the logical processor encounters a
leaf EPT paging-structure entry mapping a 4KB page for which the
sub-page write permission control bit is set, and during the SPPT lookup
an SPPT paging-structure entry contains an unsupported value.

An SPPT miss VM exit occurs when, in the course of translating a
guest-physical address, the logical processor encounters a leaf EPT
paging-structure entry for which the sub-page write permission control
bit is set, and during the SPPT lookup there is no SPPT
misconfiguration but an SPPT paging-structure entry at some level is not
present.

SPPT misconfigurations and SPPT misses can occur only due to an attempt
to write memory with a guest-physical address.
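A small, self-contained sketch of classifying these exits from the exit qualification, mirroring the SPPT_INDUCED_EXIT_TYPE bit that this patch defines (bit 11 set means SPPT miss, clear means SPPT misconfiguration); this is an illustration, not part of the patch.

```c
#include <stdint.h>
#include <stdbool.h>

#define SPPT_INDUCED_EXIT_TYPE_BIT 11

/* SPP-induced exit (basic exit reason 66): bit 11 of the exit
 * qualification distinguishes an SPPT miss (set) from an SPPT
 * misconfiguration (clear). */
static bool spp_exit_is_miss(uint64_t exit_qualification)
{
	return (exit_qualification >> SPPT_INDUCED_EXIT_TYPE_BIT) & 1;
}
```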
Signed-off-by: Zhang Yi Signed-off-by: He Chen --- arch/x86/include/asm/vmx.h | 7 +++++++ arch/x86/include/uapi/asm/vmx.h | 2 ++ arch/x86/kvm/vmx.c | 45 +++++++++++++++++++++++++++++++++++++++++ 3 files changed, 54 insertions(+) diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h index bd4ec8a..ee24eb2 100644 --- a/arch/x86/include/asm/vmx.h +++ b/arch/x86/include/asm/vmx.h @@ -539,6 +539,13 @@ struct vmx_msr_entry { #define EPT_VIOLATION_GVA_TRANSLATED (1 << EPT_VIOLATION_GVA_TRANSLATED_BIT) /* + * Exit Qualifications for SPPT-Induced VM Exits + */ +#define SPPT_INDUCED_EXIT_TYPE_BIT 11 +#define SPPT_INDUCED_EXIT_TYPE (1 << SPPT_INDUCED_EXIT_TYPE_BIT) +#define SPPT_INTR_INFO_UNBLOCK_NMI INTR_INFO_UNBLOCK_NMI + +/* * VM-instruction error numbers */ enum vm_instruction_error_number { diff --git a/arch/x86/include/uapi/asm/vmx.h b/arch/x86/include/uapi/asm/vmx.h index f0b0c90..ac67622 100644 --- a/arch/x86/include/uapi/asm/vmx.h +++ b/arch/x86/include/uapi/asm/vmx.h @@ -85,6 +85,7 @@ #define EXIT_REASON_PML_FULL 62 #define EXIT_REASON_XSAVES 63 #define EXIT_REASON_XRSTORS 64 +#define EXIT_REASON_SPP 66 #define VMX_EXIT_REASONS \ { EXIT_REASON_EXCEPTION_NMI, "EXCEPTION_NMI" }, \ @@ -141,6 +142,7 @@ { EXIT_REASON_ENCLS, "ENCLS" }, \ { EXIT_REASON_RDSEED, "RDSEED" }, \ { EXIT_REASON_PML_FULL, "PML_FULL" }, \ + { EXIT_REASON_SPP, "SPP" }, \ { EXIT_REASON_XSAVES, "XSAVES" }, \ { EXIT_REASON_XRSTORS, "XRSTORS" } diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index e96b4c7..6634098 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -9698,6 +9698,50 @@ static int handle_invpcid(struct kvm_vcpu *vcpu) } } +static int handle_spp(struct kvm_vcpu *vcpu) +{ + unsigned long exit_qualification; + + exit_qualification = vmcs_readl(EXIT_QUALIFICATION); + + /* + * SPP VM exit happened while executing iret from NMI, + * "blocked by NMI" bit has to be set before next VM entry. + * There are errata that may cause this bit to not be set: + * AAK134, BY25. + */ + if (!(to_vmx(vcpu)->idt_vectoring_info & VECTORING_INFO_VALID_MASK) && + (exit_qualification & SPPT_INTR_INFO_UNBLOCK_NMI)) + vmcs_set_bits(GUEST_INTERRUPTIBILITY_INFO, + GUEST_INTR_STATE_NMI); + + pr_debug("SPP: SPP exit_qualification=%lx\n", exit_qualification); + + vcpu->arch.exit_qualification = exit_qualification; + + if (exit_qualification & SPPT_INDUCED_EXIT_TYPE) { + /* + * SPPT Miss + * We don't set SPP write access for the corresponding + * GPA, if we haven't setup, we need to construct + * SPP table here. 
+ */ + pr_debug("SPP: %s: SPPT Miss!!!\n", __func__); + return 1; + } + + /* + * SPPT Misconfig + * This is probably possible that your sppt table + * set as a incorrect format + */ + WARN_ON(1); + vcpu->run->exit_reason = KVM_EXIT_UNKNOWN; + vcpu->run->hw.hardware_exit_reason = EXIT_REASON_SPP; + pr_alert("SPP: %s: SPPT Misconfiguration!!!\n", __func__); + return 0; +} + static int handle_pml_full(struct kvm_vcpu *vcpu) { unsigned long exit_qualification; @@ -9910,6 +9954,7 @@ static int (*const kvm_vmx_exit_handlers[])(struct kvm_vcpu *vcpu) = { [EXIT_REASON_INVVPID] = handle_invvpid, [EXIT_REASON_RDRAND] = handle_invalid_op, [EXIT_REASON_RDSEED] = handle_invalid_op, + [EXIT_REASON_SPP] = handle_spp, [EXIT_REASON_XSAVES] = handle_xsaves, [EXIT_REASON_XRSTORS] = handle_xrstors, [EXIT_REASON_PML_FULL] = handle_pml_full, From patchwork Fri Nov 30 08:08:48 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Zhang, Yi" X-Patchwork-Id: 10705927 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 442711057 for ; Fri, 30 Nov 2018 08:09:13 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 3630E2F88A for ; Fri, 30 Nov 2018 08:09:13 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 28C7D2F893; Fri, 30 Nov 2018 08:09:13 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 156B22F88A for ; Fri, 30 Nov 2018 08:09:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727115AbeK3TRe (ORCPT ); Fri, 30 Nov 2018 14:17:34 -0500 Received: from mga17.intel.com ([192.55.52.151]:5240 "EHLO mga17.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726551AbeK3TRe (ORCPT ); Fri, 30 Nov 2018 14:17:34 -0500 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga006.jf.intel.com ([10.7.209.51]) by fmsmga107.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 30 Nov 2018 00:09:06 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.56,297,1539673200"; d="scan'208";a="96176535" Received: from linux.intel.com ([10.54.29.200]) by orsmga006.jf.intel.com with ESMTP; 30 Nov 2018 00:09:06 -0800 Received: from dazhang1-ssd.sh.intel.com (unknown [10.239.48.128]) by linux.intel.com (Postfix) with ESMTP id 0CA42580460; Fri, 30 Nov 2018 00:09:03 -0800 (PST) From: Zhang Yi To: pbonzini@redhat.com, mdontu@bitdefender.com, ncitu@bitdefender.com Cc: rkrcmar@redhat.com, linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Zhang Yi Subject: [RFC PATCH V2 07/11] KVM: VMX: Added handle of SPP write protection fault. Date: Fri, 30 Nov 2018 16:08:48 +0800 Message-Id: X-Mailer: git-send-email 2.7.4 In-Reply-To: References: MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP A control bit in EPT leaf paging-structure entries is defined as “Sub-Page Permission” (SPP bit). 
The bit position is 61 While hardware walking the SPP page table, If the sub-page region write permission bit is set, the write is allowed, else the write is disallowed and results in an EPT violation. we need peek this case in EPT violation handler, and trigger a user-space exit, return the write protected address(GPA) to user(qemu). Signed-off-by: Zhang Yi Signed-off-by: He Chen --- arch/x86/kvm/mmu.c | 19 +++++++++++++++++++ arch/x86/kvm/mmu.h | 1 + include/uapi/linux/kvm.h | 5 +++++ 3 files changed, 25 insertions(+) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index d1f1fe1..d077693 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -3378,6 +3378,21 @@ static bool fast_page_fault(struct kvm_vcpu *vcpu, gva_t gva, int level, if ((error_code & PFERR_WRITE_MASK) && spte_can_locklessly_be_made_writable(spte)) { + /* + * Record write protect fault caused by + * Sub-page Protection + */ + if (spte & PT_SPP_MASK) { + fault_handled = true; + + vcpu->run->exit_reason = KVM_EXIT_SPP; + vcpu->run->spp.addr = gva; + kvm_skip_emulated_instruction(vcpu); + + /* Let QEMU decide how to handle this. */ + break; + } + new_spte |= PT_WRITABLE_MASK; /* @@ -5343,6 +5358,10 @@ int kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gva_t cr2, u64 error_code, r = vcpu->arch.mmu->page_fault(vcpu, cr2, lower_32_bits(error_code), false); + + if (vcpu->run->exit_reason == KVM_EXIT_SPP) + return 0; + WARN_ON(r == RET_PF_INVALID); } diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h index c7b3331..b41e9e9 100644 --- a/arch/x86/kvm/mmu.h +++ b/arch/x86/kvm/mmu.h @@ -26,6 +26,7 @@ #define PT_PAGE_SIZE_MASK (1ULL << PT_PAGE_SIZE_SHIFT) #define PT_PAT_MASK (1ULL << 7) #define PT_GLOBAL_MASK (1ULL << 8) +#define PT_SPP_MASK (1ULL << 61) #define PT64_NX_SHIFT 63 #define PT64_NX_MASK (1ULL << PT64_NX_SHIFT) diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index 2b7a652..01174f8 100644 --- a/include/uapi/linux/kvm.h +++ b/include/uapi/linux/kvm.h @@ -235,6 +235,7 @@ struct kvm_hyperv_exit { #define KVM_EXIT_S390_STSI 25 #define KVM_EXIT_IOAPIC_EOI 26 #define KVM_EXIT_HYPERV 27 +#define KVM_EXIT_SPP 28 /* For KVM_EXIT_INTERNAL_ERROR */ /* Emulate instruction failed. */ @@ -390,6 +391,10 @@ struct kvm_run { struct { __u8 vector; } eoi; + /* KVM_EXIT_SPP */ + struct { + __u64 addr; + } spp; /* KVM_EXIT_HYPERV */ struct kvm_hyperv_exit hyperv; /* Fix the size of the union. 
*/
From patchwork Fri Nov 30 08:08:58 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Zhang, Yi"
X-Patchwork-Id: 10705933
From: Zhang Yi
To: pbonzini@redhat.com, mdontu@bitdefender.com, ncitu@bitdefender.com
Cc: rkrcmar@redhat.com, linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Zhang Yi
Subject: [RFC PATCH V2 08/11] KVM: VMX: Introduce ioctls to set/get Sub-Page Write Protection.
Date: Fri, 30 Nov 2018 16:08:58 +0800
X-Mailer: git-send-email 2.7.4
Sender: kvm-owner@vger.kernel.org
List-ID: X-Mailing-List: kvm@vger.kernel.org

We introduce two ioctls that let a user application set/get the sub-page
write-protection bitmap per GFN; each GFN corresponds to one bitmap.
The user application, such as QEMU or some other security control
daemon, sets the protection bitmap via these ioctls.

The ioctl interface uses
the API defined as: struct kvm_subpage { __u64 base_gfn; __u64 npages; /* sub-page write-access bitmap array */ __u32 access_map[SUBPAGE_MAX_BITMAP]; }sp; kvm_vm_ioctl(s, KVM_SUBPAGES_SET_ACCESS, &sp) kvm_vm_ioctl(s, KVM_SUBPAGES_GET_ACCESS, &sp) Signed-off-by: Zhang Yi Signed-off-by: He Chen --- arch/x86/include/asm/kvm_host.h | 9 +++ arch/x86/kvm/mmu.c | 49 ++++++++++++++++ arch/x86/kvm/vmx.c | 20 +++++++ arch/x86/kvm/x86.c | 124 +++++++++++++++++++++++++++++++++++++++- include/linux/kvm_host.h | 5 ++ include/uapi/linux/kvm.h | 11 ++++ 6 files changed, 217 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 46312b9..3218d91 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -397,6 +397,8 @@ struct kvm_mmu { void (*invlpg)(struct kvm_vcpu *vcpu, gva_t gva, hpa_t root_hpa); void (*update_pte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp, u64 *spte, const void *pte); + int (*get_subpages)(struct kvm *kvm, struct kvm_subpage *spp_info); + int (*set_subpages)(struct kvm *kvm, struct kvm_subpage *spp_info); hpa_t root_hpa; hpa_t sppt_root; union kvm_mmu_role mmu_role; @@ -784,6 +786,7 @@ struct kvm_lpage_info { struct kvm_arch_memory_slot { struct kvm_rmap_head *rmap[KVM_NR_PAGE_SIZES]; + u32 *subpage_wp_info; struct kvm_lpage_info *lpage_info[KVM_NR_PAGE_SIZES - 1]; unsigned short *gfn_track[KVM_PAGE_TRACK_MAX]; }; @@ -1187,6 +1190,9 @@ struct kvm_x86_ops { int (*nested_enable_evmcs)(struct kvm_vcpu *vcpu, uint16_t *vmcs_version); + + int (*get_subpages)(struct kvm *kvm, struct kvm_subpage *spp_info); + int (*set_subpages)(struct kvm *kvm, struct kvm_subpage *spp_info); }; struct kvm_arch_async_pf { @@ -1400,6 +1406,9 @@ void kvm_mmu_invlpg(struct kvm_vcpu *vcpu, gva_t gva); void kvm_mmu_invpcid_gva(struct kvm_vcpu *vcpu, gva_t gva, unsigned long pcid); void kvm_mmu_new_cr3(struct kvm_vcpu *vcpu, gpa_t new_cr3, bool skip_tlb_flush); +int kvm_mmu_get_subpages(struct kvm *kvm, struct kvm_subpage *spp_info); +int kvm_mmu_set_subpages(struct kvm *kvm, struct kvm_subpage *spp_info); + void kvm_enable_tdp(void); void kvm_disable_tdp(void); diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index d077693..b1773c6 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -1430,6 +1430,15 @@ static u64 *rmap_get_next(struct rmap_iterator *iter) return sptep; } +static u32 *gfn_to_subpage_wp_info(struct kvm_memory_slot *slot, + gfn_t gfn) +{ + unsigned long idx; + + idx = gfn_to_index(gfn, slot->base_gfn, PT_PAGE_TABLE_LEVEL); + return &slot->arch.subpage_wp_info[idx]; +} + #define for_each_rmap_spte(_rmap_head_, _iter_, _spte_) \ for (_spte_ = rmap_get_first(_rmap_head_, _iter_); \ _spte_; _spte_ = rmap_get_next(_iter_)) @@ -4141,6 +4150,44 @@ static int tdp_page_fault(struct kvm_vcpu *vcpu, gva_t gpa, u32 error_code, return RET_PF_RETRY; } +int kvm_mmu_get_subpages(struct kvm *kvm, struct kvm_subpage *spp_info) +{ + u32 *access = spp_info->access_map; + gfn_t gfn = spp_info->base_gfn; + int npages = spp_info->npages; + struct kvm_memory_slot *slot; + int i; + + for (i = 0; i < npages; i++, gfn++) { + slot = gfn_to_memslot(kvm, gfn); + if (!slot) + return -EFAULT; + access[i] = *gfn_to_subpage_wp_info(slot, gfn); + } + + return i; +} + +int kvm_mmu_set_subpages(struct kvm *kvm, struct kvm_subpage *spp_info) +{ + u32 access = spp_info->access_map[0]; + gfn_t gfn = spp_info->base_gfn; + int npages = spp_info->npages; + struct kvm_memory_slot *slot; + u32 *wp_map; + int i; + + for (i = 0; i < npages; 
i++, gfn++) { + slot = gfn_to_memslot(kvm, gfn); + if (!slot) + return -EFAULT; + wp_map = gfn_to_subpage_wp_info(slot, gfn); + *wp_map = access; + } + + return i; +} + static void nonpaging_init_context(struct kvm_vcpu *vcpu, struct kvm_mmu *context) { @@ -4835,6 +4882,8 @@ static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu) context->get_cr3 = get_cr3; context->get_pdptr = kvm_pdptr_read; context->inject_page_fault = kvm_inject_page_fault; + context->get_subpages = kvm_x86_ops->get_subpages; + context->set_subpages = kvm_x86_ops->set_subpages; if (!is_paging(vcpu)) { context->nx = false; diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 6634098..b660812 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -8028,6 +8028,11 @@ static __init int hardware_setup(void) kvm_x86_ops->enable_log_dirty_pt_masked = NULL; } + if (!enable_ept_spp) { + kvm_x86_ops->get_subpages = NULL; + kvm_x86_ops->set_subpages = NULL; + } + if (!cpu_has_vmx_preemption_timer()) kvm_x86_ops->request_immediate_exit = __kvm_request_immediate_exit; @@ -15037,6 +15042,18 @@ static int vmx_set_nested_state(struct kvm_vcpu *vcpu, return 0; } +static int vmx_get_subpages(struct kvm *kvm, + struct kvm_subpage *spp_info) +{ + return kvm_get_subpages(kvm, spp_info); +} + +static int vmx_set_subpages(struct kvm *kvm, + struct kvm_subpage *spp_info) +{ + return kvm_set_subpages(kvm, spp_info); +} + static struct kvm_x86_ops vmx_x86_ops __ro_after_init = { .cpu_has_kvm_support = cpu_has_kvm_support, .disabled_by_bios = vmx_disabled_by_bios, @@ -15184,6 +15201,9 @@ static struct kvm_x86_ops vmx_x86_ops __ro_after_init = { .enable_smi_window = enable_smi_window, .nested_enable_evmcs = nested_enable_evmcs, + + .get_subpages = vmx_get_subpages, + .set_subpages = vmx_set_subpages, }; static void vmx_cleanup_l1d_flush(void) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 5cd5647..fa36858 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -4507,6 +4507,44 @@ static int kvm_vm_ioctl_enable_cap(struct kvm *kvm, return r; } +static int kvm_vm_ioctl_get_subpages(struct kvm *kvm, + struct kvm_subpage *spp_info) +{ + return kvm_arch_get_subpages(kvm, spp_info); +} + +static int kvm_vm_ioctl_set_subpages(struct kvm *kvm, + struct kvm_subpage *spp_info) +{ + return kvm_arch_set_subpages(kvm, spp_info); +} + +int kvm_get_subpages(struct kvm *kvm, + struct kvm_subpage *spp_info) +{ + int ret; + + mutex_lock(&kvm->slots_lock); + ret = kvm_mmu_get_subpages(kvm, spp_info); + mutex_unlock(&kvm->slots_lock); + + return ret; +} +EXPORT_SYMBOL_GPL(kvm_get_subpages); + +int kvm_set_subpages(struct kvm *kvm, + struct kvm_subpage *spp_info) +{ + int ret; + + mutex_lock(&kvm->slots_lock); + ret = kvm_mmu_set_subpages(kvm, spp_info); + mutex_unlock(&kvm->slots_lock); + + return ret; +} +EXPORT_SYMBOL_GPL(kvm_set_subpages); + long kvm_arch_vm_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg) { @@ -4811,6 +4849,39 @@ long kvm_arch_vm_ioctl(struct file *filp, if (copy_from_user(&hvevfd, argp, sizeof(hvevfd))) goto out; r = kvm_vm_ioctl_hv_eventfd(kvm, &hvevfd); + } + case KVM_SUBPAGES_GET_ACCESS: { + struct kvm_subpage spp_info; + + r = -EFAULT; + if (copy_from_user(&spp_info, argp, sizeof(spp_info))) + goto out; + + r = -EINVAL; + if (spp_info.npages == 0 || + spp_info.npages > SUBPAGE_MAX_BITMAP) + goto out; + + r = kvm_vm_ioctl_get_subpages(kvm, &spp_info); + if (copy_to_user(argp, &spp_info, sizeof(spp_info))) { + r = -EFAULT; + goto out; + } + break; + } + case KVM_SUBPAGES_SET_ACCESS: { + struct kvm_subpage 
spp_info; + + r = -EFAULT; + if (copy_from_user(&spp_info, argp, sizeof(spp_info))) + goto out; + + r = -EINVAL; + if (spp_info.npages == 0 || + spp_info.npages > SUBPAGE_MAX_BITMAP) + goto out; + + r = kvm_vm_ioctl_set_subpages(kvm, &spp_info); break; } default: @@ -9152,6 +9223,34 @@ void kvm_arch_destroy_vm(struct kvm *kvm) kvm_hv_destroy_vm(kvm); } +int kvm_subpage_create_memslot(struct kvm_memory_slot *slot, + unsigned long npages) +{ + int lpages; + + lpages = gfn_to_index(slot->base_gfn + npages - 1, + slot->base_gfn, 1) + 1; + + slot->arch.subpage_wp_info = + kvzalloc(lpages * sizeof(*slot->arch.subpage_wp_info), + GFP_KERNEL); + + if (!slot->arch.subpage_wp_info) + return -ENOMEM; + + return 0; +} + +void kvm_subpage_free_memslot(struct kvm_memory_slot *free, + struct kvm_memory_slot *dont) +{ + if (!dont || free->arch.subpage_wp_info != + dont->arch.subpage_wp_info) { + kvfree(free->arch.subpage_wp_info); + free->arch.subpage_wp_info = NULL; + } +} + void kvm_arch_free_memslot(struct kvm *kvm, struct kvm_memory_slot *free, struct kvm_memory_slot *dont) { @@ -9173,6 +9272,7 @@ void kvm_arch_free_memslot(struct kvm *kvm, struct kvm_memory_slot *free, } kvm_page_track_free_memslot(free, dont); + kvm_subpage_free_memslot(free, dont); } int kvm_arch_create_memslot(struct kvm *kvm, struct kvm_memory_slot *slot, @@ -9225,8 +9325,12 @@ int kvm_arch_create_memslot(struct kvm *kvm, struct kvm_memory_slot *slot, if (kvm_page_track_create_memslot(slot, npages)) goto out_free; - return 0; + if (kvm_subpage_create_memslot(slot, npages)) + goto out_free_page_track; + return 0; +out_free_page_track: + kvm_page_track_free_memslot(slot, NULL); out_free: for (i = 0; i < KVM_NR_PAGE_SIZES; ++i) { kvfree(slot->arch.rmap[i]); @@ -9713,6 +9817,24 @@ int kvm_arch_update_irqfd_routing(struct kvm *kvm, unsigned int host_irq, return kvm_x86_ops->update_pi_irte(kvm, host_irq, guest_irq, set); } +int kvm_arch_get_subpages(struct kvm *kvm, + struct kvm_subpage *spp_info) +{ + if (!kvm_x86_ops->get_subpages) + return -EINVAL; + + return kvm_x86_ops->get_subpages(kvm, spp_info); +} + +int kvm_arch_set_subpages(struct kvm *kvm, + struct kvm_subpage *spp_info) +{ + if (!kvm_x86_ops->set_subpages) + return -EINVAL; + + return kvm_x86_ops->set_subpages(kvm, spp_info); +} + bool kvm_vector_hashing_enabled(void) { return vector_hashing; diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index c926698..7f29f97 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -816,6 +816,11 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu); bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu); int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu); +int kvm_get_subpages(struct kvm *kvm, struct kvm_subpage *spp_info); +int kvm_set_subpages(struct kvm *kvm, struct kvm_subpage *spp_info); +int kvm_arch_get_subpages(struct kvm *kvm, struct kvm_subpage *spp_info); +int kvm_arch_set_subpages(struct kvm *kvm, struct kvm_subpage *spp_info); + #ifndef __KVM_HAVE_ARCH_VM_ALLOC /* * All architectures that want to use vzalloc currently also diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index 01174f8..3fd6d14 100644 --- a/include/uapi/linux/kvm.h +++ b/include/uapi/linux/kvm.h @@ -102,6 +102,15 @@ struct kvm_userspace_memory_region { __u64 userspace_addr; /* start of the userspace allocated memory */ }; +/* for KVM_SUBPAGES_GET_ACCESS and KVM_SUBPAGES_SET_ACCESS */ +#define SUBPAGE_MAX_BITMAP 128 +struct kvm_subpage { + __u64 base_gfn; + __u64 npages; + /* sub-page write-access bitmap array */ 
+	__u32 access_map[SUBPAGE_MAX_BITMAP];
+};
+
 /*
  * The bit 0 ~ bit 15 of kvm_memory_region::flags are visible for userspace,
  * other bits are reserved for kvm internal use which are defined in
@@ -1229,6 +1238,8 @@ struct kvm_vfio_spapr_tce {
 					struct kvm_userspace_memory_region)
 #define KVM_SET_TSS_ADDR          _IO(KVMIO, 0x47)
 #define KVM_SET_IDENTITY_MAP_ADDR _IOW(KVMIO, 0x48, __u64)
+#define KVM_SUBPAGES_GET_ACCESS   _IOR(KVMIO, 0x49, __u64)
+#define KVM_SUBPAGES_SET_ACCESS   _IOW(KVMIO, 0x4a, __u64)

 /* enable ucontrol for s390 */
 struct kvm_s390_ucas_mapping {

From patchwork Fri Nov 30 08:09:07 2018
X-Patchwork-Submitter: "Zhang, Yi"
X-Patchwork-Id: 10705929
From: Zhang Yi
To: pbonzini@redhat.com, mdontu@bitdefender.com, ncitu@bitdefender.com
Cc: rkrcmar@redhat.com, linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Zhang Yi
Subject: [RFC PATCH V2 09/11] KVM: VMX: Update the EPT leaf entry indicated with the SPP enable bit.
Date: Fri, 30 Nov 2018 16:09:07 +0800
Message-Id: <09004f3e9fbe0a74223bf23f1bbe980398fa854b.1543481993.git.yi.z.zhang@linux.intel.com>

If the sub-page write permission VM-execution control is set, treatment of
write accesses to guest-physical addresses depends on the state of the
accumulated write-access bit (position 1) and the sub-page permission bit
(position 61) in the EPT leaf paging-structure entry. Software updates the
sub-page permission bit of the EPT leaf entry in kvm_set_subpage.
If the EPT write-access bit set to 0 and the SPP bit set to 1 in the leaf EPT paging-structure entry that maps a 4KB page, then the hardware will look up a VMM-managed Sub-Page Permission Table (SPPT), which will also be prepared by setup kvm_set_subpage. Signed-off-by: Zhang Yi --- arch/x86/kvm/mmu.c | 100 +++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 100 insertions(+) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index b1773c6..d512125 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -1668,6 +1668,87 @@ int kvm_arch_write_log_dirty(struct kvm_vcpu *vcpu) return 0; } +static bool __rmap_open_subpage_bit(struct kvm *kvm, + struct kvm_rmap_head *rmap_head) +{ + struct rmap_iterator iter; + bool flush = false; + u64 *sptep; + u64 spte; + + for_each_rmap_spte(rmap_head, &iter, sptep) { + /* + * SPP works only when the page is unwritable + * and SPP bit is set + */ + flush |= spte_write_protect(sptep, false); + spte = *sptep | PT_SPP_MASK; + flush |= mmu_spte_update(sptep, spte); + } + + return flush; +} + +static int kvm_mmu_open_subpage_write_protect(struct kvm *kvm, + struct kvm_memory_slot *slot, + gfn_t gfn) +{ + struct kvm_rmap_head *rmap_head; + bool flush = false; + + /* + * we only support spp in a normal 4K level 1 page frame + * If it a huge page, we drop it. + */ + rmap_head = __gfn_to_rmap(gfn, PT_PAGE_TABLE_LEVEL, slot); + + if (!rmap_head->val) + return -EFAULT; + + flush |= __rmap_open_subpage_bit(kvm, rmap_head); + + if (flush) + kvm_flush_remote_tlbs(kvm); + + return 0; +} + +static bool __rmap_clear_subpage_bit(struct kvm *kvm, + struct kvm_rmap_head *rmap_head) +{ + struct rmap_iterator iter; + bool flush = false; + u64 *sptep; + u64 spte; + + for_each_rmap_spte(rmap_head, &iter, sptep) { + spte = (*sptep & ~PT_SPP_MASK) | PT_WRITABLE_MASK; + flush |= mmu_spte_update(sptep, spte); + } + + return flush; +} + +static int kvm_mmu_clear_subpage_write_protect(struct kvm *kvm, + struct kvm_memory_slot *slot, + gfn_t gfn) +{ + struct kvm_rmap_head *rmap_head; + bool flush = false; + + rmap_head = __gfn_to_rmap(gfn, PT_PAGE_TABLE_LEVEL, slot); + + if (!rmap_head->val) + return -EFAULT; + + flush |= __rmap_clear_subpage_bit(kvm, rmap_head); + + if (flush) + kvm_flush_remote_tlbs(kvm); + + return 0; +} + bool kvm_mmu_slot_gfn_write_protect(struct kvm *kvm, struct kvm_memory_slot *slot, u64 gfn) { @@ -4175,12 +4256,31 @@ int kvm_mmu_set_subpages(struct kvm *kvm, struct kvm_subpage *spp_info) int npages = spp_info->npages; struct kvm_memory_slot *slot; u32 *wp_map; + int ret; int i; for (i = 0; i < npages; i++, gfn++) { slot = gfn_to_memslot(kvm, gfn); if (!slot) return -EFAULT; + + /* + * open SPP bit in EPT leaf entry to write protect the + * sub-pages in corresponding page + */ + if (access != (u32)((1ULL << 32) - 1)) + ret = kvm_mmu_open_subpage_write_protect(kvm, + slot, gfn); + else + ret = kvm_mmu_clear_subpage_write_protect(kvm, + slot, gfn); + + if (ret) { + pr_info("SPP ,didn't get the gfn:%llx from EPT leaf level1\n" + "Current we didn't support huge page on SPP\n" + "Please try to disable the huge page\n", gfn); + return -EFAULT; + } wp_map = gfn_to_subpage_wp_info(slot, gfn); *wp_map = access; } From patchwork Fri Nov 30 08:09:20 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Zhang, Yi" X-Patchwork-Id: 10705931 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org 
(Postfix) with ESMTP id 93A9E18B8 for ; Fri, 30 Nov 2018 08:09:45 +0000 (UTC)
From: Zhang Yi
To: pbonzini@redhat.com, mdontu@bitdefender.com, ncitu@bitdefender.com
Cc: rkrcmar@redhat.com, linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Zhang Yi
Subject: [RFC PATCH V2 10/11] KVM: VMX: Added setup spp page structure.
Date: Fri, 30 Nov 2018 16:09:20 +0800
Message-Id: <582cad9754a9651595385e901430ab69bbd668b6.1543481993.git.yi.z.zhang@linux.intel.com>

The hardware uses the guest-physical address and bits 11:7 of the address
accessed to look up the SPPT and fetch a write-permission bit for the
128-byte sub-page region being accessed within the 4K guest-physical page.
If the sub-page region's write-permission bit is set, the write is allowed;
otherwise the write is disallowed and results in an EPT violation.

Guest-physical pages mapped via leaf EPT paging structures for which both
the accumulated write-access bit and the SPP bit are clear (0) generate EPT
violations on memory write accesses. Guest-physical pages mapped via EPT
paging structures for which the accumulated write-access bit is set (1)
allow writes, effectively ignoring the SPP bit in the leaf EPT paging
structure.

Software sets up the SPP page-table levels 4, 3 and 2 in the same way as
the EPT page structures, and fills each level-1 entry from the 32-bit
bitmap supplied for a single 4K page, so one 4K page is divided into 32
sub-pages of 128 bytes each.
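To make the conversion described above concrete, here is a small stand-alone
C sketch (not part of the patch) of how bits 11:7 of a guest-physical address
select one of the 32 sub-pages, and how the per-page 32-bit access bitmap
expands into the 64-bit SPPT leaf format produced by format_spp_spte() later
in this series. The helper names (spp_subpage_index, spp_subpage_writable,
spp_bitmap_to_leaf) are invented for illustration only:

```c
#include <stdint.h>
#include <stdio.h>

/* Bits 11:7 of the guest-physical address select one of 32 sub-pages. */
static unsigned int spp_subpage_index(uint64_t gpa)
{
	return (gpa >> 7) & 0x1f;
}

/* Bit i of the per-page bitmap allows writes to the i-th 128-byte sub-page. */
static int spp_subpage_writable(uint32_t bitmap, uint64_t gpa)
{
	return (bitmap >> spp_subpage_index(gpa)) & 1;
}

/*
 * Expand the 32-bit bitmap into the 64-bit SPPT leaf layout used by
 * format_spp_spte(): the write-allow bit for sub-page i sits at bit 2*i.
 */
static uint64_t spp_bitmap_to_leaf(uint32_t bitmap)
{
	uint64_t leaf = 0;
	unsigned int i;

	for (i = 0; i < 32; i++)
		if (bitmap & (1u << i))
			leaf |= 1ull << (i * 2);
	return leaf;
}

int main(void)
{
	uint32_t bitmap = 0xfffffffeu;	/* write-protect sub-page 0 only */

	/* GPA 0x1050 falls in sub-page 0 of its page, so this prints writable=0. */
	printf("gpa 0x1050 -> sub-page %u, writable=%d\n",
	       spp_subpage_index(0x1050), spp_subpage_writable(bitmap, 0x1050));
	printf("SPPT leaf: 0x%016llx\n",
	       (unsigned long long)spp_bitmap_to_leaf(bitmap));
	return 0;
}
```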
Signed-off-by: Zhang Yi --- arch/x86/include/asm/kvm_host.h | 4 ++ arch/x86/kvm/mmu.c | 123 +++++++++++++++++++++++++++++++++++++++- 2 files changed, 125 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 3218d91..ce6d258 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1402,6 +1402,10 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu); int kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gva_t gva, u64 error_code, void *insn, int insn_len); + +int kvm_mmu_setup_spp_structure(struct kvm_vcpu *vcpu, + u32 access_map, gfn_t gfn); + void kvm_mmu_invlpg(struct kvm_vcpu *vcpu, gva_t gva); void kvm_mmu_invpcid_gva(struct kvm_vcpu *vcpu, gva_t gva, unsigned long pcid); void kvm_mmu_new_cr3(struct kvm_vcpu *vcpu, gpa_t new_cr3, bool skip_tlb_flush); diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index d512125..287ee62 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -206,6 +206,11 @@ static const union kvm_mmu_page_role mmu_base_role_mask = { ({ spte = mmu_spte_get_lockless(_walker.sptep); 1; }); \ __shadow_walk_next(&(_walker), spte)) +#define for_each_shadow_spp_entry(_vcpu, _addr, _walker) \ + for (shadow_spp_walk_init(&(_walker), _vcpu, _addr); \ + shadow_walk_okay(&(_walker)); \ + shadow_walk_next(&(_walker))) + static struct kmem_cache *pte_list_desc_cache; static struct kmem_cache *mmu_page_header_cache; static struct percpu_counter kvm_total_used_mmu_pages; @@ -476,6 +481,11 @@ static int is_shadow_present_pte(u64 pte) return (pte != 0) && !is_mmio_spte(pte); } +static int is_spp_mide_page_present(u64 pte) +{ + return pte & PT_PRESENT_MASK; +} + static int is_large_pte(u64 pte) { return pte & PT_PAGE_SIZE_MASK; @@ -495,6 +505,11 @@ static bool is_executable_pte(u64 spte) return (spte & (shadow_x_mask | shadow_nx_mask)) == shadow_x_mask; } +static bool is_spp_spte(struct kvm_mmu_page *sp) +{ + return sp->role.spp; +} + static kvm_pfn_t spte_to_pfn(u64 pte) { return (pte & PT64_BASE_ADDR_MASK) >> PAGE_SHIFT; @@ -2606,6 +2621,16 @@ static void shadow_walk_init(struct kvm_shadow_walk_iterator *iterator, addr); } +static void shadow_spp_walk_init(struct kvm_shadow_walk_iterator *iterator, + struct kvm_vcpu *vcpu, u64 addr) +{ + iterator->addr = addr; + iterator->shadow_addr = vcpu->arch.mmu->sppt_root; + + /* SPP Table is a 4-level paging structure */ + iterator->level = 4; +} + static bool shadow_walk_okay(struct kvm_shadow_walk_iterator *iterator) { if (iterator->level < PT_PAGE_TABLE_LEVEL) @@ -2656,6 +2681,18 @@ static void link_shadow_page(struct kvm_vcpu *vcpu, u64 *sptep, mark_unsync(sptep); } +static void link_spp_shadow_page(struct kvm_vcpu *vcpu, u64 *sptep, + struct kvm_mmu_page *sp) +{ + u64 spte; + + spte = __pa(sp->spt) | PT_PRESENT_MASK; + + mmu_spte_set(sptep, spte); + + mmu_page_add_parent_pte(vcpu, sp, sptep); +} + static void validate_direct_spte(struct kvm_vcpu *vcpu, u64 *sptep, unsigned direct_access) { @@ -2686,7 +2723,13 @@ static bool mmu_page_zap_pte(struct kvm *kvm, struct kvm_mmu_page *sp, pte = *spte; if (is_shadow_present_pte(pte)) { - if (is_last_spte(pte, sp->role.level)) { + if (is_spp_spte(sp)) { + if (sp->role.level == PT_PAGE_TABLE_LEVEL) + //spp page do not need to release rmap. 
+ return true; + child = page_header(pte & PT64_BASE_ADDR_MASK); + drop_parent_pte(child, spte); + } else if (is_last_spte(pte, sp->role.level)) { drop_spte(kvm, spte); if (is_large_pte(pte)) --kvm->stat.lpages; @@ -4231,6 +4274,77 @@ static int tdp_page_fault(struct kvm_vcpu *vcpu, gva_t gpa, u32 error_code, return RET_PF_RETRY; } +static u64 format_spp_spte(u32 spp_wp_bitmap) +{ + u64 new_spte = 0; + int i = 0; + + /* + * One 4K page contains 32 sub-pages, in SPP table L4E, old bits + * are reserved, so we need to transfer u32 subpage write + * protect bitmap to u64 SPP L4E format. + */ + while (i < 32) { + if (spp_wp_bitmap & (1ULL << i)) + new_spte |= 1ULL << (i * 2); + + i++; + } + + return new_spte; +} + +static void mmu_spp_spte_set(u64 *sptep, u64 new_spte) +{ + __set_spte(sptep, new_spte); +} + +int kvm_mmu_setup_spp_structure(struct kvm_vcpu *vcpu, + u32 access_map, gfn_t gfn) +{ + struct kvm_shadow_walk_iterator iter; + struct kvm_mmu_page *sp; + gfn_t pseudo_gfn; + u64 old_spte, spp_spte; + struct kvm *kvm = vcpu->kvm; + + spin_lock(&kvm->mmu_lock); + + /* direct_map spp start */ + + if (!VALID_PAGE(vcpu->arch.mmu->sppt_root)) + goto out_unlock; + + for_each_shadow_spp_entry(vcpu, (u64)gfn << PAGE_SHIFT, iter) { + if (iter.level == PT_PAGE_TABLE_LEVEL) { + spp_spte = format_spp_spte(access_map); + old_spte = mmu_spte_get_lockless(iter.sptep); + if (old_spte != spp_spte) { + mmu_spp_spte_set(iter.sptep, spp_spte); + kvm_make_request(KVM_REQ_TLB_FLUSH, vcpu); + } + break; + } + + if (!is_spp_mide_page_present(*iter.sptep)) { + u64 base_addr = iter.addr; + + base_addr &= PT64_LVL_ADDR_MASK(iter.level); + pseudo_gfn = base_addr >> PAGE_SHIFT; + sp = kvm_mmu_get_spp_page(vcpu, pseudo_gfn, + iter.level - 1); + link_spp_shadow_page(vcpu, iter.sptep, sp); + } + } + + spin_unlock(&kvm->mmu_lock); + return 0; + +out_unlock: + spin_unlock(&kvm->mmu_lock); + return -EFAULT; +} + int kvm_mmu_get_subpages(struct kvm *kvm, struct kvm_subpage *spp_info) { u32 *access = spp_info->access_map; @@ -4255,9 +4369,10 @@ int kvm_mmu_set_subpages(struct kvm *kvm, struct kvm_subpage *spp_info) gfn_t gfn = spp_info->base_gfn; int npages = spp_info->npages; struct kvm_memory_slot *slot; + struct kvm_vcpu *vcpu; u32 *wp_map; int ret; - int i; + int i, j; for (i = 0; i < npages; i++, gfn++) { slot = gfn_to_memslot(kvm, gfn); @@ -4281,6 +4396,10 @@ int kvm_mmu_set_subpages(struct kvm *kvm, struct kvm_subpage *spp_info) "Please try to disable the huge page\n", gfn); return -EFAULT; } + + kvm_for_each_vcpu(j, vcpu, kvm) + kvm_mmu_setup_spp_structure(vcpu, access, gfn); + wp_map = gfn_to_subpage_wp_info(slot, gfn); *wp_map = access; } From patchwork Fri Nov 30 08:09:28 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Zhang, Yi" X-Patchwork-Id: 10705935 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 513FA13AD for ; Fri, 30 Nov 2018 08:09:52 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 459632F891 for ; Fri, 30 Nov 2018 08:09:52 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 39D892F895; Fri, 30 Nov 2018 08:09:52 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 
tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1
From: Zhang Yi
To: pbonzini@redhat.com, mdontu@bitdefender.com, ncitu@bitdefender.com
Cc: rkrcmar@redhat.com, linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Zhang Yi
Subject: [RFC PATCH V2 11/11] KVM: VMX: implement setup SPP page structure in spp miss.
Date: Fri, 30 Nov 2018 16:09:28 +0800

We should also set up the SPP page structures when we catch an SPP miss; in
some cases, such as vCPU hotplug, the SPP page table must be rebuilt from
the SPP-miss handler.
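The hunk below wires this into handle_spp(); condensed to its essentials, the
recovery path on an SPPT miss looks roughly as follows (a sketch assembled
from the helpers added earlier in this series, error handling omitted;
spp_miss_rebuild is an invented name for illustration):

```c
/*
 * Illustrative sketch only: what the SPPT-miss handling below boils down to.
 * On a miss (e.g. on a hot-plugged vCPU whose SPPT is still empty), fetch
 * the faulting GPA, recover the stored per-gfn write-protect bitmap, and
 * repopulate this vCPU's SPPT from it.
 */
static int spp_miss_rebuild(struct kvm_vcpu *vcpu)
{
	gpa_t gpa = vmcs_read64(GUEST_PHYSICAL_ADDRESS);
	gfn_t gfn = gpa >> PAGE_SHIFT;
	u32 access_map;

	kvm_mmu_get_spp_acsess_map(vcpu->kvm, &access_map, gfn);

	return kvm_mmu_setup_spp_structure(vcpu, access_map, gfn);
}
```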
Signed-off-by: Zhang Yi --- arch/x86/include/asm/kvm_host.h | 2 ++ arch/x86/kvm/mmu.c | 12 ++++++++++++ arch/x86/kvm/vmx.c | 8 ++++++++ 3 files changed, 22 insertions(+) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index ce6d258..a09ea39 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1406,6 +1406,8 @@ int kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gva_t gva, u64 error_code, int kvm_mmu_setup_spp_structure(struct kvm_vcpu *vcpu, u32 access_map, gfn_t gfn); +int kvm_mmu_get_spp_acsess_map(struct kvm *kvm, u32 *access_map, gfn_t gfn); + void kvm_mmu_invlpg(struct kvm_vcpu *vcpu, gva_t gva); void kvm_mmu_invpcid_gva(struct kvm_vcpu *vcpu, gva_t gva, unsigned long pcid); void kvm_mmu_new_cr3(struct kvm_vcpu *vcpu, gpa_t new_cr3, bool skip_tlb_flush); diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index 287ee62..01cf85e 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -4299,6 +4299,17 @@ static void mmu_spp_spte_set(u64 *sptep, u64 new_spte) __set_spte(sptep, new_spte); } +int kvm_mmu_get_spp_acsess_map(struct kvm *kvm, u32 *access_map, gfn_t gfn) +{ + struct kvm_memory_slot *slot; + + slot = gfn_to_memslot(kvm, gfn); + *access_map = *gfn_to_subpage_wp_info(slot, gfn); + + return 0; +} +EXPORT_SYMBOL_GPL(kvm_mmu_get_spp_acsess_map); + int kvm_mmu_setup_spp_structure(struct kvm_vcpu *vcpu, u32 access_map, gfn_t gfn) { @@ -4344,6 +4355,7 @@ int kvm_mmu_setup_spp_structure(struct kvm_vcpu *vcpu, spin_unlock(&kvm->mmu_lock); return -EFAULT; } +EXPORT_SYMBOL_GPL(kvm_mmu_setup_spp_structure); int kvm_mmu_get_subpages(struct kvm *kvm, struct kvm_subpage *spp_info) { diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index b660812..b0ab645 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -9706,6 +9706,9 @@ static int handle_invpcid(struct kvm_vcpu *vcpu) static int handle_spp(struct kvm_vcpu *vcpu) { unsigned long exit_qualification; + gpa_t gpa; + gfn_t gfn; + u32 map; exit_qualification = vmcs_readl(EXIT_QUALIFICATION); @@ -9732,6 +9735,11 @@ static int handle_spp(struct kvm_vcpu *vcpu) * SPP table here. */ pr_debug("SPP: %s: SPPT Miss!!!\n", __func__); + + gpa = vmcs_read64(GUEST_PHYSICAL_ADDRESS); + gfn = gpa >> PAGE_SHIFT; + kvm_mmu_get_spp_acsess_map(vcpu->kvm, &map, gfn); + kvm_mmu_setup_spp_structure(vcpu, map, gfn); return 1; }
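Putting the series together, userspace would drive SPP roughly as follows.
This is a hypothetical usage sketch against the uapi added earlier in the
series (struct kvm_subpage and KVM_SUBPAGES_SET_ACCESS); it assumes headers
with these patches applied, that the ioctl is issued on the VM file
descriptor (as the kvm_vm_ioctl handling earlier in the series suggests),
and that bit i of each 32-bit access_map entry grants write access to the
i-th 128-byte sub-page of the corresponding 4K guest page:

```c
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Hypothetical example: write-protect the first 128-byte sub-page of one
 * guest page and leave the remaining 31 sub-pages writable.  The gfn must
 * be backed by a normal 4K mapping; huge pages are not supported by SPP.
 */
static int write_protect_first_subpage(int vm_fd, __u64 gfn)
{
	struct kvm_subpage spp;

	memset(&spp, 0, sizeof(spp));
	spp.base_gfn = gfn;
	spp.npages = 1;
	/* One 32-bit bitmap per 4K page: bit i allows writes to sub-page i. */
	spp.access_map[0] = 0xfffffffe;

	return ioctl(vm_fd, KVM_SUBPAGES_SET_ACCESS, &spp);
}
```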