[1/5] KVM: vmx: fix ept reserved bits for 1-GByte page

Message ID 1408355431-115633-1-git-send-email-wanpeng.li@linux.intel.com (mailing list archive)
State New, archived

Commit Message

Wanpeng Li Aug. 18, 2014, 9:50 a.m. UTC
After a VM exit, the EPT misconfiguration handler in KVM checks
what caused the misconfiguration. One possible cause is an EPT
paging-structure entry configured with settings reserved for
future functionality. However, the handler cannot detect
reserved bits set in a PDPTE that maps a 1-GByte page: such a
PDPTE reserves bits 29:12, rather than the bits 7:3 that are
reserved in a PDPTE that references an EPT page directory. This
patch fixes that by treating bits 29:12 as reserved for 1-GByte
pages.

Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
---
 arch/x86/kvm/vmx.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)
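
The hexadecimal constants used in this patch and in the review follow directly from the bit ranges named in the commit message. As a minimal sketch, assuming a hypothetical helper `bit_range()` that is not part of vmx.c, the mapping can be checked like this:

```c
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only (not a kernel function):
 * build a mask with bits hi..lo (inclusive) set.
 * Assumes 0 <= lo <= hi <= 62 so that no shift reaches 64 bits.
 */
static uint64_t bit_range(unsigned int hi, unsigned int lo)
{
	uint64_t up_to_hi = (1ULL << (hi + 1)) - 1;  /* bits hi..0 set */
	uint64_t below_lo = (1ULL << lo) - 1;        /* bits lo-1..0 set */

	return up_to_hi & ~below_lo;
}
```

With this helper, `bit_range(29, 12)` is 0x3ffff000 (1-GByte page), `bit_range(20, 12)` is 0x1ff000 (2-MByte page), `bit_range(7, 3)` is 0xf8, and `bit_range(5, 3)` is 0x38 (the EPT memory type field).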

Comments

Paolo Bonzini Aug. 18, 2014, 10:18 a.m. UTC | #1
Il 18/08/2014 11:50, Wanpeng Li ha scritto:
> After a VM exit, the EPT misconfiguration handler in KVM checks
> what caused the misconfiguration. One possible cause is an EPT
> paging-structure entry configured with settings reserved for
> future functionality. However, the handler cannot detect
> reserved bits set in a PDPTE that maps a 1-GByte page: such a
> PDPTE reserves bits 29:12, rather than the bits 7:3 that are
> reserved in a PDPTE that references an EPT page directory. This
> patch fixes that by treating bits 29:12 as reserved for 1-GByte
> pages.
> 
> Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
> ---
>  arch/x86/kvm/vmx.c | 14 ++++++++++----
>  1 file changed, 10 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index bfe11cf..71cbee5 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -5521,9 +5521,14 @@ static u64 ept_rsvd_mask(u64 spte, int level)
>  	for (i = 51; i > boot_cpu_data.x86_phys_bits; i--)
>  		mask |= (1ULL << i);
>  
> -	if (level > 2)
> -		/* bits 7:3 reserved */
> -		mask |= 0xf8;
> +	if (level > 2) {

level can be 4 here.  You have to return 0xf8 for level == 4.

The same "if" statement then can cover both 2MB and 1GB pages, like

                if (spte & (1ULL << 7))
                        /* 1GB/2MB page, bits 29:12 or 20:12 reserved respectively */
                        mask |= (PAGE_SIZE << ((level - 1) * 9)) - PAGE_SIZE;
                else
                        /* bits 6:3 reserved */
                        mask |= 0x78;

> -		if (level == 1 || (level == 2 && (spte & (1ULL << 7)))) {
> +		if (level == 1 || ((level == 3 || level == 2)
> +				&& (spte & (1ULL << 7)))) {

This condition can be simplified by checking the return value of ept_rsvd_mask.
If it includes 0x38, this is a large page.  Otherwise it is a leaf page and
you can go down the "if".

Paolo

>  			u64 ept_mem_type = (spte & 0x38) >> 3;
>  
>  			if (ept_mem_type == 2 || ept_mem_type == 3 ||
> 
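
The two review comments above (level 4 always reserves bits 7:3, and one branch can cover both large-page sizes) can be combined as sketched below. This is only an illustration under stated assumptions: `ept_low_rsvd_bits()` is a hypothetical user-space stand-in, `PAGE_SIZE` is assumed to be 4096, and the loop over the high physical-address bits from the real `ept_rsvd_mask()` is left out.

```c
#include <stdint.h>

#define PAGE_SIZE 4096ULL	/* assumption: 4-KByte base pages */

/*
 * Hypothetical stand-in for the level-dependent part of ept_rsvd_mask().
 * level: 4 = PML4E, 3 = PDPTE, 2 = PDE, 1 = PTE.
 */
static uint64_t ept_low_rsvd_bits(uint64_t spte, int level)
{
	if (level == 4)
		/* PML4E: bits 7:3 reserved, whatever bit 7 says */
		return 0xf8;

	if (spte & (1ULL << 7))
		/*
		 * Level 3: 1GB page, bits 29:12 reserved.
		 * Level 2: 2MB page, bits 20:12 reserved.
		 * Level 1: the expression evaluates to 0.
		 */
		return (PAGE_SIZE << ((level - 1) * 9)) - PAGE_SIZE;

	if (level > 1)
		/* references a lower-level table: bits 6:3 reserved */
		return 0x78;

	return 0;	/* 4K PTE: no additional reserved low bits */
}
```

For example, `ept_low_rsvd_bits(1ULL << 7, 3)` gives 0x3ffff000 and `ept_low_rsvd_bits(1ULL << 7, 2)` gives 0x1ff000, matching the constants in the patch, while any level-4 entry gives 0xf8.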

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Xiao Guangrong Aug. 18, 2014, 10:52 a.m. UTC | #2
On 08/18/2014 05:50 PM, Wanpeng Li wrote:
> After a VM exit, the EPT misconfiguration handler in KVM checks
> what caused the misconfiguration. One possible cause is an EPT
> paging-structure entry configured with settings reserved for
> future functionality. However, the handler cannot detect
> reserved bits set in a PDPTE that maps a 1-GByte page: such a
> PDPTE reserves bits 29:12, rather than the bits 7:3 that are
> reserved in a PDPTE that references an EPT page directory. This
> patch fixes that by treating bits 29:12 as reserved for 1-GByte
> pages.

That mask is only set in the lowest PTE, for a 4K page; I think it
is not a problem, no?

Paolo Bonzini Aug. 18, 2014, 11:18 a.m. UTC | #3
Il 18/08/2014 12:52, Xiao Guangrong ha scritto:
>> > After a VM exit, the EPT misconfiguration handler in KVM checks
>> > what caused the misconfiguration. One possible cause is an EPT
>> > paging-structure entry configured with settings reserved for
>> > future functionality. However, the handler cannot detect
>> > reserved bits set in a PDPTE that maps a 1-GByte page: such a
>> > PDPTE reserves bits 29:12, rather than the bits 7:3 that are
>> > reserved in a PDPTE that references an EPT page directory. This
>> > patch fixes that by treating bits 29:12 as reserved for 1-GByte
>> > pages.
> That mask is only set in the lowest PTE, for a 4K page; I think it
> is not a problem, no?

It will cause KVM to WARN.  The EPT memory type will also be ignored for
gigabyte pages.

Paolo
Wanpeng Li Aug. 19, 2014, 2:17 a.m. UTC | #4
Hi Paolo,
On Mon, Aug 18, 2014 at 12:18:59PM +0200, Paolo Bonzini wrote:
>Il 18/08/2014 11:50, Wanpeng Li ha scritto:
>> After a VM exit, the EPT misconfiguration handler in KVM checks
>> what caused the misconfiguration. One possible cause is an EPT
>> paging-structure entry configured with settings reserved for
>> future functionality. However, the handler cannot detect
>> reserved bits set in a PDPTE that maps a 1-GByte page: such a
>> PDPTE reserves bits 29:12, rather than the bits 7:3 that are
>> reserved in a PDPTE that references an EPT page directory. This
>> patch fixes that by treating bits 29:12 as reserved for 1-GByte
>> pages.
>> 
>> Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
>> ---
>>  arch/x86/kvm/vmx.c | 14 ++++++++++----
>>  1 file changed, 10 insertions(+), 4 deletions(-)
>> 
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index bfe11cf..71cbee5 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -5521,9 +5521,14 @@ static u64 ept_rsvd_mask(u64 spte, int level)
>>  	for (i = 51; i > boot_cpu_data.x86_phys_bits; i--)
>>  		mask |= (1ULL << i);
>>  
>> -	if (level > 2)
>> -		/* bits 7:3 reserved */
>> -		mask |= 0xf8;
>> +	if (level > 2) {
>
>level can be 4 here.  You have to return 0xf8 for level == 4.
>
>The same "if" statement then can cover both 2MB and 1GB pages, like
>
>                if (spte & (1ULL << 7))
>                        /* 1GB/2MB page, bits 29:12 or 20:12 reserved respectively */
>                        mask |= (PAGE_SIZE << ((level - 1) * 9)) - PAGE_SIZE;
>                else
>                        /* bits 6:3 reserved */
>                        mask |= 0x78;
>
>> -		if (level == 1 || (level == 2 && (spte & (1ULL << 7)))) {
>> +		if (level == 1 || ((level == 3 || level == 2)
>> +				&& (spte & (1ULL << 7)))) {
>
>This condition can be simplified by checking the return value of ept_rsvd_mask.
>If it includes 0x38, this is a large page.  Otherwise it is a leaf page and
>you can go down the "if".

As you know, bits 5:3, which are used for the EPT memory type, are not
reserved bits, so I fail to understand why we should check the return
value of ept_rsvd_mask, or why the entry is a large page if the mask
includes 0x38. Could you explain in more detail? ;-)

Regards,
Wanpeng Li 

>
>Paolo
>
>>  			u64 ept_mem_type = (spte & 0x38) >> 3;
>>  
>>  			if (ept_mem_type == 2 || ept_mem_type == 3 ||
>> 
Paolo Bonzini Aug. 19, 2014, 6:11 a.m. UTC | #5
Il 19/08/2014 04:17, Wanpeng Li ha scritto:
>>> >> -		if (level == 1 || (level == 2 && (spte & (1ULL << 7)))) {
>>> >> +		if (level == 1 || ((level == 3 || level == 2)
>>> >> +				&& (spte & (1ULL << 7)))) {
>> >
>> >This condition can be simplified by checking the return value of ept_rsvd_mask.
>> >If it includes 0x38, this is a large page.

Oops, a "not" was missing. If it includes 0x38, this is _not_ a large
page (it is a page directory / page directory pointer / PML4).

>> Otherwise it is a leaf page and
>> you can go down the "if".
> As you know, bits 5:3, which are used for the EPT memory type, are not
> reserved bits, so I fail to understand why we should check the return
> value of ept_rsvd_mask, or why the entry is a large page if the mask
> includes 0x38. Could you explain in more detail? ;-)

A non-leaf page will always have 0x38 in the ept_rsvd_mask.  A leaf page
will never have 0x38 in the ept_rsvd_mask.

Paolo
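
The rule stated above (non-leaf entries always have 0x38 in the reserved mask, leaf entries never do) suggests a simpler leaf test than comparing levels. A minimal sketch, assuming hypothetical helper names that do not exist in vmx.c and a reserved mask computed so that it satisfies that rule:

```c
#include <stdbool.h>
#include <stdint.h>

#define EPT_MEM_TYPE_BITS 0x38ULL	/* bits 5:3 of an EPT entry */

/*
 * Hypothetical helper: rsvd_bits is the value ept_rsvd_mask() returned
 * for the entry. If bits 5:3 are not reserved, the entry maps a page
 * (4K, 2MB or 1GB) and bits 5:3 carry the EPT memory type.
 */
static bool ept_entry_is_leaf(uint64_t rsvd_bits)
{
	return (rsvd_bits & EPT_MEM_TYPE_BITS) == 0;
}

/* Hypothetical helper: extract the memory type; meaningful for leaves only. */
static unsigned int ept_leaf_mem_type(uint64_t spte)
{
	return (unsigned int)((spte & EPT_MEM_TYPE_BITS) >> 3);
}
```

With such a check, the memory-type validation in ept_misconfig_inspect_spte() could be guarded by `ept_entry_is_leaf(rsvd_bits)` instead of the explicit level and bit-7 comparison, so gigabyte pages are covered without listing level 3 separately.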
Patch

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index bfe11cf..71cbee5 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -5521,9 +5521,14 @@  static u64 ept_rsvd_mask(u64 spte, int level)
 	for (i = 51; i > boot_cpu_data.x86_phys_bits; i--)
 		mask |= (1ULL << i);
 
-	if (level > 2)
-		/* bits 7:3 reserved */
-		mask |= 0xf8;
+	if (level > 2) {
+		if (spte & (1 << 7))
+			/* 1GB ref, bits 29:12 */
+			mask |= 0x3ffff000;
+		else
+			/* bits 7:3 reserved */
+			mask |= 0xf8;
+	}
 	else if (level == 2) {
 		if (spte & (1ULL << 7))
 			/* 2MB ref, bits 20:12 reserved */
@@ -5561,7 +5566,8 @@  static void ept_misconfig_inspect_spte(struct kvm_vcpu *vcpu, u64 spte,
 			WARN_ON(1);
 		}
 
-		if (level == 1 || (level == 2 && (spte & (1ULL << 7)))) {
+		if (level == 1 || ((level == 3 || level == 2)
+				&& (spte & (1ULL << 7)))) {
 			u64 ept_mem_type = (spte & 0x38) >> 3;
 
 			if (ept_mem_type == 2 || ept_mem_type == 3 ||