
[v2,03/12] KVM: MMU: simplify mmu_set_spte

Message ID 50FFB5E9.2010500@linux.vnet.ibm.com (mailing list archive)
State New, archived

Commit Message

Xiao Guangrong Jan. 23, 2013, 10:05 a.m. UTC
In order to detect spte remapping, we can simply check whether the
spte already points to the pfn, even if the spte is not the last
spte: a middle spte points to a kernel pfn, which can never be
mapped to userspace.

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
---
 arch/x86/kvm/mmu.c |   28 +++++++---------------------
 1 file changed, 7 insertions(+), 21 deletions(-)

Comments

Marcelo Tosatti Jan. 29, 2013, 12:21 a.m. UTC | #1
On Wed, Jan 23, 2013 at 06:05:29PM +0800, Xiao Guangrong wrote:
> In order to detecting spte remapping, we can simply check whether the
> spte has already been pointing to the pfn even if the spte is not the
> last spte, for middle spte is pointing to the kernel pfn which can not
> be mapped to userspace

This check detects spte overwrite, when a large spte is replaced by a
pointer to an spte table.

I can't see why checking for a different pfn is safe: only 'int level'
could differ while the pfn stays the same, for example.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Xiao Guangrong Jan. 29, 2013, 2:55 a.m. UTC | #2
On 01/29/2013 08:21 AM, Marcelo Tosatti wrote:
> On Wed, Jan 23, 2013 at 06:05:29PM +0800, Xiao Guangrong wrote:
>> In order to detecting spte remapping, we can simply check whether the
>> spte has already been pointing to the pfn even if the spte is not the
>> last spte, for middle spte is pointing to the kernel pfn which can not
>> be mapped to userspace
> 
> This check is detecting spte overwrite, when a large spte is replaced by 
> pointer to spte table.
> 
> Can't see why check for different pfn is safe: only 'int level' can
> differ, and pfn be equivalent, for example.

The 'u64 *sptep' must be on the "int level" we want to set, that means:
 page_header(__pa(sptep))->role.level == "int level".


We discussed this before :), the discussion can be found at:
http://marc.info/?l=kvm&m=135345057329427&w=2.

Marcelo Tosatti Jan. 29, 2013, 9:53 p.m. UTC | #3
On Tue, Jan 29, 2013 at 10:55:24AM +0800, Xiao Guangrong wrote:
> On 01/29/2013 08:21 AM, Marcelo Tosatti wrote:
> > On Wed, Jan 23, 2013 at 06:05:29PM +0800, Xiao Guangrong wrote:
> >> In order to detecting spte remapping, we can simply check whether the
> >> spte has already been pointing to the pfn even if the spte is not the
> >> last spte, for middle spte is pointing to the kernel pfn which can not
> >> be mapped to userspace
> > 
> > This check is detecting spte overwrite, when a large spte is replaced by 
> > pointer to spte table.
> > 
> > Can't see why check for different pfn is safe: only 'int level' can
> > differ, and pfn be equivalent, for example.
> 
> The 'u64 *sptep' must on the "int level" we want to set, that means:
>  page_header(__pa(sptep)).role.level == "int level".

Right, then stick a comment there noting which cases that condition handles.
Keep the current comment and add more.

> We discussed this before :), the discussion can be found at:
> http://marc.info/?l=kvm&m=135345057329427&w=2.

Note http://marc.info/?l=kvm&m=135345059929436&w=2, please take into
account in the future.

Xiao Guangrong Jan. 30, 2013, 3:22 a.m. UTC | #4
On 01/30/2013 05:53 AM, Marcelo Tosatti wrote:
> On Tue, Jan 29, 2013 at 10:55:24AM +0800, Xiao Guangrong wrote:
>> On 01/29/2013 08:21 AM, Marcelo Tosatti wrote:
>>> On Wed, Jan 23, 2013 at 06:05:29PM +0800, Xiao Guangrong wrote:
>>>> In order to detecting spte remapping, we can simply check whether the
>>>> spte has already been pointing to the pfn even if the spte is not the
>>>> last spte, for middle spte is pointing to the kernel pfn which can not
>>>> be mapped to userspace
>>>
>>> This check is detecting spte overwrite, when a large spte is replaced by 
>>> pointer to spte table.
>>>
>>> Can't see why check for different pfn is safe: only 'int level' can
>>> differ, and pfn be equivalent, for example.
>>
>> The 'u64 *sptep' must on the "int level" we want to set, that means:
>>  page_header(__pa(sptep)).role.level == "int level".
> 
> Right, then stick a comment there noting which cases that condition handles.
> Keep the current comment and add more.
> 

Okay.

>> We discussed this before :), the discussion can be found at:
>> http://marc.info/?l=kvm&m=135345057329427&w=2.
> 
> Note http://marc.info/?l=kvm&m=135345059929436&w=2, please take into
> account in the future.

Okay, I will make that patch simpler in the next version.



Patch

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 8dca8af..a999755 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2407,33 +2407,20 @@  static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
 			 int write_fault, int *emulate, int level, gfn_t gfn,
 			 pfn_t pfn, bool speculative, bool host_writable)
 {
-	int was_rmapped = 0;
-	int rmap_count;
+	bool was_rmapped = false;

 	pgprintk("%s: spte %llx access %x write_fault %d gfn %llx\n",
 		 __func__, *sptep, pt_access,
 		 write_fault, gfn);

 	if (is_rmap_spte(*sptep)) {
-		/*
-		 * If we overwrite a PTE page pointer with a 2MB PMD, unlink
-		 * the parent of the now unreachable PTE.
-		 */
-		if (level > PT_PAGE_TABLE_LEVEL &&
-		    !is_large_pte(*sptep)) {
-			struct kvm_mmu_page *child;
-			u64 pte = *sptep;
+		if (pfn != spte_to_pfn(*sptep)) {
+			struct kvm_mmu_page *sp = page_header(__pa(sptep));

-			child = page_header(pte & PT64_BASE_ADDR_MASK);
-			drop_parent_pte(child, sptep);
-			kvm_flush_remote_tlbs(vcpu->kvm);
-		} else if (pfn != spte_to_pfn(*sptep)) {
-			pgprintk("hfn old %llx new %llx\n",
-				 spte_to_pfn(*sptep), pfn);
-			drop_spte(vcpu->kvm, sptep);
-			kvm_flush_remote_tlbs(vcpu->kvm);
+			if (mmu_page_zap_pte(vcpu->kvm, sp, sptep))
+				kvm_flush_remote_tlbs(vcpu->kvm);
 		} else
-			was_rmapped = 1;
+			was_rmapped = true;
 	}

 	if (set_spte(vcpu, sptep, pte_access, level, gfn, pfn, speculative,
@@ -2456,8 +2443,7 @@  static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *sptep,

 	if (is_shadow_present_pte(*sptep)) {
 		if (!was_rmapped) {
-			rmap_count = rmap_add(vcpu, sptep, gfn);
-			if (rmap_count > RMAP_RECYCLE_THRESHOLD)
+			if (rmap_add(vcpu, sptep, gfn) > RMAP_RECYCLE_THRESHOLD)
 				rmap_recycle(vcpu, sptep, gfn);
 		}
 	}