Message ID | 20090311100755.GA19724@redhat.com (mailing list archive)
State      | Accepted
Gleb Natapov wrote:
> free_mmu_pages() should only undo what alloc_mmu_pages() does.
>
> Signed-off-by: Gleb Natapov <gleb@redhat.com>
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 2a36f7f..b625ed4 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -2638,14 +2638,6 @@ EXPORT_SYMBOL_GPL(kvm_disable_tdp);
>
>  static void free_mmu_pages(struct kvm_vcpu *vcpu)
>  {
> -	struct kvm_mmu_page *sp;
> -
> -	while (!list_empty(&vcpu->kvm->arch.active_mmu_pages)) {
> -		sp = container_of(vcpu->kvm->arch.active_mmu_pages.next,
> -				  struct kvm_mmu_page, link);
> -		kvm_mmu_zap_page(vcpu->kvm, sp);
> -		cond_resched();
> -	}
>  	free_page((unsigned long)vcpu->arch.mmu.pae_root);
>  }
>

I think this is correct, but the patch leaves the function name wrong.
Rename, or perhaps just open code into callers?
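A minimal sketch of the rename option (the name free_mmu_pae_root() is purely illustrative, not something proposed in the thread): after this patch the function only undoes the PAE-root page allocation done by alloc_mmu_pages(), so a narrower name, or open-coding the single free_page() call into the callers, would match what is left.

```c
/*
 * Illustrative sketch only: all that remains of free_mmu_pages() is
 * releasing the page that alloc_mmu_pages() allocated for the PAE root.
 */
static void free_mmu_pae_root(struct kvm_vcpu *vcpu)
{
	free_page((unsigned long)vcpu->arch.mmu.pae_root);
}
```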
On Wed, Mar 11, 2009 at 12:07:55PM +0200, Gleb Natapov wrote:
> free_mmu_pages() should only undo what alloc_mmu_pages() does.
>
> Signed-off-by: Gleb Natapov <gleb@redhat.com>
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 2a36f7f..b625ed4 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -2638,14 +2638,6 @@ EXPORT_SYMBOL_GPL(kvm_disable_tdp);
>
>  static void free_mmu_pages(struct kvm_vcpu *vcpu)
>  {
> -	struct kvm_mmu_page *sp;
> -
> -	while (!list_empty(&vcpu->kvm->arch.active_mmu_pages)) {
> -		sp = container_of(vcpu->kvm->arch.active_mmu_pages.next,
> -				  struct kvm_mmu_page, link);
> -		kvm_mmu_zap_page(vcpu->kvm, sp);
> -		cond_resched();
> -	}
>  	free_page((unsigned long)vcpu->arch.mmu.pae_root);
>  }

Doesn't the vm shutdown path rely on the while loop you removed to free
all shadow pages before freeing the mmu kmem caches, if mmu notifiers
are disabled?

And how harmful is that loop? Zaps the entire cache on cpu hotunplug?
On Mon, Mar 16, 2009 at 05:15:33PM -0300, Marcelo Tosatti wrote:
> On Wed, Mar 11, 2009 at 12:07:55PM +0200, Gleb Natapov wrote:
> > free_mmu_pages() should only undo what alloc_mmu_pages() does.
> >
> > Signed-off-by: Gleb Natapov <gleb@redhat.com>
> > diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> > index 2a36f7f..b625ed4 100644
> > --- a/arch/x86/kvm/mmu.c
> > +++ b/arch/x86/kvm/mmu.c
> > @@ -2638,14 +2638,6 @@ EXPORT_SYMBOL_GPL(kvm_disable_tdp);
> >
> >  static void free_mmu_pages(struct kvm_vcpu *vcpu)
> >  {
> > -	struct kvm_mmu_page *sp;
> > -
> > -	while (!list_empty(&vcpu->kvm->arch.active_mmu_pages)) {
> > -		sp = container_of(vcpu->kvm->arch.active_mmu_pages.next,
> > -				  struct kvm_mmu_page, link);
> > -		kvm_mmu_zap_page(vcpu->kvm, sp);
> > -		cond_resched();
> > -	}
> >  	free_page((unsigned long)vcpu->arch.mmu.pae_root);
> >  }
>
> Doesn't the vm shutdown path rely on the while loop you removed to free
> all shadow pages before freeing the mmu kmem caches, if mmu notifiers
> are disabled?
>
Shouldn't mmu_free_roots() on all vcpus clear all mmu pages?

> And how harmful is that loop? Zaps the entire cache on cpu hotunplug?
>
KVM doesn't support vcpu destruction, but destruction is called anyway
on various error conditions. The one that is easy to trigger is creating
a vcpu with the same id simultaneously from two threads. The result is
an oops in random places.

--
	Gleb.
On Mon, Mar 16, 2009 at 10:34:01PM +0200, Gleb Natapov wrote:
> > Doesn't the vm shutdown path rely on the while loop you removed to free
> > all shadow pages before freeing the mmu kmem caches, if mmu notifiers
> > are disabled?
> >
> Shouldn't mmu_free_roots() on all vcpus clear all mmu pages?

No. It only zaps the present root on every vcpu, but not the children.

> > And how harmful is that loop? Zaps the entire cache on cpu hotunplug?
> >
> KVM doesn't support vcpu destruction, but destruction is called anyway
> on various error conditions. The one that is easy to trigger is creating
> a vcpu with the same id simultaneously from two threads. The result is
> an oops in random places.

mmu_lock should be held there, and apparently it is not.
On Mon, Mar 16, 2009 at 06:01:52PM -0300, Marcelo Tosatti wrote:
> On Mon, Mar 16, 2009 at 10:34:01PM +0200, Gleb Natapov wrote:
> > > Doesn't the vm shutdown path rely on the while loop you removed to free
> > > all shadow pages before freeing the mmu kmem caches, if mmu notifiers
> > > are disabled?
> > >
> > Shouldn't mmu_free_roots() on all vcpus clear all mmu pages?
>
> No. It only zaps the present root on every vcpu, but not
> the children.
>
> > > And how harmful is that loop? Zaps the entire cache on cpu hotunplug?
> > >
> > KVM doesn't support vcpu destruction, but destruction is called anyway
> > on various error conditions. The one that is easy to trigger is creating
> > a vcpu with the same id simultaneously from two threads. The result is
> > an oops in random places.
>
> mmu_lock should be held there, and apparently it is not.
>
Yeah, my first solution was to add mmu_lock, but why should a function
that gets a vcpu as input destroy a data structure that is global to
the VM? There is kvm_mmu_zap_all() that does the same thing (well,
almost) and also does proper locking. Shouldn't it be called during VM
destruction instead?

--
	Gleb.
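For context, kvm_mmu_zap_all() at this point in the tree looked roughly like the sketch below (reconstructed from memory, not quoted anywhere in this thread): it walks the same kvm->arch.active_mmu_pages list that the removed loop walked, but it is a per-VM operation, takes mmu_lock around the walk, and flushes remote TLBs afterwards.

```c
/*
 * Rough sketch of kvm_mmu_zap_all() as of this era; details may differ
 * from the actual tree. The point is that a whole-list zap already
 * exists as a per-VM operation with proper locking.
 */
void kvm_mmu_zap_all(struct kvm *kvm)
{
	struct kvm_mmu_page *sp, *node;

	spin_lock(&kvm->mmu_lock);
	list_for_each_entry_safe(sp, node, &kvm->arch.active_mmu_pages, link)
		kvm_mmu_zap_page(kvm, sp);
	spin_unlock(&kvm->mmu_lock);

	kvm_flush_remote_tlbs(kvm);
}
```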
On Mon, Mar 16, 2009 at 06:33:22PM -0300, Marcelo Tosatti wrote:
> On Mon, Mar 16, 2009 at 11:20:10PM +0200, Gleb Natapov wrote:
> > > mmu_lock should be held there, and apparently it is not.
> > >
> > Yeah, my first solution was to add mmu_lock, but why should a function
> > that gets a vcpu as input destroy a data structure that is global to
> > the VM?
>
> Point.
>
> > There is kvm_mmu_zap_all() that does the same thing (well, almost)
> > and also does proper locking. Shouldn't it be called during VM
> > destruction instead?
>
> Yes, that would be better (which happens implicitly with mmu notifiers
> ->release).

OK. I'll send a new patch.

--
	Gleb.
On Mon, Mar 16, 2009 at 11:20:10PM +0200, Gleb Natapov wrote:
> > mmu_lock should be held there, and apparently it is not.
> >
> Yeah, my first solution was to add mmu_lock, but why should a function
> that gets a vcpu as input destroy a data structure that is global to
> the VM?

Point.

> There is kvm_mmu_zap_all() that does the same thing (well, almost)
> and also does proper locking. Shouldn't it be called during VM
> destruction instead?

Yes, that would be better (which happens implicitly with mmu notifiers
->release).
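A minimal sketch of the direction agreed on here; the kvm_arch_destroy_vm() call site shown is an assumption for illustration, not taken from the eventual follow-up patch.

```c
/*
 * Sketch only (assumed call site): instead of walking
 * kvm->arch.active_mmu_pages from the per-vcpu free_mmu_pages(), zap all
 * shadow pages once from the per-VM destruction path, where
 * kvm_mmu_zap_all() already handles mmu_lock and TLB flushing.
 */
void kvm_arch_destroy_vm(struct kvm *kvm)
{
	/* ... other per-VM teardown elided ... */
	kvm_mmu_zap_all(kvm);
	/* ... free vcpus, memslots, etc. ... */
}
```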
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 2a36f7f..b625ed4 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2638,14 +2638,6 @@ EXPORT_SYMBOL_GPL(kvm_disable_tdp);
 
 static void free_mmu_pages(struct kvm_vcpu *vcpu)
 {
-	struct kvm_mmu_page *sp;
-
-	while (!list_empty(&vcpu->kvm->arch.active_mmu_pages)) {
-		sp = container_of(vcpu->kvm->arch.active_mmu_pages.next,
-				  struct kvm_mmu_page, link);
-		kvm_mmu_zap_page(vcpu->kvm, sp);
-		cond_resched();
-	}
 	free_page((unsigned long)vcpu->arch.mmu.pae_root);
 }
free_mmu_pages() should only undo what alloc_mmu_pages() does.

Signed-off-by: Gleb Natapov <gleb@redhat.com>

--
	Gleb.