
[v4,2/6] KVM: x86/mmu: Fix wrong gfn range of tlb flushing in kvm_set_pte_rmap()

Message ID 0ce24d7078fa5f1f8d64b0c59826c50f32f8065e.1665214747.git.houwenlong.hwl@antgroup.com (mailing list archive)
State New, archived
Series KVM: x86/mmu: Fix wrong usages of range-based tlb flushing

Commit Message

Hou Wenlong Oct. 10, 2022, 12:19 p.m. UTC
When the spte of a huge page is dropped in kvm_set_pte_rmap(), the
whole gfn range covered by the spte should be flushed. However,
rmap_walk_init_level() doesn't align the gfn down for the new level
like the tdp iterator does, so the gfn used in kvm_set_pte_rmap() is
not the base gfn of the huge page, and the size of the gfn range is
wrong as well. Use the base gfn and the size of the huge page when
flushing tlbs for a huge page. Also introduce a helper function to
flush the given page (huge or not) of guest memory, which helps
prevent future buggy use of kvm_flush_remote_tlbs_with_address() in
such cases.

Fixes: c3134ce240eed ("KVM: Replace old tlb flush function with new one to flush a specified range.")
Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
---
 arch/x86/kvm/mmu/mmu.c          |  4 +++-
 arch/x86/kvm/mmu/mmu_internal.h | 10 ++++++++++
 2 files changed, 13 insertions(+), 1 deletion(-)
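
The fix relies on rounding the gfn down to the base of the huge page and
flushing KVM_PAGES_PER_HPAGE(level) pages. The sketch below is a simplified
restatement of that arithmetic with hypothetical helper names; the real
definitions live in KVM's x86 headers and may differ in detail:

	/* Roughly what KVM_PAGES_PER_HPAGE(level) evaluates to on x86. */
	static inline u64 pages_per_hpage(int level)
	{
		/* level 1 -> 1 page, level 2 -> 512, level 3 -> 262144 */
		return 1ULL << ((level - 1) * 9);
	}

	/* Roughly what gfn_round_for_level(gfn, level) computes. */
	static inline u64 base_gfn_for_level(u64 gfn, int level)
	{
		return gfn & ~(pages_per_hpage(level) - 1);
	}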

Comments

Sean Christopherson Oct. 12, 2022, 4:46 p.m. UTC | #1
On Mon, Oct 10, 2022, Hou Wenlong wrote:
> When the spte of a huge page is dropped in kvm_set_pte_rmap(), the
> whole gfn range covered by the spte should be flushed. However,
> rmap_walk_init_level() doesn't align the gfn down for the new level
> like the tdp iterator does, so the gfn used in kvm_set_pte_rmap() is
> not the base gfn of the huge page, and the size of the gfn range is
> wrong as well. Use the base gfn and the size of the huge page when
> flushing tlbs for a huge page. Also introduce a helper function to
> flush the given page (huge or not) of guest memory, which helps
> prevent future buggy use of kvm_flush_remote_tlbs_with_address() in
> such cases.
> 
> Fixes: c3134ce240eed ("KVM: Replace old tlb flush function with new one to flush a specified range.")
> Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
> ---
>  arch/x86/kvm/mmu/mmu.c          |  4 +++-
>  arch/x86/kvm/mmu/mmu_internal.h | 10 ++++++++++
>  2 files changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 7de3579d5a27..4874c603ed1c 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -1430,7 +1430,9 @@ static bool kvm_set_pte_rmap(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
>  	}
>  
>  	if (need_flush && kvm_available_flush_tlb_with_range()) {
> -		kvm_flush_remote_tlbs_with_address(kvm, gfn, 1);
> +		gfn_t base = gfn_round_for_level(gfn, level);
> +
> +		kvm_flush_remote_tlbs_gfn(kvm, base, level);
>  		return false;
>  	}
>  
> diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
> index 17488d70f7da..249bfcd502b4 100644
> --- a/arch/x86/kvm/mmu/mmu_internal.h
> +++ b/arch/x86/kvm/mmu/mmu_internal.h
> @@ -168,8 +168,18 @@ void kvm_mmu_gfn_allow_lpage(const struct kvm_memory_slot *slot, gfn_t gfn);
>  bool kvm_mmu_slot_gfn_write_protect(struct kvm *kvm,
>  				    struct kvm_memory_slot *slot, u64 gfn,
>  				    int min_level);
> +
>  void kvm_flush_remote_tlbs_with_address(struct kvm *kvm,
>  					u64 start_gfn, u64 pages);
> +
> +/* Flush the given page (huge or not) of guest memory. */
> +static inline void kvm_flush_remote_tlbs_gfn(struct kvm *kvm, gfn_t gfn, int level)
> +{
> +	u64 pages = KVM_PAGES_PER_HPAGE(level);
> +

Rather than require the caller to align gfn, what about doing gfn_round_for_level()
in this helper?  It's a little odd that the caller needs to align gfn but doesn't
have to compute the size.

I'm 99% certain kvm_set_pte_rmap() is the only path that doesn't already align the
gfn, but it's nice to not have to worry about getting this right, e.g. alternatively
this helper could WARN if the gfn is misaligned, but that's _more_ work.

	kvm_flush_remote_tlbs_with_address(kvm, gfn_round_for_level(gfn, level),
					   KVM_PAGES_PER_HPAGE(level));

If no one objects, this can be done when the series is applied, i.e. no need to
send v5 just for this.


> +	kvm_flush_remote_tlbs_with_address(kvm, gfn, pages);
> +}
> +
>  unsigned int pte_list_count(struct kvm_rmap_head *rmap_head);
>  
>  extern int nx_huge_pages;
> -- 
> 2.31.1
>
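
With the rounding folded into the helper as suggested above,
kvm_flush_remote_tlbs_gfn() might end up looking something like this sketch
(illustrative only, not necessarily the patch as applied):

	static inline void kvm_flush_remote_tlbs_gfn(struct kvm *kvm, gfn_t gfn, int level)
	{
		/* Round down to the base gfn so callers may pass any gfn in the page. */
		kvm_flush_remote_tlbs_with_address(kvm, gfn_round_for_level(gfn, level),
						   KVM_PAGES_PER_HPAGE(level));
	}

and the caller in kvm_set_pte_rmap() would then pass the unaligned gfn directly:

	if (need_flush && kvm_available_flush_tlb_with_range()) {
		kvm_flush_remote_tlbs_gfn(kvm, gfn, level);
		return false;
	}
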
Lai Jiangshan Dec. 14, 2022, 3:07 p.m. UTC | #2
On Thu, Oct 13, 2022 at 1:00 AM Sean Christopherson <seanjc@google.com> wrote:
>
> On Mon, Oct 10, 2022, Hou Wenlong wrote:
> > When the spte of a huge page is dropped in kvm_set_pte_rmap(), the
> > whole gfn range covered by the spte should be flushed. However,
> > rmap_walk_init_level() doesn't align the gfn down for the new level
> > like the tdp iterator does, so the gfn used in kvm_set_pte_rmap() is
> > not the base gfn of the huge page, and the size of the gfn range is
> > wrong as well. Use the base gfn and the size of the huge page when
> > flushing tlbs for a huge page. Also introduce a helper function to
> > flush the given page (huge or not) of guest memory, which helps
> > prevent future buggy use of kvm_flush_remote_tlbs_with_address() in
> > such cases.
> >
> > Fixes: c3134ce240eed ("KVM: Replace old tlb flush function with new one to flush a specified range.")
> > Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
> > ---
> >  arch/x86/kvm/mmu/mmu.c          |  4 +++-
> >  arch/x86/kvm/mmu/mmu_internal.h | 10 ++++++++++
> >  2 files changed, 13 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index 7de3579d5a27..4874c603ed1c 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -1430,7 +1430,9 @@ static bool kvm_set_pte_rmap(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
> >       }
> >
> >       if (need_flush && kvm_available_flush_tlb_with_range()) {
> > -             kvm_flush_remote_tlbs_with_address(kvm, gfn, 1);
> > +             gfn_t base = gfn_round_for_level(gfn, level);
> > +
> > +             kvm_flush_remote_tlbs_gfn(kvm, base, level);
> >               return false;
> >       }
> >
> > diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
> > index 17488d70f7da..249bfcd502b4 100644
> > --- a/arch/x86/kvm/mmu/mmu_internal.h
> > +++ b/arch/x86/kvm/mmu/mmu_internal.h
> > @@ -168,8 +168,18 @@ void kvm_mmu_gfn_allow_lpage(const struct kvm_memory_slot *slot, gfn_t gfn);
> >  bool kvm_mmu_slot_gfn_write_protect(struct kvm *kvm,
> >                                   struct kvm_memory_slot *slot, u64 gfn,
> >                                   int min_level);
> > +
> >  void kvm_flush_remote_tlbs_with_address(struct kvm *kvm,
> >                                       u64 start_gfn, u64 pages);
> > +
> > +/* Flush the given page (huge or not) of guest memory. */
> > +static inline void kvm_flush_remote_tlbs_gfn(struct kvm *kvm, gfn_t gfn, int level)
> > +{
> > +     u64 pages = KVM_PAGES_PER_HPAGE(level);
> > +
>
> Rather than require the caller to align gfn, what about doing gfn_round_for_level()
> in this helper?  It's a little odd that the caller needs to align gfn but doesn't
> have to compute the size.
>
> I'm 99% certain kvm_set_pte_rmap() is the only path that doesn't already align the
> gfn, but it's nice to not have to worry about getting this right, e.g. alternatively
> this helper could WARN if the gfn is misaligned, but that's _more_ work.
>
>         kvm_flush_remote_tlbs_with_address(kvm, gfn_round_for_level(gfn, level),
>                                            KVM_PAGES_PER_HPAGE(level));
>
> If no one objects, this can be done when the series is applied, i.e. no need to
> send v5 just for this.
>

Hello Paolo, Sean, Hou,

It seems the patchset has not been queued.  I believe it does
fix bugs.

Thanks
Lai
Sean Christopherson Dec. 14, 2022, 7:11 p.m. UTC | #3
On Wed, Dec 14, 2022, Lai Jiangshan wrote:
> On Thu, Oct 13, 2022 at 1:00 AM Sean Christopherson <seanjc@google.com> wrote:
> > > +/* Flush the given page (huge or not) of guest memory. */
> > > +static inline void kvm_flush_remote_tlbs_gfn(struct kvm *kvm, gfn_t gfn, int level)
> > > +{
> > > +     u64 pages = KVM_PAGES_PER_HPAGE(level);
> > > +
> >
> > Rather than require the caller to align gfn, what about doing gfn_round_for_level()
> > in this helper?  It's a little odd that the caller needs to align gfn but doesn't
> > have to compute the size.
> >
> > I'm 99% certain kvm_set_pte_rmap() is the only path that doesn't already align the
> > gfn, but it's nice to not have to worry about getting this right, e.g. alternatively
> > this helper could WARN if the gfn is misaligned, but that's _more_ work.
> >
> >         kvm_flush_remote_tlbs_with_address(kvm, gfn_round_for_level(gfn, level),
> >                                            KVM_PAGES_PER_HPAGE(level));
> >
> > If no one objects, this can be done when the series is applied, i.e. no need to
> > send v5 just for this.
> >
> 
> Hello Paolo, Sean, Hou,
> 
> It seems the patchset has not been queued.  I believe it does
> fix bugs.

It's on my list of things to get merged for 6.3.  I haven't been more aggressive
in getting it queued because I assume there are very few KVM-on-Hyper-V users that
are likely to be affected.

Patch

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 7de3579d5a27..4874c603ed1c 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -1430,7 +1430,9 @@  static bool kvm_set_pte_rmap(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
 	}
 
 	if (need_flush && kvm_available_flush_tlb_with_range()) {
-		kvm_flush_remote_tlbs_with_address(kvm, gfn, 1);
+		gfn_t base = gfn_round_for_level(gfn, level);
+
+		kvm_flush_remote_tlbs_gfn(kvm, base, level);
 		return false;
 	}
 
diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
index 17488d70f7da..249bfcd502b4 100644
--- a/arch/x86/kvm/mmu/mmu_internal.h
+++ b/arch/x86/kvm/mmu/mmu_internal.h
@@ -168,8 +168,18 @@  void kvm_mmu_gfn_allow_lpage(const struct kvm_memory_slot *slot, gfn_t gfn);
 bool kvm_mmu_slot_gfn_write_protect(struct kvm *kvm,
 				    struct kvm_memory_slot *slot, u64 gfn,
 				    int min_level);
+
 void kvm_flush_remote_tlbs_with_address(struct kvm *kvm,
 					u64 start_gfn, u64 pages);
+
+/* Flush the given page (huge or not) of guest memory. */
+static inline void kvm_flush_remote_tlbs_gfn(struct kvm *kvm, gfn_t gfn, int level)
+{
+	u64 pages = KVM_PAGES_PER_HPAGE(level);
+
+	kvm_flush_remote_tlbs_with_address(kvm, gfn, pages);
+}
+
 unsigned int pte_list_count(struct kvm_rmap_head *rmap_head);
 
 extern int nx_huge_pages;
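
As a concrete before/after illustration of the mmu.c hunk above (hypothetical
numbers, assuming a level-2, i.e. 2MiB, mapping and gfn = 0x1234):

	/*
	 * before: kvm_flush_remote_tlbs_with_address(kvm, 0x1234, 1);
	 * after:  kvm_flush_remote_tlbs_with_address(kvm, 0x1200, 512);
	 *
	 * The flush now starts at the base gfn of the huge page and covers
	 * every 4KiB gfn backed by it, instead of a single page in the
	 * middle of the range.
	 */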