Message ID: 20200121223157.15263-16-sean.j.christopherson@intel.com (mailing list archive)
State: New, archived
Series: KVM: Dynamically size memslot arrays
On Tue, Jan 21, 2020 at 02:31:53PM -0800, Sean Christopherson wrote:

[...]

> -int kvm_vm_ioctl_clear_dirty_log(struct kvm *kvm, struct kvm_clear_dirty_log *log)
> +void kvm_arch_dirty_log_tlb_flush(struct kvm *kvm,
> +                                  struct kvm_memory_slot *memslot)

If it's to flush the TLB for a memslot, shall we remove the "dirty_log" in
the name of the function, because it has nothing to do with dirty logging
any more?  And...

> {
> -        struct kvm_memslots *slots;
> -        struct kvm_memory_slot *memslot;
> -        bool flush = false;
> -        int r;
> -
> -        mutex_lock(&kvm->slots_lock);
> -
> -        r = kvm_clear_dirty_log_protect(kvm, log, &flush);
> -
> -        if (flush) {
> -                slots = kvm_memslots(kvm);
> -                memslot = id_to_memslot(slots, log->slot);
> -
> -                /* Let implementation handle TLB/GVA invalidation */
> -                kvm_mips_callbacks->flush_shadow_memslot(kvm, memslot);
> -        }
> -
> -        mutex_unlock(&kvm->slots_lock);
> -        return r;
> +        /* Let implementation handle TLB/GVA invalidation */
> +        kvm_mips_callbacks->flush_shadow_memslot(kvm, memslot);

... This may not be directly related to the current patch, but I'm
confused about why MIPS cannot use kvm_flush_remote_tlbs() to flush TLBs.
I know nothing about MIPS code, but IIUC here flush_shadow_memslot() is a
heavier operation that will also invalidate the shadow pages.  That seems
to be overkill here when we only changed write permission of the PTEs?  I
tried to check the first occurrence (2a31b9db15353) but I didn't find any
clue so far.

But that matters to this patch, because if MIPS can use
kvm_flush_remote_tlbs(), then we probably don't need this arch-specific
hook any more and can directly call kvm_flush_remote_tlbs() after syncing
the dirty log when flush==true.

> }
>
> long kvm_arch_vm_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg)
> diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
> index 97ce6c4f7b48..0adaf4791a6d 100644
> --- a/arch/powerpc/kvm/book3s.c
> +++ b/arch/powerpc/kvm/book3s.c
> @@ -799,6 +799,11 @@ int kvmppc_core_check_requests(struct kvm_vcpu *vcpu)
>         return vcpu->kvm->arch.kvm_ops->check_requests(vcpu);
> }
>
> +void kvm_arch_sync_dirty_log(struct kvm *kvm, struct kvm_memory_slot *memslot)

While at it, maybe we can start to use the __weak attribute for new hooks,
especially when they are empty for most archs?

E.g., define:

  void __weak kvm_arch_sync_dirty_log(...) {}

in the common code, and only define it again in an arch that has a
non-empty implementation of this method?
On Thu, Feb 06, 2020 at 03:02:00PM -0500, Peter Xu wrote:
> On Tue, Jan 21, 2020 at 02:31:53PM -0800, Sean Christopherson wrote:
>
> [...]
>
> > -int kvm_vm_ioctl_clear_dirty_log(struct kvm *kvm, struct kvm_clear_dirty_log *log)
> > +void kvm_arch_dirty_log_tlb_flush(struct kvm *kvm,
> > +                                   struct kvm_memory_slot *memslot)
>
> If it's to flush the TLB for a memslot, shall we remove the "dirty_log"
> in the name of the function, because it has nothing to do with dirty
> logging any more?  And...

I kept the "dirty_log" to allow arch code to implement logic specific to a
TLB flush during dirty logging, e.g. x86's lockdep assert on slots_lock.
And, similar to the issue with MIPS below, to deter usage of the hook for
anything else, i.e. to nudge people toward using kvm_flush_remote_tlbs()
directly.

> > {
> > -        struct kvm_memslots *slots;
> > -        struct kvm_memory_slot *memslot;
> > -        bool flush = false;
> > -        int r;
> > -
> > -        mutex_lock(&kvm->slots_lock);
> > -
> > -        r = kvm_clear_dirty_log_protect(kvm, log, &flush);
> > -
> > -        if (flush) {
> > -                slots = kvm_memslots(kvm);
> > -                memslot = id_to_memslot(slots, log->slot);
> > -
> > -                /* Let implementation handle TLB/GVA invalidation */
> > -                kvm_mips_callbacks->flush_shadow_memslot(kvm, memslot);
> > -        }
> > -
> > -        mutex_unlock(&kvm->slots_lock);
> > -        return r;
> > +        /* Let implementation handle TLB/GVA invalidation */
> > +        kvm_mips_callbacks->flush_shadow_memslot(kvm, memslot);
>
> ... This may not be directly related to the current patch, but I'm
> confused about why MIPS cannot use kvm_flush_remote_tlbs() to flush
> TLBs.  I know nothing about MIPS code, but IIUC here
> flush_shadow_memslot() is a heavier operation that will also invalidate
> the shadow pages.  That seems to be overkill here when we only changed
> write permission of the PTEs?  I tried to check the first occurrence
> (2a31b9db15353) but I didn't find any clue so far.
>
> But that matters to this patch, because if MIPS can use
> kvm_flush_remote_tlbs(), then we probably don't need this arch-specific
> hook any more and can directly call kvm_flush_remote_tlbs() after
> syncing the dirty log when flush==true.

Ya, the asid_flush_mask in kvm_vz_flush_shadow_all() is the only thing
that prevents calling kvm_flush_remote_tlbs() directly, but I have no
clue as to the importance of that code.

> > }
> >
> > long kvm_arch_vm_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg)
> > diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
> > index 97ce6c4f7b48..0adaf4791a6d 100644
> > --- a/arch/powerpc/kvm/book3s.c
> > +++ b/arch/powerpc/kvm/book3s.c
> > @@ -799,6 +799,11 @@ int kvmppc_core_check_requests(struct kvm_vcpu *vcpu)
> >         return vcpu->kvm->arch.kvm_ops->check_requests(vcpu);
> > }
> >
> > +void kvm_arch_sync_dirty_log(struct kvm *kvm, struct kvm_memory_slot *memslot)
>
> While at it, maybe we can start to use the __weak attribute for new
> hooks, especially when they are empty for most archs?
>
> E.g., define:
>
>   void __weak kvm_arch_sync_dirty_log(...) {}
>
> in the common code, and only define it again in an arch that has a
> non-empty implementation of this method?

I defer to Paolo; I'm indifferent at this stage.
On Tue, Jan 21, 2020 at 02:31:53PM -0800, Sean Christopherson wrote:

[...]

> @@ -1333,6 +1369,7 @@ int kvm_clear_dirty_log_protect(struct kvm *kvm,
>         unsigned long i, n;
>         unsigned long *dirty_bitmap;
>         unsigned long *dirty_bitmap_buffer;
> +        bool flush;
>
>         as_id = log->slot >> 16;
>         id = (u16)log->slot;
> @@ -1356,7 +1393,9 @@ int kvm_clear_dirty_log_protect(struct kvm *kvm,
>             (log->num_pages < memslot->npages - log->first_page && (log->num_pages & 63)))
>                 return -EINVAL;
>
> -        *flush = false;
> +        kvm_arch_sync_dirty_log(kvm, memslot);

Do we need this even for clear dirty log?

> +
> +        flush = false;
>         dirty_bitmap_buffer = kvm_second_dirty_bitmap(memslot);
>         if (copy_from_user(dirty_bitmap_buffer, log->dirty_bitmap, n))
>                 return -EFAULT;
On Thu, Feb 06, 2020 at 01:21:20PM -0800, Sean Christopherson wrote:
> On Thu, Feb 06, 2020 at 03:02:00PM -0500, Peter Xu wrote:
> > On Tue, Jan 21, 2020 at 02:31:53PM -0800, Sean Christopherson wrote:
> >
> > [...]
> >
> > > -int kvm_vm_ioctl_clear_dirty_log(struct kvm *kvm, struct kvm_clear_dirty_log *log)
> > > +void kvm_arch_dirty_log_tlb_flush(struct kvm *kvm,
> > > +                                    struct kvm_memory_slot *memslot)
> >
> > If it's to flush the TLB for a memslot, shall we remove the
> > "dirty_log" in the name of the function, because it has nothing to do
> > with dirty logging any more?  And...
>
> I kept the "dirty_log" to allow arch code to implement logic specific to
> a TLB flush during dirty logging, e.g. x86's lockdep assert on
> slots_lock.  And, similar to the issue with MIPS below, to deter usage
> of the hook for anything else, i.e. to nudge people toward using
> kvm_flush_remote_tlbs() directly.

The x86 lockdep assert is not that important afaict, since the two
callers of the new tlb_flush() hook will hold slots_lock for sure.

> > > {
> > > -        struct kvm_memslots *slots;
> > > -        struct kvm_memory_slot *memslot;
> > > -        bool flush = false;
> > > -        int r;
> > > -
> > > -        mutex_lock(&kvm->slots_lock);
> > > -
> > > -        r = kvm_clear_dirty_log_protect(kvm, log, &flush);
> > > -
> > > -        if (flush) {
> > > -                slots = kvm_memslots(kvm);
> > > -                memslot = id_to_memslot(slots, log->slot);
> > > -
> > > -                /* Let implementation handle TLB/GVA invalidation */
> > > -                kvm_mips_callbacks->flush_shadow_memslot(kvm, memslot);
> > > -        }
> > > -
> > > -        mutex_unlock(&kvm->slots_lock);
> > > -        return r;
> > > +        /* Let implementation handle TLB/GVA invalidation */
> > > +        kvm_mips_callbacks->flush_shadow_memslot(kvm, memslot);
> >
> > ... This may not be directly related to the current patch, but I'm
> > confused about why MIPS cannot use kvm_flush_remote_tlbs() to flush
> > TLBs.  I know nothing about MIPS code, but IIUC here
> > flush_shadow_memslot() is a heavier operation that will also
> > invalidate the shadow pages.  That seems to be overkill here when we
> > only changed write permission of the PTEs?  I tried to check the first
> > occurrence (2a31b9db15353) but I didn't find any clue so far.
> >
> > But that matters to this patch, because if MIPS can use
> > kvm_flush_remote_tlbs(), then we probably don't need this
> > arch-specific hook any more and can directly call
> > kvm_flush_remote_tlbs() after syncing the dirty log when flush==true.
>
> Ya, the asid_flush_mask in kvm_vz_flush_shadow_all() is the only thing
> that prevents calling kvm_flush_remote_tlbs() directly, but I have no
> clue as to the importance of that code.

As said above, I think the x86 lockdep is really not necessary; then,
considering MIPS could be the only one that will use the new hook
introduced in this patch...  Shall we figure that out first?

Thanks,
+Vitaly for HyperV

On Thu, Feb 06, 2020 at 04:41:06PM -0500, Peter Xu wrote:
> On Thu, Feb 06, 2020 at 01:21:20PM -0800, Sean Christopherson wrote:
> > On Thu, Feb 06, 2020 at 03:02:00PM -0500, Peter Xu wrote:
> > > But that matters to this patch, because if MIPS can use
> > > kvm_flush_remote_tlbs(), then we probably don't need this
> > > arch-specific hook any more and can directly call
> > > kvm_flush_remote_tlbs() after syncing the dirty log when flush==true.
> >
> > Ya, the asid_flush_mask in kvm_vz_flush_shadow_all() is the only thing
> > that prevents calling kvm_flush_remote_tlbs() directly, but I have no
> > clue as to the importance of that code.
>
> As said above, I think the x86 lockdep is really not necessary; then,
> considering MIPS could be the only one that will use the new hook
> introduced in this patch...  Shall we figure that out first?

So I prepped a follow-up patch to make kvm_arch_dirty_log_tlb_flush() a
MIPS-only hook and use kvm_flush_remote_tlbs() directly for arm and x86,
but then I realized x86 *has* a hook to do a precise remote TLB flush.
There's even an existing kvm_flush_remote_tlbs_with_address() call on a
memslot, i.e. this exact scenario.  So arguably, x86 should be using the
more precise flush and should keep kvm_arch_dirty_log_tlb_flush().

But, the hook is only used when KVM is running as an L1 on top of HyperV,
and I assume dirty logging isn't used much, if at all, for L1 KVM on
HyperV?

I see three options:

 1. Make kvm_arch_dirty_log_tlb_flush() MIPS-only and call
    kvm_flush_remote_tlbs() directly for arm and x86.  Add comments to
    explain when an arch should implement kvm_arch_dirty_log_tlb_flush().

 2. Change x86 to use kvm_flush_remote_tlbs_with_address() when flushing
    a memslot after the dirty log is grabbed by userspace.

 3. Keep the resulting code as is, but add a comment in x86's
    kvm_arch_dirty_log_tlb_flush() to explain why it uses
    kvm_flush_remote_tlbs() instead of the with_address() variant.

I strongly prefer (2) or (3), but I'll defer to Vitaly as to which of
those is preferable.

I don't like (1) because (a) it requires more lines of code (well,
comments) to explain why kvm_flush_remote_tlbs() is the default, and
(b) it would require even more comments, which would be x86-specific in
generic KVM, to explain why x86 doesn't use its with_address() flush, or
we'd lose that info altogether.
On Fri, Feb 07, 2020 at 11:45:32AM -0800, Sean Christopherson wrote:
> +Vitaly for HyperV
>
> On Thu, Feb 06, 2020 at 04:41:06PM -0500, Peter Xu wrote:
> > On Thu, Feb 06, 2020 at 01:21:20PM -0800, Sean Christopherson wrote:
> > > On Thu, Feb 06, 2020 at 03:02:00PM -0500, Peter Xu wrote:
> > > > But that matters to this patch, because if MIPS can use
> > > > kvm_flush_remote_tlbs(), then we probably don't need this
> > > > arch-specific hook any more and can directly call
> > > > kvm_flush_remote_tlbs() after syncing the dirty log when
> > > > flush==true.
> > >
> > > Ya, the asid_flush_mask in kvm_vz_flush_shadow_all() is the only
> > > thing that prevents calling kvm_flush_remote_tlbs() directly, but I
> > > have no clue as to the importance of that code.
> >
> > As said above, I think the x86 lockdep is really not necessary; then,
> > considering MIPS could be the only one that will use the new hook
> > introduced in this patch...  Shall we figure that out first?
>
> So I prepped a follow-up patch to make kvm_arch_dirty_log_tlb_flush() a
> MIPS-only hook and use kvm_flush_remote_tlbs() directly for arm and x86,
> but then I realized x86 *has* a hook to do a precise remote TLB flush.
> There's even an existing kvm_flush_remote_tlbs_with_address() call on a
> memslot, i.e. this exact scenario.  So arguably, x86 should be using the
> more precise flush and should keep kvm_arch_dirty_log_tlb_flush().
>
> But, the hook is only used when KVM is running as an L1 on top of
> HyperV, and I assume dirty logging isn't used much, if at all, for L1
> KVM on HyperV?
>
> I see three options:
>
>  1. Make kvm_arch_dirty_log_tlb_flush() MIPS-only and call
>     kvm_flush_remote_tlbs() directly for arm and x86.  Add comments to
>     explain when an arch should implement
>     kvm_arch_dirty_log_tlb_flush().
>
>  2. Change x86 to use kvm_flush_remote_tlbs_with_address() when
>     flushing a memslot after the dirty log is grabbed by userspace.
>
>  3. Keep the resulting code as is, but add a comment in x86's
>     kvm_arch_dirty_log_tlb_flush() to explain why it uses
>     kvm_flush_remote_tlbs() instead of the with_address() variant.
>
> I strongly prefer (2) or (3), but I'll defer to Vitaly as to which of
> those is preferable.
>
> I don't like (1) because (a) it requires more lines of code (well,
> comments) to explain why kvm_flush_remote_tlbs() is the default, and
> (b) it would require even more comments, which would be x86-specific in
> generic KVM, to explain why x86 doesn't use its with_address() flush,
> or we'd lose that info altogether.

I proposed a 4th solution here:

  https://lore.kernel.org/kvm/20200207223520.735523-1-peterx@redhat.com/

I'm not sure whether that's acceptable, but if it is, then we can drop
the kvm_arch_dirty_log_tlb_flush() hook, or even move on to per-slot TLB
flushing.

Thanks,
On Fri, Feb 07, 2020 at 07:18:32PM -0500, Peter Xu wrote:
> On Fri, Feb 07, 2020 at 11:45:32AM -0800, Sean Christopherson wrote:
> > +Vitaly for HyperV
> >
> > On Thu, Feb 06, 2020 at 04:41:06PM -0500, Peter Xu wrote:
> > > On Thu, Feb 06, 2020 at 01:21:20PM -0800, Sean Christopherson wrote:
> > > > On Thu, Feb 06, 2020 at 03:02:00PM -0500, Peter Xu wrote:
> > > > > But that matters to this patch, because if MIPS can use
> > > > > kvm_flush_remote_tlbs(), then we probably don't need this
> > > > > arch-specific hook any more and can directly call
> > > > > kvm_flush_remote_tlbs() after syncing the dirty log when
> > > > > flush==true.
> > > >
> > > > Ya, the asid_flush_mask in kvm_vz_flush_shadow_all() is the only
> > > > thing that prevents calling kvm_flush_remote_tlbs() directly, but
> > > > I have no clue as to the importance of that code.
> > >
> > > As said above, I think the x86 lockdep is really not necessary;
> > > then, considering MIPS could be the only one that will use the new
> > > hook introduced in this patch...  Shall we figure that out first?
> >
> > So I prepped a follow-up patch to make kvm_arch_dirty_log_tlb_flush()
> > a MIPS-only hook and use kvm_flush_remote_tlbs() directly for arm and
> > x86, but then I realized x86 *has* a hook to do a precise remote TLB
> > flush.  There's even an existing kvm_flush_remote_tlbs_with_address()
> > call on a memslot, i.e. this exact scenario.  So arguably, x86 should
> > be using the more precise flush and should keep
> > kvm_arch_dirty_log_tlb_flush().
> >
> > But, the hook is only used when KVM is running as an L1 on top of
> > HyperV, and I assume dirty logging isn't used much, if at all, for L1
> > KVM on HyperV?
> >
> > I see three options:
> >
> >  1. Make kvm_arch_dirty_log_tlb_flush() MIPS-only and call
> >     kvm_flush_remote_tlbs() directly for arm and x86.  Add comments
> >     to explain when an arch should implement
> >     kvm_arch_dirty_log_tlb_flush().
> >
> >  2. Change x86 to use kvm_flush_remote_tlbs_with_address() when
> >     flushing a memslot after the dirty log is grabbed by userspace.
> >
> >  3. Keep the resulting code as is, but add a comment in x86's
> >     kvm_arch_dirty_log_tlb_flush() to explain why it uses
> >     kvm_flush_remote_tlbs() instead of the with_address() variant.
> >
> > I strongly prefer (2) or (3), but I'll defer to Vitaly as to which of
> > those is preferable.
> >
> > I don't like (1) because (a) it requires more lines of code (well,
> > comments) to explain why kvm_flush_remote_tlbs() is the default, and
> > (b) it would require even more comments, which would be x86-specific
> > in generic KVM, to explain why x86 doesn't use its with_address()
> > flush, or we'd lose that info altogether.
>
> I proposed a 4th solution here:
>
>   https://lore.kernel.org/kvm/20200207223520.735523-1-peterx@redhat.com/
>
> I'm not sure whether that's acceptable, but if it is, then we can drop
> the kvm_arch_dirty_log_tlb_flush() hook, or even move on to per-slot
> TLB flushing.

This effectively is per-slot TLB flushing, it just has a different name,
i.e. s/kvm_arch_dirty_log_tlb_flush/kvm_arch_flush_remote_tlbs_memslot.
I'm not opposed to that name change.  And on second and third glance, I
probably prefer it.  That would more or less follow the naming of
kvm_arch_flush_shadow_all() and kvm_arch_flush_shadow_memslot().

I don't want to go straight to kvm_arch_flush_remote_tlb_with_address(),
because that loses the important distinction (on x86) that slots_lock is
expected to be held.
On Fri, Feb 07, 2020 at 04:42:33PM -0800, Sean Christopherson wrote:
> On Fri, Feb 07, 2020 at 07:18:32PM -0500, Peter Xu wrote:
> > On Fri, Feb 07, 2020 at 11:45:32AM -0800, Sean Christopherson wrote:
> > > +Vitaly for HyperV
> > >
> > > On Thu, Feb 06, 2020 at 04:41:06PM -0500, Peter Xu wrote:
> > > > On Thu, Feb 06, 2020 at 01:21:20PM -0800, Sean Christopherson wrote:
> > > > > On Thu, Feb 06, 2020 at 03:02:00PM -0500, Peter Xu wrote:
> > > > > > But that matters to this patch, because if MIPS can use
> > > > > > kvm_flush_remote_tlbs(), then we probably don't need this
> > > > > > arch-specific hook any more and can directly call
> > > > > > kvm_flush_remote_tlbs() after syncing the dirty log when
> > > > > > flush==true.
> > > > >
> > > > > Ya, the asid_flush_mask in kvm_vz_flush_shadow_all() is the only
> > > > > thing that prevents calling kvm_flush_remote_tlbs() directly,
> > > > > but I have no clue as to the importance of that code.
> > > >
> > > > As said above, I think the x86 lockdep is really not necessary;
> > > > then, considering MIPS could be the only one that will use the new
> > > > hook introduced in this patch...  Shall we figure that out first?
> > >
> > > So I prepped a follow-up patch to make
> > > kvm_arch_dirty_log_tlb_flush() a MIPS-only hook and use
> > > kvm_flush_remote_tlbs() directly for arm and x86, but then I
> > > realized x86 *has* a hook to do a precise remote TLB flush.  There's
> > > even an existing kvm_flush_remote_tlbs_with_address() call on a
> > > memslot, i.e. this exact scenario.  So arguably, x86 should be using
> > > the more precise flush and should keep
> > > kvm_arch_dirty_log_tlb_flush().
> > >
> > > But, the hook is only used when KVM is running as an L1 on top of
> > > HyperV, and I assume dirty logging isn't used much, if at all, for
> > > L1 KVM on HyperV?
> > >
> > > I see three options:
> > >
> > >  1. Make kvm_arch_dirty_log_tlb_flush() MIPS-only and call
> > >     kvm_flush_remote_tlbs() directly for arm and x86.  Add comments
> > >     to explain when an arch should implement
> > >     kvm_arch_dirty_log_tlb_flush().
> > >
> > >  2. Change x86 to use kvm_flush_remote_tlbs_with_address() when
> > >     flushing a memslot after the dirty log is grabbed by userspace.
> > >
> > >  3. Keep the resulting code as is, but add a comment in x86's
> > >     kvm_arch_dirty_log_tlb_flush() to explain why it uses
> > >     kvm_flush_remote_tlbs() instead of the with_address() variant.
> > >
> > > I strongly prefer (2) or (3), but I'll defer to Vitaly as to which
> > > of those is preferable.
> > >
> > > I don't like (1) because (a) it requires more lines of code (well,
> > > comments) to explain why kvm_flush_remote_tlbs() is the default, and
> > > (b) it would require even more comments, which would be x86-specific
> > > in generic KVM, to explain why x86 doesn't use its with_address()
> > > flush, or we'd lose that info altogether.
> >
> > I proposed a 4th solution here:
> >
> >   https://lore.kernel.org/kvm/20200207223520.735523-1-peterx@redhat.com/
> >
> > I'm not sure whether that's acceptable, but if it is, then we can drop
> > the kvm_arch_dirty_log_tlb_flush() hook, or even move on to per-slot
> > TLB flushing.
>
> This effectively is per-slot TLB flushing, it just has a different name,
> i.e. s/kvm_arch_dirty_log_tlb_flush/kvm_arch_flush_remote_tlbs_memslot.
> I'm not opposed to that name change.  And on second and third glance, I
> probably prefer it.  That would more or less follow the naming of
> kvm_arch_flush_shadow_all() and kvm_arch_flush_shadow_memslot().

Note that the major point of the above patchset is not about doing TLB
flush per-memslot or globally.  It's more about whether we can provide a
common entrance for TLB flushing.  Say, after that series, we should be
able to flush the TLB on all archs (majorly, including MIPS) as:

  kvm_flush_remote_tlbs(kvm);

And with the same idea we can also introduce the ranged version.

> I don't want to go straight to kvm_arch_flush_remote_tlb_with_address(),
> because that loses the important distinction (on x86) that slots_lock is
> expected to be held.

Sorry, I'm still puzzled about why that lockdep is so important and
special for x86...  For example, what if we move that lockdep to the
callers of the kvm_arch_dirty_log_tlb_flush() calls so it protects even
more archs (where we do get/clear dirty log)?  IMHO the callers must hold
slots_lock anyway, no matter for x86 or not.

Thanks,
On Fri, Feb 07, 2020 at 07:53:34PM -0500, Peter Xu wrote:
> On Fri, Feb 07, 2020 at 04:42:33PM -0800, Sean Christopherson wrote:
> > On Fri, Feb 07, 2020 at 07:18:32PM -0500, Peter Xu wrote:
> > > On Fri, Feb 07, 2020 at 11:45:32AM -0800, Sean Christopherson wrote:
> > > > +Vitaly for HyperV
> > > >
> > > > On Thu, Feb 06, 2020 at 04:41:06PM -0500, Peter Xu wrote:
> > > > > On Thu, Feb 06, 2020 at 01:21:20PM -0800, Sean Christopherson wrote:
> > > > > > On Thu, Feb 06, 2020 at 03:02:00PM -0500, Peter Xu wrote:
> > > > > > > But that matters to this patch, because if MIPS can use
> > > > > > > kvm_flush_remote_tlbs(), then we probably don't need this
> > > > > > > arch-specific hook any more and can directly call
> > > > > > > kvm_flush_remote_tlbs() after syncing the dirty log when
> > > > > > > flush==true.
> > > > > >
> > > > > > Ya, the asid_flush_mask in kvm_vz_flush_shadow_all() is the
> > > > > > only thing that prevents calling kvm_flush_remote_tlbs()
> > > > > > directly, but I have no clue as to the importance of that
> > > > > > code.
> > > > >
> > > > > As said above, I think the x86 lockdep is really not necessary;
> > > > > then, considering MIPS could be the only one that will use the
> > > > > new hook introduced in this patch...  Shall we figure that out
> > > > > first?
> > > >
> > > > So I prepped a follow-up patch to make
> > > > kvm_arch_dirty_log_tlb_flush() a MIPS-only hook and use
> > > > kvm_flush_remote_tlbs() directly for arm and x86, but then I
> > > > realized x86 *has* a hook to do a precise remote TLB flush.
> > > > There's even an existing kvm_flush_remote_tlbs_with_address() call
> > > > on a memslot, i.e. this exact scenario.  So arguably, x86 should
> > > > be using the more precise flush and should keep
> > > > kvm_arch_dirty_log_tlb_flush().
> > > >
> > > > But, the hook is only used when KVM is running as an L1 on top of
> > > > HyperV, and I assume dirty logging isn't used much, if at all, for
> > > > L1 KVM on HyperV?
> > > >
> > > > I see three options:
> > > >
> > > >  1. Make kvm_arch_dirty_log_tlb_flush() MIPS-only and call
> > > >     kvm_flush_remote_tlbs() directly for arm and x86.  Add
> > > >     comments to explain when an arch should implement
> > > >     kvm_arch_dirty_log_tlb_flush().
> > > >
> > > >  2. Change x86 to use kvm_flush_remote_tlbs_with_address() when
> > > >     flushing a memslot after the dirty log is grabbed by
> > > >     userspace.
> > > >
> > > >  3. Keep the resulting code as is, but add a comment in x86's
> > > >     kvm_arch_dirty_log_tlb_flush() to explain why it uses
> > > >     kvm_flush_remote_tlbs() instead of the with_address() variant.
> > > >
> > > >  I strongly prefer (2) or (3), but I'll defer to Vitaly as to
> > > >  which of those is preferable.
> > > >
> > > >  I don't like (1) because (a) it requires more lines of code
> > > >  (well, comments) to explain why kvm_flush_remote_tlbs() is the
> > > >  default, and (b) it would require even more comments, which would
> > > >  be x86-specific in generic KVM, to explain why x86 doesn't use
> > > >  its with_address() flush, or we'd lose that info altogether.
> > >
> > > I proposed a 4th solution here:
> > >
> > >   https://lore.kernel.org/kvm/20200207223520.735523-1-peterx@redhat.com/
> > >
> > > I'm not sure whether that's acceptable, but if it is, then we can
> > > drop the kvm_arch_dirty_log_tlb_flush() hook, or even move on to
> > > per-slot tlb flushing.
> >
> > This effectively is per-slot TLB flushing, it just has a different
> > name, i.e.
> > s/kvm_arch_dirty_log_tlb_flush/kvm_arch_flush_remote_tlbs_memslot.
> > I'm not opposed to that name change.  And on second and third glance,
> > I probably prefer it.  That would more or less follow the naming of
> > kvm_arch_flush_shadow_all() and kvm_arch_flush_shadow_memslot().
>
> Note that the major point of the above patchset is not about doing TLB
> flush per-memslot or globally.  It's more about whether we can provide a
> common entrance for TLB flushing.  Say, after that series, we should be
> able to flush the TLB on all archs (majorly, including MIPS) as:
>
>   kvm_flush_remote_tlbs(kvm);
>
> And with the same idea we can also introduce the ranged version.
>
> > I don't want to go straight to
> > kvm_arch_flush_remote_tlb_with_address(), because that loses the
> > important distinction (on x86) that slots_lock is expected to be held.
>
> Sorry, I'm still puzzled about why that lockdep is so important and
> special for x86...  For example, what if we move that lockdep to the
> callers of the kvm_arch_dirty_log_tlb_flush() calls so it protects even
> more archs (where we do get/clear dirty log)?  IMHO the callers must
> hold slots_lock anyway, no matter for x86 or not.

Following the breadcrumbs leads to the comment in
kvm_mmu_slot_remove_write_access(), which says:

	/*
	 * kvm_mmu_slot_remove_write_access() and kvm_vm_ioctl_get_dirty_log()
	 * which do tlb flush out of mmu-lock should be serialized by
	 * kvm->slots_lock otherwise tlb flush would be missed.
	 */

I.e. write-protecting a memslot and grabbing the dirty log for the memslot
need to be serialized.  It's quite obvious *now* that get_dirty_log()
holds slots_lock, but the purpose of lockdep assertions isn't just to
verify the current functionality, it's to help ensure the correctness of
future code and to document assumptions in the code.

Digging deeper, there are four functions, all related to dirty logging,
in the x86 mmu that basically open code what x86's
kvm_arch_flush_remote_tlbs_memslot() would look like if it used the
range-based flushing.

Unless it's functionally incorrect (Vitaly?), going with option (2) and
naming the hook kvm_arch_flush_remote_tlbs_memslot() seems like the
obvious choice, e.g. the final cleanup gives this diffstat:

 arch/x86/kvm/mmu/mmu.c | 34 +++++++++-------------------------
 1 file changed, 9 insertions(+), 25 deletions(-)
Sean Christopherson <sean.j.christopherson@intel.com> writes:

> +Vitaly for HyperV
>
> On Thu, Feb 06, 2020 at 04:41:06PM -0500, Peter Xu wrote:
>> On Thu, Feb 06, 2020 at 01:21:20PM -0800, Sean Christopherson wrote:
>> > On Thu, Feb 06, 2020 at 03:02:00PM -0500, Peter Xu wrote:
>> > > But that matters to this patch, because if MIPS can use
>> > > kvm_flush_remote_tlbs(), then we probably don't need this
>> > > arch-specific hook any more and can directly call
>> > > kvm_flush_remote_tlbs() after syncing the dirty log when
>> > > flush==true.
>> >
>> > Ya, the asid_flush_mask in kvm_vz_flush_shadow_all() is the only
>> > thing that prevents calling kvm_flush_remote_tlbs() directly, but I
>> > have no clue as to the importance of that code.
>>
>> As said above, I think the x86 lockdep is really not necessary; then,
>> considering MIPS could be the only one that will use the new hook
>> introduced in this patch...  Shall we figure that out first?
>
> So I prepped a follow-up patch to make kvm_arch_dirty_log_tlb_flush() a
> MIPS-only hook and use kvm_flush_remote_tlbs() directly for arm and x86,
> but then I realized x86 *has* a hook to do a precise remote TLB flush.
> There's even an existing kvm_flush_remote_tlbs_with_address() call on a
> memslot, i.e. this exact scenario.  So arguably, x86 should be using the
> more precise flush and should keep kvm_arch_dirty_log_tlb_flush().
>
> But, the hook is only used when KVM is running as an L1 on top of
> HyperV, and I assume dirty logging isn't used much, if at all, for L1
> KVM on HyperV?

(Sorry for the delayed reply, was traveling last week)

When KVM runs as an L1 on top of Hyper-V it uses eVMCS by default, and
eVMCSv1 doesn't support PML.  I've also just checked Hyper-V 2019 and it
hides SECONDARY_EXEC_ENABLE_PML from guests (this was expected).

> I see three options:
>
>  1. Make kvm_arch_dirty_log_tlb_flush() MIPS-only and call
>     kvm_flush_remote_tlbs() directly for arm and x86.  Add comments to
>     explain when an arch should implement
>     kvm_arch_dirty_log_tlb_flush().
>
>  2. Change x86 to use kvm_flush_remote_tlbs_with_address() when
>     flushing a memslot after the dirty log is grabbed by userspace.
>
>  3. Keep the resulting code as is, but add a comment in x86's
>     kvm_arch_dirty_log_tlb_flush() to explain why it uses
>     kvm_flush_remote_tlbs() instead of the with_address() variant.
>
> I strongly prefer (2) or (3), but I'll defer to Vitaly as to which of
> those is preferable.

I'd vote for (2): while this will effectively be kvm_flush_remote_tlbs()
for now, we may think of something smarter in the future (e.g. a PV
interface for KVM-on-KVM).

> I don't like (1) because (a) it requires more lines of code (well,
> comments) to explain why kvm_flush_remote_tlbs() is the default, and
> (b) it would require even more comments, which would be x86-specific in
> generic KVM, to explain why x86 doesn't use its with_address() flush,
> or we'd lose that info altogether.
Sean Christopherson <sean.j.christopherson@intel.com> writes:

> On Fri, Feb 07, 2020 at 07:53:34PM -0500, Peter Xu wrote:
>> On Fri, Feb 07, 2020 at 04:42:33PM -0800, Sean Christopherson wrote:
>> > On Fri, Feb 07, 2020 at 07:18:32PM -0500, Peter Xu wrote:
>> > > On Fri, Feb 07, 2020 at 11:45:32AM -0800, Sean Christopherson wrote:
>> > > > +Vitaly for HyperV
>> > > >
>> > > > On Thu, Feb 06, 2020 at 04:41:06PM -0500, Peter Xu wrote:
>> > > > > On Thu, Feb 06, 2020 at 01:21:20PM -0800, Sean Christopherson wrote:
>> > > > > > On Thu, Feb 06, 2020 at 03:02:00PM -0500, Peter Xu wrote:
>> > > > > > > But that matters to this patch because if MIPS can use
>> > > > > > > kvm_flush_remote_tlbs(), then we probably don't need this
>> > > > > > > arch-specific hook any more and we can directly call
>> > > > > > > kvm_flush_remote_tlbs() after sync dirty log when flush==true.
>> > > > > >
>> > > > > > Ya, the asid_flush_mask in kvm_vz_flush_shadow_all() is the only thing
>> > > > > > that prevents calling kvm_flush_remote_tlbs() directly, but I have no
>> > > > > > clue as to the importance of that code.
>> > > > >
>> > > > > As said above I think the x86 lockdep is really not necessary, then
>> > > > > considering MIPS could be the only one that will use the new hook
>> > > > > introduced in this patch... Shall we figure that out first?
>> > > >
>> > > > So I prepped a follow-up patch to make kvm_arch_dirty_log_tlb_flush() a
>> > > > MIPS-only hook and use kvm_flush_remote_tlbs() directly for arm and x86,
>> > > > but then I realized x86 *has* a hook to do a precise remote TLB flush.
>> > > > There's even an existing kvm_flush_remote_tlbs_with_address() call on a
>> > > > memslot, i.e. this exact scenario. So arguably, x86 should be using the
>> > > > more precise flush and should keep kvm_arch_dirty_log_tlb_flush().
>> > > >
>> > > > But, the hook is only used when KVM is running as an L1 on top of HyperV,
>> > > > and I assume dirty logging isn't used much, if at all, for L1 KVM on
>> > > > HyperV?
>> > > >
>> > > > I see three options:
>> > > >
>> > > > 1. Make kvm_arch_dirty_log_tlb_flush() MIPS-only and call
>> > > >    kvm_flush_remote_tlbs() directly for arm and x86. Add comments to
>> > > >    explain when an arch should implement kvm_arch_dirty_log_tlb_flush().
>> > > >
>> > > > 2. Change x86 to use kvm_flush_remote_tlbs_with_address() when flushing
>> > > >    a memslot after the dirty log is grabbed by userspace.
>> > > >
>> > > > 3. Keep the resulting code as is, but add a comment in x86's
>> > > >    kvm_arch_dirty_log_tlb_flush() to explain why it uses
>> > > >    kvm_flush_remote_tlbs() instead of the with_address() variant.
>> > > >
>> > > > I strongly prefer (2) or (3), but I'll defer to Vitaly as to which of
>> > > > those is preferable.
>> > > >
>> > > > I don't like (1) because (a) it requires more lines of code (well, comments)
>> > > > to explain why kvm_flush_remote_tlbs() is the default, and (b) it would
>> > > > require even more comments, which would be x86-specific in generic KVM,
>> > > > to explain why x86 doesn't use its with_address() flush, or we'd lose that
>> > > > info altogether.
>> > > >
>> > >
>> > > I proposed the 4th solution here:
>> > >
>> > > https://lore.kernel.org/kvm/20200207223520.735523-1-peterx@redhat.com/
>> > >
>> > > I'm not sure whether that's acceptable, but if it is, then we can
>> > > drop the kvm_arch_dirty_log_tlb_flush() hook, or even move on to
>> > > per-slot tlb flushing.
>> >
>> > This effectively is per-slot TLB flushing, it just has a different name.
>> > I.e. s/kvm_arch_dirty_log_tlb_flush/kvm_arch_flush_remote_tlbs_memslot.
>> > I'm not opposed to that name change. And on second and third glance, I
>> > probably prefer it. That would more or less follow the naming of
>> > kvm_arch_flush_shadow_all() and kvm_arch_flush_shadow_memslot().
>>
>> Note that the major point of the above patchset is not about doing tlb
>> flush per-memslot or globally. It's more about whether we can provide
>> a common entrance for TLB flushing. Say, after that series, we should
>> be able to flush TLB on all archs (majorly, including MIPS) as:
>>
>>   kvm_flush_remote_tlbs(kvm);
>>
>> And with the same idea we can also introduce the ranged version.
>>
>> >
>> > I don't want to go straight to kvm_arch_flush_remote_tlb_with_address()
>> > because that loses the important distinction (on x86) that slots_lock is
>> > expected to be held.
>>
>> Sorry, I'm still puzzled on why that lockdep is so important and
>> special for x86... For example, what if we move that lockdep to the
>> callers of the kvm_arch_dirty_log_tlb_flush() calls so it protects
>> even more archs (where we do get/clear dirty log)? IMHO the callers
>> must be holding slots_lock anyways, no matter for x86 or not.
>
> Following the breadcrumbs leads to the comment in
> kvm_mmu_slot_remove_write_access(), which says:
>
> 	/*
> 	 * kvm_mmu_slot_remove_write_access() and kvm_vm_ioctl_get_dirty_log()
> 	 * which do tlb flush out of mmu-lock should be serialized by
> 	 * kvm->slots_lock otherwise tlb flush would be missed.
> 	 */
>
> I.e. write-protecting a memslot and grabbing the dirty log for the memslot
> need to be serialized. It's quite obvious *now* that get_dirty_log() holds
> slots_lock, but the purpose of lockdep assertions isn't just to verify the
> current functionality, it's to help ensure the correctness of future code
> and to document assumptions in the code.
>
> Digging deeper, there are four functions, all related to dirty logging, in
> the x86 mmu that basically open code what x86's
> kvm_arch_flush_remote_tlbs_memslot() would look like if it uses the range
> based flushing.
>
> Unless it's functionally incorrect (Vitaly?), going with option (2) and
> naming the hook kvm_arch_flush_remote_tlbs_memslot() seems like the obvious
> choice, e.g. the final cleanup gives this diff stat:

(I apologize again for not replying in time)

I think this is a valid approach and your option (2) would also be my
choice. I also don't think there's going to be a problem when (if)
Hyper-V adds support for PML (eVMCSv2?).

>
>  arch/x86/kvm/mmu/mmu.c | 34 +++++++++------------------------
>  1 file changed, 9 insertions(+), 25 deletions(-)
>

Looks nice :-)
On Mon, Feb 17, 2020 at 04:39:39PM +0100, Vitaly Kuznetsov wrote:
> Sean Christopherson <sean.j.christopherson@intel.com> writes:
> > Unless it's functionally incorrect (Vitaly?), going with option (2) and
> > naming the hook kvm_arch_flush_remote_tlbs_memslot() seems like the obvious
> > choice, e.g. the final cleanup gives this diff stat:
>
> (I apologize again for not replying in time)

No worries, didn't hinder me in the slightest as I was buried in other
stuff last week anyways.

> I think this is a valid approach and your option (2) would also be my
> choice. I also don't think there's going to be a problem when (if)
> Hyper-V adds support for PML (eVMCSv2?).

Cool, thanks!
diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c
index 908f7ec3e755..8d2da949476f 100644
--- a/arch/mips/kvm/mips.c
+++ b/arch/mips/kvm/mips.c
@@ -962,69 +962,16 @@ long kvm_arch_vcpu_ioctl(struct file *filp, unsigned int ioctl,
 	return r;
 }
 
-/**
- * kvm_vm_ioctl_get_dirty_log - get and clear the log of dirty pages in a slot
- * @kvm:	kvm instance
- * @log:	slot id and address to which we copy the log
- *
- * Steps 1-4 below provide general overview of dirty page logging. See
- * kvm_get_dirty_log_protect() function description for additional details.
- *
- * We call kvm_get_dirty_log_protect() to handle steps 1-3, upon return we
- * always flush the TLB (step 4) even if previous step failed and the dirty
- * bitmap may be corrupt. Regardless of previous outcome the KVM logging API
- * does not preclude user space subsequent dirty log read. Flushing TLB ensures
- * writes will be marked dirty for next log read.
- *
- *  1. Take a snapshot of the bit and clear it if needed.
- *  2. Write protect the corresponding page.
- *  3. Copy the snapshot to the userspace.
- *  4. Flush TLB's if needed.
- */
-int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
+void kvm_arch_sync_dirty_log(struct kvm *kvm, struct kvm_memory_slot *memslot)
 {
-	struct kvm_memslots *slots;
-	struct kvm_memory_slot *memslot;
-	bool flush = false;
-	int r;
 
-	mutex_lock(&kvm->slots_lock);
-
-	r = kvm_get_dirty_log_protect(kvm, log, &flush);
-
-	if (flush) {
-		slots = kvm_memslots(kvm);
-		memslot = id_to_memslot(slots, log->slot);
-
-		/* Let implementation handle TLB/GVA invalidation */
-		kvm_mips_callbacks->flush_shadow_memslot(kvm, memslot);
-	}
-
-	mutex_unlock(&kvm->slots_lock);
-	return r;
 }
 
-int kvm_vm_ioctl_clear_dirty_log(struct kvm *kvm, struct kvm_clear_dirty_log *log)
+void kvm_arch_dirty_log_tlb_flush(struct kvm *kvm,
+				  struct kvm_memory_slot *memslot)
 {
-	struct kvm_memslots *slots;
-	struct kvm_memory_slot *memslot;
-	bool flush = false;
-	int r;
-
-	mutex_lock(&kvm->slots_lock);
-
-	r = kvm_clear_dirty_log_protect(kvm, log, &flush);
-
-	if (flush) {
-		slots = kvm_memslots(kvm);
-		memslot = id_to_memslot(slots, log->slot);
-
-		/* Let implementation handle TLB/GVA invalidation */
-		kvm_mips_callbacks->flush_shadow_memslot(kvm, memslot);
-	}
-
-	mutex_unlock(&kvm->slots_lock);
-	return r;
+	/* Let implementation handle TLB/GVA invalidation */
+	kvm_mips_callbacks->flush_shadow_memslot(kvm, memslot);
 }
 
 long kvm_arch_vm_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg)
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index 97ce6c4f7b48..0adaf4791a6d 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -799,6 +799,11 @@ int kvmppc_core_check_requests(struct kvm_vcpu *vcpu)
 	return vcpu->kvm->arch.kvm_ops->check_requests(vcpu);
 }
 
+void kvm_arch_sync_dirty_log(struct kvm *kvm, struct kvm_memory_slot *memslot)
+{
+
+}
+
 int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
 {
 	return kvm->arch.kvm_ops->get_dirty_log(kvm, log);
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 24212b6ab03f..08b707a08357 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -1766,6 +1766,11 @@ int kvm_arch_vcpu_ioctl_translate(struct kvm_vcpu *vcpu,
 	return r;
 }
 
+void kvm_arch_sync_dirty_log(struct kvm *kvm, struct kvm_memory_slot *memslot)
+{
+
+}
+
 int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
 {
 	return -ENOTSUPP;
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 1bfbeac13a3b..dacc13bb4465 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -569,8 +569,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	return r;
 }
 
-static void kvm_s390_sync_dirty_log(struct kvm *kvm,
-				    struct kvm_memory_slot *memslot)
+void kvm_arch_sync_dirty_log(struct kvm *kvm, struct kvm_memory_slot *memslot)
 {
 	int i;
 	gfn_t cur_gfn, last_gfn;
@@ -630,7 +629,7 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm,
 	if (!memslot->dirty_bitmap)
 		goto out;
 
-	kvm_s390_sync_dirty_log(kvm, memslot);
+	kvm_arch_sync_dirty_log(kvm, memslot);
 	r = kvm_get_dirty_log(kvm, log, &is_dirty);
 	if (r)
 		goto out;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index fe914655b9d0..07f7d6458b89 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4737,77 +4737,24 @@ static int kvm_vm_ioctl_reinject(struct kvm *kvm,
 	return 0;
 }
 
-/**
- * kvm_vm_ioctl_get_dirty_log - get and clear the log of dirty pages in a slot
- * @kvm:	kvm instance
- * @log:	slot id and address to which we copy the log
- *
- * Steps 1-4 below provide general overview of dirty page logging. See
- * kvm_get_dirty_log_protect() function description for additional details.
- *
- * We call kvm_get_dirty_log_protect() to handle steps 1-3, upon return we
- * always flush the TLB (step 4) even if previous step failed and the dirty
- * bitmap may be corrupt. Regardless of previous outcome the KVM logging API
- * does not preclude user space subsequent dirty log read. Flushing TLB ensures
- * writes will be marked dirty for next log read.
- *
- *  1. Take a snapshot of the bit and clear it if needed.
- *  2. Write protect the corresponding page.
- *  3. Copy the snapshot to the userspace.
- *  4. Flush TLB's if needed.
- */
-int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
+void kvm_arch_sync_dirty_log(struct kvm *kvm, struct kvm_memory_slot *memslot)
 {
-	bool flush = false;
-	int r;
-
-	mutex_lock(&kvm->slots_lock);
-
 	/*
 	 * Flush potentially hardware-cached dirty pages to dirty_bitmap.
 	 */
 	if (kvm_x86_ops->flush_log_dirty)
 		kvm_x86_ops->flush_log_dirty(kvm);
-
-	r = kvm_get_dirty_log_protect(kvm, log, &flush);
-
-	/*
-	 * All the TLBs can be flushed out of mmu lock, see the comments in
-	 * kvm_mmu_slot_remove_write_access().
-	 */
-	lockdep_assert_held(&kvm->slots_lock);
-	if (flush)
-		kvm_flush_remote_tlbs(kvm);
-
-	mutex_unlock(&kvm->slots_lock);
-	return r;
 }
 
-int kvm_vm_ioctl_clear_dirty_log(struct kvm *kvm, struct kvm_clear_dirty_log *log)
+void kvm_arch_dirty_log_tlb_flush(struct kvm *kvm,
+				  struct kvm_memory_slot *memslot)
 {
-	bool flush = false;
-	int r;
-
-	mutex_lock(&kvm->slots_lock);
-
-	/*
-	 * Flush potentially hardware-cached dirty pages to dirty_bitmap.
-	 */
-	if (kvm_x86_ops->flush_log_dirty)
-		kvm_x86_ops->flush_log_dirty(kvm);
-
-	r = kvm_clear_dirty_log_protect(kvm, log, &flush);
-
 	/*
 	 * All the TLBs can be flushed out of mmu lock, see the comments in
 	 * kvm_mmu_slot_remove_write_access().
 	 */
 	lockdep_assert_held(&kvm->slots_lock);
-	if (flush)
-		kvm_flush_remote_tlbs(kvm);
-
-	mutex_unlock(&kvm->slots_lock);
-	return r;
+	kvm_flush_remote_tlbs(kvm);
 }
 
 int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_event,
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index dac96f9c0a82..a56ba1a3c8d0 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -811,23 +811,20 @@ vm_fault_t kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf);
 
 int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext);
 
-int kvm_get_dirty_log(struct kvm *kvm,
-			struct kvm_dirty_log *log, int *is_dirty);
-
-int kvm_get_dirty_log_protect(struct kvm *kvm,
-			struct kvm_dirty_log *log, bool *flush);
-int kvm_clear_dirty_log_protect(struct kvm *kvm,
-			struct kvm_clear_dirty_log *log, bool *flush);
-
 void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm,
 					struct kvm_memory_slot *slot,
 					gfn_t gfn_offset,
 					unsigned long mask);
+void kvm_arch_sync_dirty_log(struct kvm *kvm, struct kvm_memory_slot *memslot);
 
-int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm,
-				struct kvm_dirty_log *log);
-int kvm_vm_ioctl_clear_dirty_log(struct kvm *kvm,
-				struct kvm_clear_dirty_log *log);
+#ifdef CONFIG_KVM_GENERIC_DIRTYLOG_READ_PROTECT
+void kvm_arch_dirty_log_tlb_flush(struct kvm *kvm,
+				  struct kvm_memory_slot *memslot);
+#else /* !CONFIG_KVM_GENERIC_DIRTYLOG_READ_PROTECT */
+int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log);
+int kvm_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log,
+		      int *is_dirty);
+#endif
 
 int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_level,
 			bool line_status);
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index 3ff510599af6..2c9e1c12e53e 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -1185,55 +1185,15 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 	return r;
 }
 
-/**
- * kvm_vm_ioctl_get_dirty_log - get and clear the log of dirty pages in a slot
- * @kvm:	kvm instance
- * @log:	slot id and address to which we copy the log
- *
- * Steps 1-4 below provide general overview of dirty page logging. See
- * kvm_get_dirty_log_protect() function description for additional details.
- *
- * We call kvm_get_dirty_log_protect() to handle steps 1-3, upon return we
- * always flush the TLB (step 4) even if previous step failed and the dirty
- * bitmap may be corrupt. Regardless of previous outcome the KVM logging API
- * does not preclude user space subsequent dirty log read. Flushing TLB ensures
- * writes will be marked dirty for next log read.
- *
- *  1. Take a snapshot of the bit and clear it if needed.
- *  2. Write protect the corresponding page.
- *  3. Copy the snapshot to the userspace.
- *  4. Flush TLB's if needed.
- */
-int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
+void kvm_arch_sync_dirty_log(struct kvm *kvm, struct kvm_memory_slot *memslot)
 {
-	bool flush = false;
-	int r;
 
-	mutex_lock(&kvm->slots_lock);
-
-	r = kvm_get_dirty_log_protect(kvm, log, &flush);
-
-	if (flush)
-		kvm_flush_remote_tlbs(kvm);
-
-	mutex_unlock(&kvm->slots_lock);
-	return r;
 }
 
-int kvm_vm_ioctl_clear_dirty_log(struct kvm *kvm, struct kvm_clear_dirty_log *log)
+void kvm_arch_dirty_log_tlb_flush(struct kvm *kvm,
+				  struct kvm_memory_slot *memslot)
 {
-	bool flush = false;
-	int r;
-
-	mutex_lock(&kvm->slots_lock);
-
-	r = kvm_clear_dirty_log_protect(kvm, log, &flush);
-
-	if (flush)
-		kvm_flush_remote_tlbs(kvm);
-
-	mutex_unlock(&kvm->slots_lock);
-	return r;
+	kvm_flush_remote_tlbs(kvm);
 }
 
 static int kvm_vm_ioctl_set_device_addr(struct kvm *kvm,
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 6210738cf2f6..60692588f0fc 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -855,7 +855,7 @@ static int kvm_vm_release(struct inode *inode, struct file *filp)
 
 /*
  * Allocation size is twice as large as the actual dirty bitmap size.
- * See x86's kvm_vm_ioctl_get_dirty_log() why this is needed.
+ * See kvm_vm_ioctl_get_dirty_log() why this is needed.
  */
 static int kvm_create_dirty_bitmap(struct kvm_memory_slot *memslot)
 {
@@ -1197,6 +1197,7 @@ static int kvm_vm_ioctl_set_memory_region(struct kvm *kvm,
 	return kvm_set_memory_region(kvm, mem);
 }
 
+#ifndef CONFIG_KVM_GENERIC_DIRTYLOG_READ_PROTECT
 int kvm_get_dirty_log(struct kvm *kvm,
 			struct kvm_dirty_log *log, int *is_dirty)
 {
@@ -1230,13 +1231,12 @@ int kvm_get_dirty_log(struct kvm *kvm,
 }
 EXPORT_SYMBOL_GPL(kvm_get_dirty_log);
 
-#ifdef CONFIG_KVM_GENERIC_DIRTYLOG_READ_PROTECT
+#else /* CONFIG_KVM_GENERIC_DIRTYLOG_READ_PROTECT */
 /**
  * kvm_get_dirty_log_protect - get a snapshot of dirty pages
  *	and reenable dirty page tracking for the corresponding pages.
  * @kvm:	pointer to kvm instance
  * @log:	slot id and address to which we copy the log
- * @flush:	true if TLB flush is needed by caller
  *
  * We need to keep it in mind that VCPU threads can write to the bitmap
  * concurrently. So, to avoid losing track of dirty pages we keep the
@@ -1253,8 +1253,7 @@ EXPORT_SYMBOL_GPL(kvm_get_dirty_log);
  * exiting to userspace will be logged for the next call.
  *
  */
-int kvm_get_dirty_log_protect(struct kvm *kvm,
-			struct kvm_dirty_log *log, bool *flush)
+static int kvm_get_dirty_log_protect(struct kvm *kvm, struct kvm_dirty_log *log)
 {
 	struct kvm_memslots *slots;
 	struct kvm_memory_slot *memslot;
@@ -1262,6 +1261,7 @@ int kvm_get_dirty_log_protect(struct kvm *kvm,
 	unsigned long n;
 	unsigned long *dirty_bitmap;
 	unsigned long *dirty_bitmap_buffer;
+	bool flush;
 
 	as_id = log->slot >> 16;
 	id = (u16)log->slot;
@@ -1275,8 +1275,10 @@ int kvm_get_dirty_log_protect(struct kvm *kvm,
 	if (!dirty_bitmap)
 		return -ENOENT;
 
+	kvm_arch_sync_dirty_log(kvm, memslot);
+
 	n = kvm_dirty_bitmap_bytes(memslot);
-	*flush = false;
+	flush = false;
 	if (kvm->manual_dirty_log_protect) {
 		/*
 		 * Unlike kvm_get_dirty_log, we always return false in *flush,
@@ -1299,7 +1301,7 @@ int kvm_get_dirty_log_protect(struct kvm *kvm,
 			if (!dirty_bitmap[i])
 				continue;
 
-			*flush = true;
+			flush = true;
 			mask = xchg(&dirty_bitmap[i], 0);
 			dirty_bitmap_buffer[i] = mask;
 
@@ -1310,21 +1312,55 @@ int kvm_get_dirty_log_protect(struct kvm *kvm,
 		spin_unlock(&kvm->mmu_lock);
 	}
 
+	if (flush)
+		kvm_arch_dirty_log_tlb_flush(kvm, memslot);
+
 	if (copy_to_user(log->dirty_bitmap, dirty_bitmap_buffer, n))
 		return -EFAULT;
 	return 0;
 }
-EXPORT_SYMBOL_GPL(kvm_get_dirty_log_protect);
+
+
+/**
+ * kvm_vm_ioctl_get_dirty_log - get and clear the log of dirty pages in a slot
+ * @kvm:	kvm instance
+ * @log:	slot id and address to which we copy the log
+ *
+ * Steps 1-4 below provide general overview of dirty page logging. See
+ * kvm_get_dirty_log_protect() function description for additional details.
+ *
+ * We call kvm_get_dirty_log_protect() to handle steps 1-3, upon return we
+ * always flush the TLB (step 4) even if previous step failed and the dirty
+ * bitmap may be corrupt. Regardless of previous outcome the KVM logging API
+ * does not preclude user space subsequent dirty log read. Flushing TLB ensures
+ * writes will be marked dirty for next log read.
+ *
+ *  1. Take a snapshot of the bit and clear it if needed.
+ *  2. Write protect the corresponding page.
+ *  3. Copy the snapshot to the userspace.
+ *  4. Flush TLB's if needed.
+ */
+static int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm,
+				      struct kvm_dirty_log *log)
+{
+	int r;
+
+	mutex_lock(&kvm->slots_lock);
+
+	r = kvm_get_dirty_log_protect(kvm, log);
+
+	mutex_unlock(&kvm->slots_lock);
+	return r;
+}
 
 /**
  * kvm_clear_dirty_log_protect - clear dirty bits in the bitmap
  *	and reenable dirty page tracking for the corresponding pages.
  * @kvm:	pointer to kvm instance
  * @log:	slot id and address from which to fetch the bitmap of dirty pages
- * @flush:	true if TLB flush is needed by caller
 */
-int kvm_clear_dirty_log_protect(struct kvm *kvm,
-				struct kvm_clear_dirty_log *log, bool *flush)
+static int kvm_clear_dirty_log_protect(struct kvm *kvm,
+				       struct kvm_clear_dirty_log *log)
 {
 	struct kvm_memslots *slots;
 	struct kvm_memory_slot *memslot;
@@ -1333,6 +1369,7 @@ int kvm_clear_dirty_log_protect(struct kvm *kvm,
 	unsigned long i, n;
 	unsigned long *dirty_bitmap;
 	unsigned long *dirty_bitmap_buffer;
+	bool flush;
 
 	as_id = log->slot >> 16;
 	id = (u16)log->slot;
@@ -1356,7 +1393,9 @@ int kvm_clear_dirty_log_protect(struct kvm *kvm,
 	    (log->num_pages < memslot->npages - log->first_page &&
 	     (log->num_pages & 63)))
 		return -EINVAL;
-	*flush = false;
+	kvm_arch_sync_dirty_log(kvm, memslot);
+
+	flush = false;
 	dirty_bitmap_buffer = kvm_second_dirty_bitmap(memslot);
 	if (copy_from_user(dirty_bitmap_buffer, log->dirty_bitmap, n))
 		return -EFAULT;
@@ -1379,17 +1418,32 @@ int kvm_clear_dirty_log_protect(struct kvm *kvm,
 		 * a problem if userspace sets them in log->dirty_bitmap.
 		 */
 		if (mask) {
-			*flush = true;
+			flush = true;
 			kvm_arch_mmu_enable_log_dirty_pt_masked(kvm, memslot,
 								offset, mask);
 		}
 	}
 	spin_unlock(&kvm->mmu_lock);
 
+	if (flush)
+		kvm_arch_dirty_log_tlb_flush(kvm, memslot);
+
 	return 0;
 }
-EXPORT_SYMBOL_GPL(kvm_clear_dirty_log_protect);
-#endif
+
+static int kvm_vm_ioctl_clear_dirty_log(struct kvm *kvm,
+					struct kvm_clear_dirty_log *log)
+{
+	int r;
+
+	mutex_lock(&kvm->slots_lock);
+
+	r = kvm_clear_dirty_log_protect(kvm, log);
+
+	mutex_unlock(&kvm->slots_lock);
+	return r;
+}
+#endif /* CONFIG_KVM_GENERIC_DIRTYLOG_READ_PROTECT */
 
 bool kvm_largepages_enabled(void)
 {