| Message ID | 20110603150346.17011.70545.stgit@ubuntu (mailing list archive) |
|---|---|
| State | New, archived |
On 06/03/2011 06:03 PM, Christoffer Dall wrote:
> Initializes a blank level-1 translation table for the second stage
> translation and handles freeing it as well.
>
> +	start = (unsigned long)kvm,
> +	end = start + sizeof(struct kvm);
> +	ret = create_hyp_mappings(kvm_hyp_pgd, start, end);

Why not map all GFP_KERNEL memory?

On Sun, Jun 5, 2011 at 2:41 PM, Avi Kivity <avi@redhat.com> wrote:
> On 06/03/2011 06:03 PM, Christoffer Dall wrote:
>>
>> Initializes a blank level-1 translation table for the second stage
>> translation and handles freeing it as well.
>>
>> +	start = (unsigned long)kvm,
>> +	end = start + sizeof(struct kvm);
>> +	ret = create_hyp_mappings(kvm_hyp_pgd, start, end);
>
> Why not map all GFP_KERNEL memory?
>
I wanted to only map things I was sure would be there and stay there,
so no assumptions were made about existing pages which could have been
removed, since I don't handle aborts taken in the hypervisor itself.
But, if it would be as safe to map all GFP_KERNEL memory and that also
maps the necessary code segments, then we could do that. Do you feel
it would be simpler/faster/easier?

On 06/05/2011 05:50 PM, Christoffer Dall wrote:
> On Sun, Jun 5, 2011 at 2:41 PM, Avi Kivity<avi@redhat.com> wrote:
> > On 06/03/2011 06:03 PM, Christoffer Dall wrote:
> >>
> >> Initializes a blank level-1 translation table for the second stage
> >> translation and handles freeing it as well.
> >>
> >> +	start = (unsigned long)kvm,
> >> +	end = start + sizeof(struct kvm);
> >> +	ret = create_hyp_mappings(kvm_hyp_pgd, start, end);
> >
> > Why not map all GFP_KERNEL memory?
> >
> I wanted to only map things I was sure would be there and stay there,
> so no assumptions were made about existing pages which could have been
> removed, since I don't handle aborts taken in the hypervisor itself.
> But, if it would be as safe to map all GFP_KERNEL memory and that also
> maps the necessary code segments, then we could do that. Do you feel
> it would be simpler/faster/easier?

I think so - you wouldn't have to worry about dereferencing pointers
within the vcpu structure.

Of course, it's up to you; I don't really have enough understanding of
the architecture.

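For concreteness, here is a minimal sketch of what the "map all GFP_KERNEL memory" alternative could look like, assuming the create_hyp_mappings() helper keeps the signature shown in the patch and that GFP_KERNEL allocations all fall inside the kernel's lowmem linear map between PAGE_OFFSET and high_memory; the function name below is hypothetical and not part of this series:

```c
/*
 * Hypothetical sketch (not from this series): map the whole kernel
 * lowmem linear map into the Hyp translation tables once at init time,
 * so individual kvm/vcpu structures never need separate hyp mappings.
 * Assumes create_hyp_mappings() has the signature used in the patch
 * and that GFP_KERNEL memory lies in [PAGE_OFFSET, high_memory).
 */
static int map_kernel_lowmem_to_hyp(pgd_t *hyp_pgd)
{
	unsigned long start = PAGE_OFFSET;
	unsigned long end = (unsigned long)high_memory;

	return create_hyp_mappings(hyp_pgd, start, end);
}
```
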
On 06/05/2011 05:53 PM, Avi Kivity wrote:
> On 06/05/2011 05:50 PM, Christoffer Dall wrote:
>> On Sun, Jun 5, 2011 at 2:41 PM, Avi Kivity<avi@redhat.com> wrote:
>> > On 06/03/2011 06:03 PM, Christoffer Dall wrote:
>> >>
>> >> Initializes a blank level-1 translation table for the second stage
>> >> translation and handles freeing it as well.
>> >>
>> >> +	start = (unsigned long)kvm,
>> >> +	end = start + sizeof(struct kvm);
>> >> +	ret = create_hyp_mappings(kvm_hyp_pgd, start, end);
>> >
>> > Why not map all GFP_KERNEL memory?
>> >
>> I wanted to only map things I was sure would be there and stay there,
>> so no assumptions were made about existing pages which could have been
>> removed, since I don't handle aborts taken in the hypervisor itself.
>> But, if it would be as safe to map all GFP_KERNEL memory and that also
>> maps the necessary code segments, then we could do that. Do you feel
>> it would be simpler/faster/easier?
>
> I think so - you wouldn't have to worry about dereferencing pointers
> within the vcpu structure.

Also, you could use huge pages for the mapping, yes? That should improve
switching performance a bit.

Can you run the host kernel in hypervisor mode? That may reduce
switching time even further.

On Sun, Jun 5, 2011 at 5:14 PM, Avi Kivity <avi@redhat.com> wrote:
> On 06/05/2011 05:53 PM, Avi Kivity wrote:
>>
>> On 06/05/2011 05:50 PM, Christoffer Dall wrote:
>>>
>>> On Sun, Jun 5, 2011 at 2:41 PM, Avi Kivity<avi@redhat.com> wrote:
>>> > On 06/03/2011 06:03 PM, Christoffer Dall wrote:
>>> >>
>>> >> Initializes a blank level-1 translation table for the second stage
>>> >> translation and handles freeing it as well.
>>> >>
>>> >> +	start = (unsigned long)kvm,
>>> >> +	end = start + sizeof(struct kvm);
>>> >> +	ret = create_hyp_mappings(kvm_hyp_pgd, start, end);
>>> >
>>> > Why not map all GFP_KERNEL memory?
>>> >
>>> I wanted to only map things I was sure would be there and stay there,
>>> so no assumptions were made about existing pages which could have been
>>> removed, since I don't handle aborts taken in the hypervisor itself.
>>> But, if it would be as safe to map all GFP_KERNEL memory and that also
>>> maps the necessary code segments, then we could do that. Do you feel
>>> it would be simpler/faster/easier?
>>
>> I think so - you wouldn't have to worry about dereferencing pointers
>> within the vcpu structure.
>
> Also, you could use huge pages for the mapping, yes? That should improve
> switching performance a bit.

Well, the only advantage here would be to save a few entries in the
TLB, right? So really it would only help if the data and the code and
such accessed during switches lie within the same sections, which could
happen, but could also not happen. I don't see a big performance gain
here, just slightly more complicated code. For instance, if the VCPU
struct is mapped using a page, not a section mapping, I cannot use
section mappings, since I would map weird things. So I would have to
support both for allocating and freeing.

I suggest keeping this in place for now and experimenting with
performance later on to see if there's a gain. Mapping all of
GFP_KERNEL memory could be good to prevent bugs, but I would like to
confirm with the ARM memory experts that it is in fact a good idea.

> Can you run the host kernel in hypervisor mode? That may reduce
> switching time even further.

No, I think the implications would be way too widespread all over the kernel.

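To illustrate the dual-path handling described above, a sketch of choosing a section mapping only when a range happens to be section-aligned; both helper names here are hypothetical and are not functions from this series:

```c
/*
 * Illustrative only: fall back to page-level hyp mappings when a range
 * is not aligned to a section (PMD_SIZE on LPAE).  Both helpers below
 * are hypothetical names, not part of this patch series.
 */
static int map_hyp_range(pgd_t *hyp_pgd, unsigned long start,
			 unsigned long end)
{
	if (IS_ALIGNED(start, PMD_SIZE) && IS_ALIGNED(end, PMD_SIZE))
		return create_hyp_section_mappings(hyp_pgd, start, end);

	/* e.g. a single vcpu struct: too small/unaligned for a section */
	return create_hyp_page_mappings(hyp_pgd, start, end);
}
```
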
On 06/05/2011 06:27 PM, Christoffer Dall wrote:
> On Sun, Jun 5, 2011 at 5:14 PM, Avi Kivity<avi@redhat.com> wrote:
> > On 06/05/2011 05:53 PM, Avi Kivity wrote:
> >>
> >> On 06/05/2011 05:50 PM, Christoffer Dall wrote:
> >>>
> >>> On Sun, Jun 5, 2011 at 2:41 PM, Avi Kivity<avi@redhat.com> wrote:
> >>> > On 06/03/2011 06:03 PM, Christoffer Dall wrote:
> >>> >>
> >>> >> Initializes a blank level-1 translation table for the second stage
> >>> >> translation and handles freeing it as well.
> >>> >>
> >>> >> +	start = (unsigned long)kvm,
> >>> >> +	end = start + sizeof(struct kvm);
> >>> >> +	ret = create_hyp_mappings(kvm_hyp_pgd, start, end);
> >>> >
> >>> > Why not map all GFP_KERNEL memory?
> >>> >
> >>> I wanted to only map things I was sure would be there and stay there,
> >>> so no assumptions were made about existing pages which could have been
> >>> removed, since I don't handle aborts taken in the hypervisor itself.
> >>> But, if it would be as safe to map all GFP_KERNEL memory and that also
> >>> maps the necessary code segments, then we could do that. Do you feel
> >>> it would be simpler/faster/easier?
> >>
> >> I think so - you wouldn't have to worry about dereferencing pointers
> >> within the vcpu structure.
> >
> > Also, you could use huge pages for the mapping, yes? That should improve
> > switching performance a bit.
>
> Well, the only advantage here would be to save a few entries in the
> TLB, right? So really it would only help if the data and the code and
> such accessed during switches lie within the same sections, which could
> happen, but could also not happen. I don't see a big performance gain
> here, just slightly more complicated code. For instance, if the VCPU
> struct is mapped using a page, not a section mapping, I cannot use
> section mappings, since I would map weird things. So I would have to
> support both for allocating and freeing.
>
> I suggest keeping this in place for now and experimenting with
> performance later on to see if there's a gain. Mapping all of
> GFP_KERNEL memory could be good to prevent bugs, but I would like to
> confirm with the ARM memory experts that it is in fact a good idea.

Sure. All of my arch-related comments are made from ignorance anyway;
feel free to ignore or use them as you like. The only important ones
are those related to the API.

> > Can you run the host kernel in hypervisor mode? That may reduce
> > switching time even further.
>
> No, I think the implications would be way too widespread all over the kernel.

Okay.

```diff
diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 9fa9b20..5955ff4 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -31,7 +31,9 @@ struct kvm_vcpu;
 u32* kvm_vcpu_reg(struct kvm_vcpu *vcpu, u8 reg_num, u32 mode);
 
 struct kvm_arch {
-	pgd_t *pgd;	/* 1-level 2nd stage table */
+	u32    vmid;	/* The VMID used for the virt. memory system */
+	pgd_t *pgd;	/* 1-level 2nd stage table */
+	u64    vttbr;	/* VTTBR value associated with above pgd and vmid */
 };
 
 #define EXCEPTION_NONE      0
diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index d22aad0..a64ab2d 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -37,4 +37,9 @@ void remove_hyp_mappings(pgd_t *hyp_pgd,
 				unsigned long end);
 void free_hyp_pmds(pgd_t *hyp_pgd);
 
+int kvm_alloc_stage2_pgd(struct kvm *kvm);
+void kvm_free_stage2_pgd(struct kvm *kvm);
+
+int kvm_handle_guest_abort(struct kvm_vcpu *vcpu, struct kvm_run *run);
+
 #endif /* __ARM_KVM_MMU_H__ */
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 4f691be..714f415 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -77,13 +77,56 @@ void kvm_arch_sync_events(struct kvm *kvm)
 
 int kvm_arch_init_vm(struct kvm *kvm)
 {
-	return 0;
+	int ret = 0;
+	phys_addr_t pgd_phys;
+	unsigned long vmid;
+	unsigned long start, end;
+
+
+	mutex_lock(&kvm_vmids_mutex);
+	vmid = find_first_zero_bit(kvm_vmids, VMID_SIZE);
+	if (vmid >= VMID_SIZE) {
+		mutex_unlock(&kvm_vmids_mutex);
+		return -EBUSY;
+	}
+	__set_bit(vmid, kvm_vmids);
+	kvm->arch.vmid = vmid;
+	mutex_unlock(&kvm_vmids_mutex);
+
+	ret = kvm_alloc_stage2_pgd(kvm);
+	if (ret)
+		goto out_fail_alloc;
+
+	pgd_phys = virt_to_phys(kvm->arch.pgd);
+	kvm->arch.vttbr = (pgd_phys & ((1LLU << 40) - 1) & ~((2 << VTTBR_X) - 1)) |
+			  ((u64)vmid << 48);
+
+	start = (unsigned long)kvm,
+	end = start + sizeof(struct kvm);
+	ret = create_hyp_mappings(kvm_hyp_pgd, start, end);
+	if (ret)
+		goto out_fail_hyp_mappings;
+
+	return ret;
+out_fail_hyp_mappings:
+	remove_hyp_mappings(kvm_hyp_pgd, start, end);
+out_fail_alloc:
+	clear_bit(vmid, kvm_vmids);
+	return ret;
 }
 
 void kvm_arch_destroy_vm(struct kvm *kvm)
 {
 	int i;
 
+	kvm_free_stage2_pgd(kvm);
+
+	if (kvm->arch.vmid != 0) {
+		mutex_lock(&kvm_vmids_mutex);
+		clear_bit(kvm->arch.vmid, kvm_vmids);
+		mutex_unlock(&kvm_vmids_mutex);
+	}
+
 	for (i = 0; i < KVM_MAX_VCPUS; ++i) {
 		if (kvm->vcpus[i]) {
 			kvm_arch_vcpu_free(kvm->vcpus[i]);
@@ -158,6 +201,7 @@ struct kvm_vcpu *kvm_arch_vcpu_create(struct kvm *kvm, unsigned int id)
 {
 	int err;
 	struct kvm_vcpu *vcpu;
+	unsigned long start, end;
 
 	vcpu = kmem_cache_zalloc(kvm_vcpu_cache, GFP_KERNEL);
 	if (!vcpu) {
@@ -169,7 +213,15 @@ struct kvm_vcpu *kvm_arch_vcpu_create(struct kvm *kvm, unsigned int id)
 	if (err)
 		goto free_vcpu;
 
+	start = (unsigned long)vcpu,
+	end = start + sizeof(struct kvm_vcpu);
+	err = create_hyp_mappings(kvm_hyp_pgd, start, end);
+	if (err)
+		goto out_fail_hyp_mappings;
+
 	return vcpu;
+out_fail_hyp_mappings:
+	remove_hyp_mappings(kvm_hyp_pgd, start, end);
 free_vcpu:
 	kmem_cache_free(kvm_vcpu_cache, vcpu);
 out:
```
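
As a reading aid for the VTTBR computation in kvm_arch_init_vm() above, a small sketch of the packing it performs, assuming the ARMv7 LPAE layout with the VMID in bits [55:48] and a 40-bit stage-2 base address aligned according to VTTBR_X; the helper name is hypothetical:

```c
/*
 * Sketch of the VTTBR packing done in kvm_arch_init_vm() above:
 * bits [39:0] hold the stage-2 pgd base (aligned as required by
 * VTTBR_X), bits [55:48] hold the VMID.  Assumes the LPAE register
 * layout; the helper name is hypothetical.
 */
static u64 pack_vttbr(phys_addr_t pgd_phys, u32 vmid)
{
	u64 baddr = pgd_phys & ((1ULL << 40) - 1) & ~((2ULL << VTTBR_X) - 1);

	return baddr | ((u64)vmid << 48);
}
```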