Message ID | 20110806103926.27198.11259.stgit@localhost6.localdomain6 (mailing list archive) |
---|---|
State | New, archived |
On 08/06/2011 01:39 PM, Christoffer Dall wrote:
> This commit introduces the framework for guest memory management
> through the use of 2nd stage translation. Each VM has a pointer
> to a level-1 table (the pgd field in struct kvm_arch) which is
> used for the 2nd stage translations. Entries are added when handling
> guest faults (later patch) and the table itself can be allocated and
> freed through the following functions implemented in
> arch/arm/kvm/arm_mmu.c:
>   - kvm_alloc_stage2_pgd(struct kvm *kvm);
>   - kvm_free_stage2_pgd(struct kvm *kvm);
>
> Further, each entry in TLBs and caches is tagged with a VMID
> identifier in addition to ASIDs. The VMIDs are managed using
> a bitmap and assigned when creating the VM in kvm_arch_init_vm()
> where the 2nd stage pgd is also allocated. The table is freed in
> kvm_arch_destroy_vm(). Both functions are called from the main
> KVM code.
>
>
> +/**
> + * kvm_arch_init_vm - initializes a VM data structure
> + * @kvm:	pointer to the KVM struct
> + */
>   int kvm_arch_init_vm(struct kvm *kvm)
>   {
> -	return 0;
> +	int ret = 0;
> +	phys_addr_t pgd_phys;
> +	unsigned long vmid;
> +	unsigned long start, end;
> +
> +
> +	mutex_lock(&kvm_vmids_mutex);
> +	vmid = find_first_zero_bit(kvm_vmids, VMID_SIZE);
> +	if (vmid >= VMID_SIZE) {
> +		mutex_unlock(&kvm_vmids_mutex);
> +		return -EBUSY;
> +	}
> +	__set_bit(vmid, kvm_vmids);

VMID_SIZE seems to be a bit low for comfort.  I guess it's fine for a
start, but later on we'll have to recycle VMIDs, like we do for SVM ASIDs.

Is there not a risk of a user starting 255 tiny guests and denying other
users the ability to use kvm?

> +	kvm->arch.vmid = vmid;
> +	mutex_unlock(&kvm_vmids_mutex);
> +
> +	ret = kvm_alloc_stage2_pgd(kvm);
> +	if (ret)
> +		goto out_fail_alloc;
> +
> +	pgd_phys = virt_to_phys(kvm->arch.pgd);
> +	kvm->arch.vttbr = pgd_phys & ((1LLU << 40) - 1) & ~((2 << VTTBR_X) - 1);
> +	kvm->arch.vttbr |= ((u64)vmid << 48);
> +
> +	start = (unsigned long)kvm,
> +	end = start + sizeof(struct kvm);
> +	ret = create_hyp_mappings(kvm_hyp_pgd, start, end);
> +	if (ret)
> +		goto out_fail_hyp_mappings;
> +
> +	return ret;
> +out_fail_hyp_mappings:
> +	remove_hyp_mappings(kvm_hyp_pgd, start, end);
> +out_fail_alloc:
> +	clear_bit(vmid, kvm_vmids);
> +	return ret;
>   }
>
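To make the exhaustion concern above concrete, here is a minimal user-space sketch of a bitmap allocator of the kind the quoted code uses. It is not the patch's code: the names (struct vmid_bitmap, vmid_alloc, vmid_free) are hypothetical, VMID_SIZE = 256 is an assumption (the patch does not show its value), and the locking done by kvm_vmids_mutex is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define VMID_SIZE 256	/* assumption: the patch does not show the real value */

/* One bit per VMID; a set bit means the VMID is in use. */
struct vmid_bitmap {
	uint8_t bits[VMID_SIZE / 8];
};

static void vmid_bitmap_init(struct vmid_bitmap *map)
{
	memset(map->bits, 0, sizeof(map->bits));
}

/* Return the lowest free VMID and mark it used, or -1 if none is left. */
static int vmid_alloc(struct vmid_bitmap *map)
{
	for (int id = 0; id < VMID_SIZE; id++) {
		uint8_t mask = (uint8_t)(1u << (id % 8));

		if (!(map->bits[id / 8] & mask)) {
			map->bits[id / 8] |= mask;
			return id;
		}
	}
	return -1;	/* every VMID is taken, the -EBUSY case in the patch */
}

/* Mark a previously allocated VMID as free again. */
static void vmid_free(struct vmid_bitmap *map, int id)
{
	assert(id >= 0 && id < VMID_SIZE);
	map->bits[id / 8] &= (uint8_t)~(1u << (id % 8));
}
```

With this scheme an identifier only becomes available again when its owner calls vmid_free(), so once VMID_SIZE guests exist every further allocation fails, no matter how small or idle those guests are.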
On Aug 9, 2011, at 11:57 AM, Avi Kivity wrote:

> On 08/06/2011 01:39 PM, Christoffer Dall wrote:
>>   int kvm_arch_init_vm(struct kvm *kvm)
>>   {
>> -	return 0;
>> +	int ret = 0;
>> +	phys_addr_t pgd_phys;
>> +	unsigned long vmid;
>> +	unsigned long start, end;
>> +
>> +
>> +	mutex_lock(&kvm_vmids_mutex);
>> +	vmid = find_first_zero_bit(kvm_vmids, VMID_SIZE);
>> +	if (vmid >= VMID_SIZE) {
>> +		mutex_unlock(&kvm_vmids_mutex);
>> +		return -EBUSY;
>> +	}
>> +	__set_bit(vmid, kvm_vmids);
>
> VMID_SIZE seems to be a bit low for comfort.  I guess it's fine for a
> start, but later on we'll have to recycle VMIDs, like we do for SVM ASIDs.
>
> Is there not a risk of a user starting 255 tiny guests and denying other
> users the ability to use kvm?

Yes, there absolutely is, if that's a valid use case. I wanted something
simple for now, but I completely agree that with ARM machines with TBs of
memory, we need more VMIDs. I will incorporate it into the next patch
series.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
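The recycling that Avi asks for could look roughly like the generation-counter sketch below. This is only an illustration in the spirit of ASID recycling, not the scheme that any later patch series implements: all names (vmid_allocator, vm_vmid, vmid_update) are hypothetical, VMID_COUNT = 256 with VMID 0 kept out of the pool is an assumption, and locking and the actual TLB flush are left to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VMID_COUNT 256	/* assumption: 256 VMIDs, VMID 0 never handed out */

struct vmid_allocator {
	uint64_t generation;	/* bumped every time the VMID space wraps */
	uint32_t next_vmid;	/* next VMID to hand out in this generation */
};

struct vm_vmid {
	uint64_t generation;	/* generation in which 'vmid' was assigned */
	uint32_t vmid;
};

static void vmid_allocator_init(struct vmid_allocator *a)
{
	a->generation = 1;	/* a VM at generation 0 has no VMID yet */
	a->next_vmid = 1;
}

/*
 * Call before running a guest. Returns true when the VMID space wrapped,
 * in which case the caller must flush all guest TLB entries because old
 * VMIDs are about to be reused.
 */
static bool vmid_update(struct vmid_allocator *a, struct vm_vmid *vm)
{
	bool need_flush = false;

	if (vm->generation == a->generation)
		return false;	/* the VM's VMID is still valid */

	if (a->next_vmid >= VMID_COUNT) {
		a->generation++;	/* invalidates every VMID handed out so far */
		a->next_vmid = 1;
		need_flush = true;
	}

	vm->vmid = a->next_vmid++;
	vm->generation = a->generation;
	return need_flush;
}
```

The design choice here is that a VM never owns a VMID permanently: identifiers are claimed lazily when a guest is about to run, so the number of VMs that can exist is no longer bounded by the size of the VMID space, at the cost of a full flush on every wrap.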
diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 6a10467..06d1263 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -31,7 +31,9 @@ struct kvm_vcpu;
 u32 *kvm_vcpu_reg(struct kvm_vcpu *vcpu, u8 reg_num, u32 mode);
 
 struct kvm_arch {
-	pgd_t *pgd;	/* 1-level 2nd stage table */
+	u32    vmid;	/* The VMID used for the virt. memory system */
+	pgd_t *pgd;	/* 1-level 2nd stage table */
+	u64    vttbr;	/* VTTBR value associated with above pgd and vmid */
 };
 
 #define EXCEPTION_NONE      0
diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index d22aad0..a64ab2d 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -37,4 +37,9 @@ void remove_hyp_mappings(pgd_t *hyp_pgd,
 					unsigned long end);
 void free_hyp_pmds(pgd_t *hyp_pgd);
 
+int kvm_alloc_stage2_pgd(struct kvm *kvm);
+void kvm_free_stage2_pgd(struct kvm *kvm);
+
+int kvm_handle_guest_abort(struct kvm_vcpu *vcpu, struct kvm_run *run);
+
 #endif /* __ARM_KVM_MMU_H__ */
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index ccfb225..3db6794 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -72,15 +72,66 @@ void kvm_arch_sync_events(struct kvm *kvm)
 {
 }
 
+/**
+ * kvm_arch_init_vm - initializes a VM data structure
+ * @kvm:	pointer to the KVM struct
+ */
 int kvm_arch_init_vm(struct kvm *kvm)
 {
-	return 0;
+	int ret = 0;
+	phys_addr_t pgd_phys;
+	unsigned long vmid;
+	unsigned long start, end;
+
+
+	mutex_lock(&kvm_vmids_mutex);
+	vmid = find_first_zero_bit(kvm_vmids, VMID_SIZE);
+	if (vmid >= VMID_SIZE) {
+		mutex_unlock(&kvm_vmids_mutex);
+		return -EBUSY;
+	}
+	__set_bit(vmid, kvm_vmids);
+	kvm->arch.vmid = vmid;
+	mutex_unlock(&kvm_vmids_mutex);
+
+	ret = kvm_alloc_stage2_pgd(kvm);
+	if (ret)
+		goto out_fail_alloc;
+
+	pgd_phys = virt_to_phys(kvm->arch.pgd);
+	kvm->arch.vttbr = pgd_phys & ((1LLU << 40) - 1) & ~((2 << VTTBR_X) - 1);
+	kvm->arch.vttbr |= ((u64)vmid << 48);
+
+	start = (unsigned long)kvm,
+	end = start + sizeof(struct kvm);
+	ret = create_hyp_mappings(kvm_hyp_pgd, start, end);
+	if (ret)
+		goto out_fail_hyp_mappings;
+
+	return ret;
+out_fail_hyp_mappings:
+	remove_hyp_mappings(kvm_hyp_pgd, start, end);
+out_fail_alloc:
+	clear_bit(vmid, kvm_vmids);
+	return ret;
 }
 
+/**
+ * kvm_arch_destroy_vm - destroy the VM data structure
+ * @kvm:	pointer to the KVM struct
+ */
 void kvm_arch_destroy_vm(struct kvm *kvm)
 {
 	int i;
 
+	kvm_free_stage2_pgd(kvm);
+
+	if (kvm->arch.vmid != 0) {
+		mutex_lock(&kvm_vmids_mutex);
+		clear_bit(kvm->arch.vmid, kvm_vmids);
+		mutex_unlock(&kvm_vmids_mutex);
+	}
+
 	for (i = 0; i < KVM_MAX_VCPUS; ++i) {
 		if (kvm->vcpus[i]) {
 			kvm_arch_vcpu_free(kvm->vcpus[i]);
@@ -154,6 +205,7 @@ struct kvm_vcpu *kvm_arch_vcpu_create(struct kvm *kvm, unsigned int id)
 {
 	int err;
 	struct kvm_vcpu *vcpu;
+	unsigned long start, end;
 
 	vcpu = kmem_cache_zalloc(kvm_vcpu_cache, GFP_KERNEL);
 	if (!vcpu) {
@@ -165,8 +217,16 @@ struct kvm_vcpu *kvm_arch_vcpu_create(struct kvm *kvm, unsigned int id)
 	if (err)
 		goto free_vcpu;
 
+	start = (unsigned long)vcpu,
+	end = start + sizeof(struct kvm_vcpu);
+	err = create_hyp_mappings(kvm_hyp_pgd, start, end);
+	if (err)
+		goto out_fail_hyp_mappings;
+
 	latest_vcpu = vcpu;
 	return vcpu;
+out_fail_hyp_mappings:
+	remove_hyp_mappings(kvm_hyp_pgd, start, end);
 free_vcpu:
 	kmem_cache_free(kvm_vcpu_cache, vcpu);
 out:
diff --git a/arch/arm/kvm/arm_mmu.c b/arch/arm/kvm/arm_mmu.c
index 8fefda2..5af0a7c 100644
--- a/arch/arm/kvm/arm_mmu.c
+++ b/arch/arm/kvm/arm_mmu.c
@@ -221,6 +221,75 @@ int create_hyp_mappings(pgd_t *hyp_pgd, unsigned long start, unsigned long end)
 	return err;
 }
 
+/**
+ * kvm_alloc_stage2_pgd - allocate level-1 table for stage-2 translation.
+ * @kvm:	The KVM struct pointer for the VM.
+ *
+ * Allocates the 1st level table only of size defined by PGD2_ORDER (can
+ * support either full 40-bit input addresses or limited to 32-bit input
+ * addresses). Clears the allocated pages.
+ */
+int kvm_alloc_stage2_pgd(struct kvm *kvm)
+{
+	pgd_t *pgd;
+
+	if (kvm->arch.pgd != NULL) {
+		kvm_err(-EINVAL, "kvm_arch already initialized?\n");
+		return -EINVAL;
+	}
+
+	pgd = (pgd_t *)__get_free_pages(GFP_KERNEL, PGD2_ORDER);
+	if (!pgd)
+		return -ENOMEM;
+
+	memset(pgd, 0, PTRS_PER_PGD2 * sizeof(pgd_t));
+	kvm->arch.pgd = pgd;
+
+	return 0;
+}
+
+/**
+ * kvm_free_stage2_pgd - free all stage-2 tables
+ * @kvm:	The KVM struct pointer for the VM.
+ *
+ * Walks the level-1 page table pointed to by kvm->arch.pgd and frees all
+ * underlying level-2 and level-3 tables before freeing the actual level-1 table
+ * and setting the struct pointer to NULL.
+ */
+void kvm_free_stage2_pgd(struct kvm *kvm)
+{
+	pgd_t *pgd;
+	pud_t *pud;
+	pmd_t *pmd;
+	unsigned long long i, addr;
+
+	if (kvm->arch.pgd == NULL)
+		return;
+
+	/*
+	 * We do this slightly different than other places, since we need more
+	 * than 32 bits and for instance pgd_addr_end converts to unsigned long.
+	 */
+	addr = 0;
+	for (i = 0; i < PTRS_PER_PGD2; i++) {
+		addr = i * (unsigned long long)PGDIR_SIZE;
+		pgd = kvm->arch.pgd + i;
+		pud = pud_offset(pgd, addr);
+
+		if (pud_none(*pud))
+			continue;
+
+		BUG_ON(pud_bad(*pud));
+
+		pmd = pmd_offset(pud, addr);
+		free_ptes(pmd, addr);
+		pmd_free(NULL, pmd);
+	}
+
+	free_pages((unsigned long)kvm->arch.pgd, PGD2_ORDER);
+	kvm->arch.pgd = NULL;
+}
+
 int kvm_handle_guest_abort(struct kvm_vcpu *vcpu, struct kvm_run *run)
 {
 	KVMARM_NOT_IMPLEMENTED();
This commit introduces the framework for guest memory management
through the use of 2nd stage translation. Each VM has a pointer
to a level-1 table (the pgd field in struct kvm_arch) which is
used for the 2nd stage translations. Entries are added when handling
guest faults (later patch) and the table itself can be allocated and
freed through the following functions implemented in
arch/arm/kvm/arm_mmu.c:
 - kvm_alloc_stage2_pgd(struct kvm *kvm);
 - kvm_free_stage2_pgd(struct kvm *kvm);

Further, each entry in TLBs and caches is tagged with a VMID
identifier in addition to ASIDs. The VMIDs are managed using
a bitmap and assigned when creating the VM in kvm_arch_init_vm()
where the 2nd stage pgd is also allocated. The table is freed in
kvm_arch_destroy_vm(). Both functions are called from the main
KVM code.

Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
---
 arch/arm/include/asm/kvm_host.h |    4 ++
 arch/arm/include/asm/kvm_mmu.h  |    5 +++
 arch/arm/kvm/arm.c              |   62 ++++++++++++++++++++++++++++++++++-
 arch/arm/kvm/arm_mmu.c          |   69 +++++++++++++++++++++++++++++++++++++++
 4 files changed, 138 insertions(+), 2 deletions(-)
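The way kvm_arch_init_vm() combines the stage-2 table address and the VMID into one VTTBR value can be tried out in user space with the sketch below. It mirrors the two vttbr assignments in the patch; the function name make_vttbr is hypothetical, VTTBR_X = 5 is an assumed value chosen only for illustration (the patch does not show the real definition), and the bound of 256 on the VMID is likewise an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define VTTBR_X 5	/* hypothetical value; the patch does not show the real one */

/*
 * Build a VTTBR value: the table base address goes into the low 40 bits,
 * with bits [VTTBR_X:0] cleared for alignment, and the VMID is placed
 * starting at bit 48.
 */
static uint64_t make_vttbr(uint64_t pgd_phys, uint32_t vmid)
{
	uint64_t baddr_mask;

	assert(vmid < 256);	/* assumption: VMIDs fit in 8 bits */

	baddr_mask = ((1ULL << 40) - 1) & ~((2ULL << VTTBR_X) - 1);
	return (pgd_phys & baddr_mask) | ((uint64_t)vmid << 48);
}
```

For example, under these assumptions make_vttbr(0x12345678FF, 3) yields 0x00030012345678C0: the low six address bits are dropped and the VMID 3 appears in bits 48 and up.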