From patchwork Thu Jan 15 23:58:58 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mario Smarduch X-Patchwork-Id: 5643611 Return-Path: X-Original-To: patchwork-linux-arm@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.136]) by patchwork2.web.kernel.org (Postfix) with ESMTP id 8359DC058D for ; Fri, 16 Jan 2015 00:07:14 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 780F82015E for ; Fri, 16 Jan 2015 00:07:13 +0000 (UTC) Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.9]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 6413A201FA for ; Fri, 16 Jan 2015 00:07:12 +0000 (UTC) Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.80.1 #2 (Red Hat Linux)) id 1YBuPA-0003sO-QU; Fri, 16 Jan 2015 00:04:48 +0000 Received: from mailout3.w2.samsung.com ([211.189.100.13] helo=usmailout3.samsung.com) by bombadil.infradead.org with esmtps (Exim 4.80.1 #2 (Red Hat Linux)) id 1YBuNg-0002jd-1L for linux-arm-kernel@lists.infradead.org; Fri, 16 Jan 2015 00:03:17 +0000 Received: from uscpsbgex2.samsung.com (u123.gpu85.samsung.co.kr [203.254.195.123]) by usmailout3.samsung.com (Oracle Communications Messaging Server 7u4-24.01(7.0.4.24.0) 64bit (built Nov 17 2011)) with ESMTP id <0NI800FYETGU5220@usmailout3.samsung.com> for linux-arm-kernel@lists.infradead.org; Thu, 15 Jan 2015 19:02:54 -0500 (EST) X-AuditID: cbfec37b-b7f836d000001951-5a-54b8552e4231 Received: from usmmp1.samsung.com ( [203.254.195.77]) by uscpsbgex2.samsung.com (USCPEXMTA) with SMTP id 23.54.06481.E2558B45; Thu, 15 Jan 2015 19:02:54 -0500 (EST) Received: from sisasmtp.sisa.samsung.com ([105.144.21.116]) by usmmp1.samsung.com (Oracle Communications Messaging Server 7u4-27.01(7.0.4.27.0) 64bit (built Aug 30 2012)) with ESMTP id <0NI800036TGTTLB0@usmmp1.samsung.com>; Thu, 15 Jan 2015 19:02:54 -0500 (EST) Received: from mjsmard-530U3C-530U4C-532U3C.sisa.samsung.com (105.144.129.79) by SISAEX02SJ.sisa.samsung.com (105.144.21.116) with Microsoft SMTP Server (TLS) id 14.3.123.3; Thu, 15 Jan 2015 16:02:53 -0800 From: Mario Smarduch To: christoffer.dall@linaro.org, marc.zyngier@arm.com, pbonzini@redhat.com, james.hogan@imgtec.com, agraf@suse.de, cornelia.huck@de.ibm.com, borntraeger@de.ibm.com Subject: [PATCH v16 07/10] KVM: arm: page logging 2nd stage fault handling Date: Thu, 15 Jan 2015 15:58:58 -0800 Message-id: <1421366341-26012-8-git-send-email-m.smarduch@samsung.com> X-Mailer: git-send-email 1.7.9.5 In-reply-to: <1421366341-26012-1-git-send-email-m.smarduch@samsung.com> References: <1421366341-26012-1-git-send-email-m.smarduch@samsung.com> MIME-version: 1.0 X-Originating-IP: [105.144.129.79] X-Brightmail-Tracker: H4sIAAAAAAAAA+NgFmpikeLIzCtJLcpLzFFi42I5/e+wr65e6I4Qgy8zhC1OXPnHaDF9xXYW i/fLehgtXrwGcuc3NzJavJv3gtmi+1kzo8WbT9oWc6YWWnw8dZzdYtPja6wWf+/8Y7PYv+0f q8WcMw9YLCa92cbkwO+xZt4aRo+Djw6xefTsPMPocefaHjaP85vWMHtsXlLv8X7fVTaPzaer PT5vkgvgjOKySUnNySxLLdK3S+DKONMrUvDQvOLmribGBsYTOl2MnBwSAiYSd05sZoSwxSQu 3FvP1sXIxSEksIxR4sKSdiYIp5dJ4tvBFhYI5yKjxOYrHawgLWwCuhL7721kB0mICCxnlDi9 5TdYFbNAG5PE5yPLmUCqhAW8JL5sf8EOYrMIqErc/74XzOYVcJNYO3UJcxcjB9ByBYk5k2xA wpwC7hITZzSDLRACKlky4yUbRLmgxI/J91hAypkFJCSef1aCKFGV2HbzOdQLShLTDl9ln8Ao NAtJxyyEjgWMTKsYxUqLkwuKk9JTK4z0ihNzi0vz0vWS83M3MULirXoH492vNocYBTgYlXh4 Gfy2hwixJpYVV+YeYpTgYFYS4eVl3hEixJuSWFmVWpQfX1Sak1p8iJGJg1OqgdHozcQlbrHu D4zu3KiedS+h9+FTxtPfOyc8umd7PkNnmfiMuVvunH4U67CLz6Xg+u0Fyd8E3molTmb/XXY0 8Ho/7+dVTzqMf69sVl+Rs6pk6ieXInPv0PdP7yRy7P9iVSDc/HqdHV+FiEXst1l7PbVXhG+0 zfSULtKapmzwdqfb0e6wzH/Lc7SUWIozEg21mIuKEwH8FBjwlQIAAA== X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20150115_160316_200423_7136364C X-CRM114-Status: GOOD ( 15.40 ) X-Spam-Score: -5.0 (-----) Cc: peter.maydell@linaro.org, kvm@vger.kernel.org, steve.capper@arm.com, kvm-ia64@vger.kernel.org, catalin.marinas@arm.com, kvm-ppc@vger.kernel.org, kvmarm@lists.cs.columbia.edu, linux-arm-kernel@lists.infradead.org, Mario Smarduch X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_MED, T_RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP This patch adds support for 2nd stage page fault handling while dirty page logging. On huge page faults, huge pages are dissolved to normal pages, and rebuilding of 2nd stage huge pages is blocked. In case migration is canceled this restriction is removed and huge pages may be rebuilt again. Signed-off-by: Mario Smarduch --- arch/arm/kvm/mmu.c | 97 +++++++++++++++++++++++++++++++++++++++++++++++----- 1 file changed, 88 insertions(+), 9 deletions(-) diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c index 73d506f..ea6b13e 100644 --- a/arch/arm/kvm/mmu.c +++ b/arch/arm/kvm/mmu.c @@ -47,6 +47,18 @@ static phys_addr_t hyp_idmap_vector; #define kvm_pmd_huge(_x) (pmd_huge(_x) || pmd_trans_huge(_x)) #define kvm_pud_huge(_x) pud_huge(_x) +#define KVM_S2PTE_FLAG_IS_IOMAP (1UL << 0) +#define KVM_S2_FLAG_LOGGING_ACTIVE (1UL << 1) + +static bool memslot_is_logging(struct kvm_memory_slot *memslot) +{ +#ifdef CONFIG_ARM + return memslot->dirty_bitmap && !(memslot->flags & KVM_MEM_READONLY); +#else + return false; +#endif +} + static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa) { /* @@ -59,6 +71,25 @@ static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa) kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, kvm, ipa); } +/** + * stage2_dissolve_pmd() - clear and flush huge PMD entry + * @kvm: pointer to kvm structure. + * @addr: IPA + * @pmd: pmd pointer for IPA + * + * Function clears a PMD entry, flushes addr 1st and 2nd stage TLBs. Marks all + * pages in the range dirty. + */ +static void stage2_dissolve_pmd(struct kvm *kvm, phys_addr_t addr, pmd_t *pmd) +{ + if (!kvm_pmd_huge(*pmd)) + return; + + pmd_clear(pmd); + kvm_tlb_flush_vmid_ipa(kvm, addr); + put_page(virt_to_page(pmd)); +} + static int mmu_topup_memory_cache(struct kvm_mmu_memory_cache *cache, int min, int max) { @@ -703,10 +734,15 @@ static int stage2_set_pmd_huge(struct kvm *kvm, struct kvm_mmu_memory_cache } static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache, - phys_addr_t addr, const pte_t *new_pte, bool iomap) + phys_addr_t addr, const pte_t *new_pte, + unsigned long flags) { pmd_t *pmd; pte_t *pte, old_pte; + bool iomap = flags & KVM_S2PTE_FLAG_IS_IOMAP; + bool logging_active = flags & KVM_S2_FLAG_LOGGING_ACTIVE; + + VM_BUG_ON(logging_active && !cache); /* Create stage-2 page table mapping - Levels 0 and 1 */ pmd = stage2_get_pmd(kvm, cache, addr); @@ -718,6 +754,13 @@ static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache, return 0; } + /* + * While dirty page logging - dissolve huge PMD, then continue on to + * allocate page. + */ + if (logging_active) + stage2_dissolve_pmd(kvm, addr, pmd); + /* Create stage-2 page mappings - Level 2 */ if (pmd_none(*pmd)) { if (!cache) @@ -774,7 +817,8 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa, if (ret) goto out; spin_lock(&kvm->mmu_lock); - ret = stage2_set_pte(kvm, &cache, addr, &pte, true); + ret = stage2_set_pte(kvm, &cache, addr, &pte, + KVM_S2PTE_FLAG_IS_IOMAP); spin_unlock(&kvm->mmu_lock); if (ret) goto out; @@ -1002,6 +1046,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, pfn_t pfn; pgprot_t mem_type = PAGE_S2; bool fault_ipa_uncached; + bool logging_active = memslot_is_logging(memslot); + unsigned long flags = 0; write_fault = kvm_is_write_fault(vcpu); if (fault_status == FSC_PERM && !write_fault) { @@ -1018,7 +1064,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, return -EFAULT; } - if (is_vm_hugetlb_page(vma)) { + if (is_vm_hugetlb_page(vma) && !logging_active) { hugetlb = true; gfn = (fault_ipa & PMD_MASK) >> PAGE_SHIFT; } else { @@ -1059,12 +1105,30 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, if (is_error_pfn(pfn)) return -EFAULT; - if (kvm_is_device_pfn(pfn)) + if (kvm_is_device_pfn(pfn)) { mem_type = PAGE_S2_DEVICE; + flags |= KVM_S2PTE_FLAG_IS_IOMAP; + } else if (logging_active) { + /* + * Faults on pages in a memslot with logging enabled + * should not be mapped with huge pages (it introduces churn + * and performance degradation), so force a pte mapping. + */ + force_pte = true; + flags |= KVM_S2_FLAG_LOGGING_ACTIVE; + + /* + * Only actually map the page as writable if this was a write + * fault. + */ + if (!write_fault) + writable = false; + } spin_lock(&kvm->mmu_lock); if (mmu_notifier_retry(kvm, mmu_seq)) goto out_unlock; + if (!hugetlb && !force_pte) hugetlb = transparent_hugepage_adjust(&pfn, &fault_ipa); @@ -1082,17 +1146,17 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, ret = stage2_set_pmd_huge(kvm, memcache, fault_ipa, &new_pmd); } else { pte_t new_pte = pfn_pte(pfn, mem_type); + if (writable) { kvm_set_s2pte_writable(&new_pte); kvm_set_pfn_dirty(pfn); + mark_page_dirty(kvm, gfn); } coherent_cache_guest_page(vcpu, hva, PAGE_SIZE, fault_ipa_uncached); - ret = stage2_set_pte(kvm, memcache, fault_ipa, &new_pte, - pgprot_val(mem_type) == pgprot_val(PAGE_S2_DEVICE)); + ret = stage2_set_pte(kvm, memcache, fault_ipa, &new_pte, flags); } - out_unlock: spin_unlock(&kvm->mmu_lock); kvm_release_pfn_clean(pfn); @@ -1242,7 +1306,14 @@ static void kvm_set_spte_handler(struct kvm *kvm, gpa_t gpa, void *data) { pte_t *pte = (pte_t *)data; - stage2_set_pte(kvm, NULL, gpa, pte, false); + /* + * We can always call stage2_set_pte with KVM_S2PTE_FLAG_LOGGING_ACTIVE + * flag clear because MMU notifiers will have unmapped a huge PMD before + * calling ->change_pte() (which in turn calls kvm_set_spte_hva()) and + * therefore stage2_set_pte() never needs to clear out a huge PMD + * through this calling path. + */ + stage2_set_pte(kvm, NULL, gpa, pte, 0); } @@ -1396,7 +1467,8 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm, bool writable = !(mem->flags & KVM_MEM_READONLY); int ret = 0; - if (change != KVM_MR_CREATE && change != KVM_MR_MOVE) + if (change != KVM_MR_CREATE && change != KVM_MR_MOVE && + change != KVM_MR_FLAGS_ONLY) return 0; /* @@ -1447,6 +1519,10 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm, phys_addr_t pa = (vma->vm_pgoff << PAGE_SHIFT) + vm_start - vma->vm_start; + /* IO region dirty page logging not allowed */ + if (memslot->flags & KVM_MEM_LOG_DIRTY_PAGES) + return -EINVAL; + ret = kvm_phys_addr_ioremap(kvm, gpa, pa, vm_end - vm_start, writable); @@ -1456,6 +1532,9 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm, hva = vm_end; } while (hva < reg_end); + if (change == KVM_MR_FLAGS_ONLY) + return ret; + spin_lock(&kvm->mmu_lock); if (ret) unmap_stage2_range(kvm, mem->guest_phys_addr, mem->memory_size);