From patchwork Mon Dec 15 07:28:04 2014
X-Patchwork-Submitter: Mario Smarduch
X-Patchwork-Id: 5491281
From: Mario Smarduch
To: pbonzini@redhat.com, james.hogan@imgtec.com, christoffer.dall@linaro.org, agraf@suse.de, marc.zyngier@arm.com, cornelia.huck@de.ibm.com, borntraeger@de.ibm.com, catalin.marinas@arm.com
Cc: peter.maydell@linaro.org, kvm@vger.kernel.org, steve.capper@arm.com, kvm-ia64@vger.kernel.org, kvm-ppc@vger.kernel.org, kvmarm@lists.cs.columbia.edu, linux-arm-kernel@lists.infradead.org, Mario Smarduch
Subject: [PATCH v15 07/11] KVM: arm: page logging 2nd stage fault handling
Date: Sun, 14 Dec 2014 23:28:04 -0800
Message-id: <1418628488-3696-8-git-send-email-m.smarduch@samsung.com>
In-reply-to: <1418628488-3696-1-git-send-email-m.smarduch@samsung.com>
References: <1418628488-3696-1-git-send-email-m.smarduch@samsung.com>

This patch adds support for handling 2nd stage page faults during migration.
It disables faulting in huge pages and dissolves existing huge pages into
page tables. If migration is canceled, huge pages are used again.

Since the last version, an issue was found on an SMP host running an SMP
guest when clearing a huge TLB entry. Multiple CPUs can write to the same
huge page range, so all pages in the range are marked dirty after the TLB is
flushed. The issue did not show up on hardware but appeared on Fast Models,
perhaps because the TLB flush is slower there.

To prevent clutter in user_mem_abort(), some code was refactored into
functions.

Signed-off-by: Mario Smarduch
---
 arch/arm/kvm/mmu.c | 86 +++++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 78 insertions(+), 8 deletions(-)

diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 73d506f..dc763bb 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -47,6 +47,18 @@ static phys_addr_t hyp_idmap_vector;
 #define kvm_pmd_huge(_x)	(pmd_huge(_x) || pmd_trans_huge(_x))
 #define kvm_pud_huge(_x)	pud_huge(_x)
 
+#define KVM_S2PTE_FLAG_IS_IOMAP		(1UL << 0)
+#define KVM_S2PTE_FLAG_LOGGING_ACTIVE	(1UL << 1)
+
+static bool kvm_get_logging_state(struct kvm_memory_slot *memslot)
+{
+#ifdef CONFIG_ARM
+	return !!memslot->dirty_bitmap;
+#else
+	return false;
+#endif
+}
+
 static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
 {
 	/*
@@ -59,6 +71,37 @@ static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
 		kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, kvm, ipa);
 }
 
+/**
+ * stage2_dissolve_pmd() - clear and flush huge PMD entry
+ * @kvm:	pointer to kvm structure.
+ * @addr:	IPA
+ * @pmd:	pmd pointer for IPA
+ *
+ * Function clears a PMD entry, flushes addr 1st and 2nd stage TLBs. Marks all
+ * pages in the range dirty.
+ */
+void stage2_dissolve_pmd(struct kvm *kvm, phys_addr_t addr, pmd_t *pmd)
+{
+	gfn_t gfn;
+	int i;
+
+	if (kvm_pmd_huge(*pmd)) {
+		pmd_clear(pmd);
+		kvm_tlb_flush_vmid_ipa(kvm, addr);
+		put_page(virt_to_page(pmd));
+#ifdef CONFIG_SMP
+		gfn = (addr & PMD_MASK) >> PAGE_SHIFT;
+
+		/*
+		 * Mark all pages in PMD range dirty, in case other CPUs are
+		 * writing to it.
+		 */
+		for (i = 0; i < PTRS_PER_PMD; i++)
+			mark_page_dirty(kvm, gfn + i);
+#endif
+	}
+}
+
 static int mmu_topup_memory_cache(struct kvm_mmu_memory_cache *cache,
 				  int min, int max)
 {
@@ -703,10 +746,13 @@ static int stage2_set_pmd_huge(struct kvm *kvm, struct kvm_mmu_memory_cache
 }
 
 static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
-			  phys_addr_t addr, const pte_t *new_pte, bool iomap)
+			  phys_addr_t addr, const pte_t *new_pte,
+			  unsigned long flags)
 {
 	pmd_t *pmd;
 	pte_t *pte, old_pte;
+	unsigned long iomap = flags & KVM_S2PTE_FLAG_IS_IOMAP;
+	unsigned long logging_active = flags & KVM_S2PTE_FLAG_LOGGING_ACTIVE;
 
 	/* Create stage-2 page table mapping - Levels 0 and 1 */
 	pmd = stage2_get_pmd(kvm, cache, addr);
@@ -718,6 +764,13 @@ static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
 		return 0;
 	}
 
+	/*
+	 * While dirty page logging - dissolve huge PMD, then continue on to
+	 * allocate page.
+	 */
+	if (logging_active)
+		stage2_dissolve_pmd(kvm, addr, pmd);
+
 	/* Create stage-2 page mappings - Level 2 */
 	if (pmd_none(*pmd)) {
 		if (!cache)
@@ -774,7 +827,8 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa,
 		if (ret)
 			goto out;
 		spin_lock(&kvm->mmu_lock);
-		ret = stage2_set_pte(kvm, &cache, addr, &pte, true);
+		ret = stage2_set_pte(kvm, &cache, addr, &pte,
+						KVM_S2PTE_FLAG_IS_IOMAP);
 		spin_unlock(&kvm->mmu_lock);
 		if (ret)
 			goto out;
@@ -1002,6 +1056,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	pfn_t pfn;
 	pgprot_t mem_type = PAGE_S2;
 	bool fault_ipa_uncached;
+	unsigned long logging_active = 0;
+
+	if (kvm_get_logging_state(memslot))
+		logging_active = KVM_S2PTE_FLAG_LOGGING_ACTIVE;
 
 	write_fault = kvm_is_write_fault(vcpu);
 	if (fault_status == FSC_PERM && !write_fault) {
@@ -1018,7 +1076,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		return -EFAULT;
 	}
 
-	if (is_vm_hugetlb_page(vma)) {
+	if (is_vm_hugetlb_page(vma) && !logging_active) {
 		hugetlb = true;
 		gfn = (fault_ipa & PMD_MASK) >> PAGE_SHIFT;
 	} else {
@@ -1065,7 +1123,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	spin_lock(&kvm->mmu_lock);
 	if (mmu_notifier_retry(kvm, mmu_seq))
 		goto out_unlock;
-	if (!hugetlb && !force_pte)
+	if (!hugetlb && !force_pte && !logging_active)
 		hugetlb = transparent_hugepage_adjust(&pfn, &fault_ipa);
 
 	fault_ipa_uncached = memslot->flags & KVM_MEMSLOT_INCOHERENT;
@@ -1082,17 +1140,22 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		ret = stage2_set_pmd_huge(kvm, memcache, fault_ipa, &new_pmd);
 	} else {
 		pte_t new_pte = pfn_pte(pfn, mem_type);
+		unsigned long flags = logging_active;
+
+		if (pgprot_val(mem_type) == pgprot_val(PAGE_S2_DEVICE))
+			flags |= KVM_S2PTE_FLAG_IS_IOMAP;
+
 		if (writable) {
 			kvm_set_s2pte_writable(&new_pte);
 			kvm_set_pfn_dirty(pfn);
 		}
 		coherent_cache_guest_page(vcpu, hva, PAGE_SIZE,
 					  fault_ipa_uncached);
-		ret = stage2_set_pte(kvm, memcache, fault_ipa, &new_pte,
-			pgprot_val(mem_type) == pgprot_val(PAGE_S2_DEVICE));
+		ret = stage2_set_pte(kvm, memcache, fault_ipa, &new_pte, flags);
 	}
 
-
+	if (write_fault)
+		mark_page_dirty(kvm, gfn);
 out_unlock:
 	spin_unlock(&kvm->mmu_lock);
 	kvm_release_pfn_clean(pfn);
@@ -1242,7 +1305,14 @@ static void kvm_set_spte_handler(struct kvm *kvm, gpa_t gpa, void *data)
 {
 	pte_t *pte = (pte_t *)data;
 
-	stage2_set_pte(kvm, NULL, gpa, pte, false);
+	/*
+	 * We can always call stage2_set_pte with KVM_S2PTE_FLAG_LOGGING_ACTIVE
+	 * flag clear because MMU notifiers will have unmapped a huge PMD before
+	 * calling ->change_pte() (which in turn calls kvm_set_spte_hva()) and
+	 * therefore stage2_set_pte() never needs to clear out a huge PMD
+	 * through this calling path.
+	 */
+	stage2_set_pte(kvm, NULL, gpa, pte, 0);
 }
 
 
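
For readers who want to see the dirty-marking arithmetic of stage2_dissolve_pmd() in isolation, here is a minimal user-space sketch. It is not part of the patch and is not kernel code. All model_* names and MODEL_* constants are hypothetical. The geometry (4 KiB pages, 2 MiB huge PMD mappings, so 512 pages per PMD) is an assumption. A boolean per PMD stands in for the real pmd_clear() plus TLB flush.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed geometry: 4 KiB pages, 2 MiB huge PMD mappings (512 pages each). */
#define MODEL_PAGE_SHIFT	12
#define MODEL_PMD_SHIFT		21
#define MODEL_PMD_MASK		(~((UINT64_C(1) << MODEL_PMD_SHIFT) - 1))
#define MODEL_PTRS_PER_PMD	(1u << (MODEL_PMD_SHIFT - MODEL_PAGE_SHIFT))
#define MODEL_NR_PAGES		2048u	/* hypothetical 8 MiB memslot */
#define MODEL_NR_PMDS		(MODEL_NR_PAGES / MODEL_PTRS_PER_PMD)

/* Hypothetical stand-in for a memslot: huge-mapping state plus dirty bitmap. */
struct model_slot {
	bool pmd_is_huge[MODEL_NR_PMDS];
	uint8_t dirty_bitmap[MODEL_NR_PAGES / 8];
};

static void model_slot_init(struct model_slot *slot)
{
	memset(slot, 0, sizeof(*slot));
}

static void model_mark_page_dirty(struct model_slot *slot, uint64_t gfn)
{
	if (gfn < MODEL_NR_PAGES)
		slot->dirty_bitmap[gfn / 8] |= (uint8_t)(1u << (gfn % 8));
}

static bool model_page_is_dirty(const struct model_slot *slot, uint64_t gfn)
{
	return gfn < MODEL_NR_PAGES &&
	       ((slot->dirty_bitmap[gfn / 8] >> (gfn % 8)) & 1u);
}

/*
 * Model of dissolving a huge PMD: if the PMD covering @ipa is huge, drop the
 * huge mapping and mark every page in its range dirty, since another CPU may
 * have written anywhere in the range before the mapping went away.
 * Returns the number of pages marked dirty (0 if the PMD was not huge).
 */
static unsigned int model_dissolve_pmd(struct model_slot *slot, uint64_t ipa)
{
	/* First guest frame number of the PMD range containing ipa. */
	uint64_t gfn = (ipa & MODEL_PMD_MASK) >> MODEL_PAGE_SHIFT;
	uint64_t pmd_idx = gfn / MODEL_PTRS_PER_PMD;
	unsigned int i;

	if (pmd_idx >= MODEL_NR_PMDS || !slot->pmd_is_huge[pmd_idx])
		return 0;

	slot->pmd_is_huge[pmd_idx] = false;	/* models pmd_clear() + flush */
	for (i = 0; i < MODEL_PTRS_PER_PMD; i++)
		model_mark_page_dirty(slot, gfn + i);

	return MODEL_PTRS_PER_PMD;
}
```

In this model, a fault at IPA 0x2ABCDE falls in the second 2 MiB range. Dissolving it marks guest frames 512 through 1023 dirty and leaves the neighbouring ranges untouched. A second call on the same address does nothing because the mapping is no longer huge. This mirrors the kvm_pmd_huge() check in the patch.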