From patchwork Fri Oct 4 16:13:21 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Christoffer Dall
X-Patchwork-Id: 2989161
From: Christoffer Dall
To: kvmarm@lists.cs.columbia.edu
Cc: Christoffer Dall, linux-arm-kernel@lists.infradead.org, kvm@vger.kernel.org
Subject: [PATCH v3 2/2] KVM: ARM: Transparent huge page (THP) support
Date: Fri, 4 Oct 2013 17:13:21 +0100
Message-Id: <1380903201-32644-3-git-send-email-christoffer.dall@linaro.org>
In-Reply-To: <1380903201-32644-1-git-send-email-christoffer.dall@linaro.org>
References: <1380903201-32644-1-git-send-email-christoffer.dall@linaro.org>
X-Mailer: git-send-email 1.8.1.2

Support transparent huge
pages in KVM/ARM and KVM/ARM64.  The transparent_hugepage_adjust()
function is not very pretty, but this is also how it's solved on x86
and seems to be simply an artifact of how THPs behave.  This should
eventually be shared across architectures if possible, but that can
always be changed down the road.

Signed-off-by: Christoffer Dall
---
Changelog[v3]:
 - Moved force_pte logic from previous patch
 - Added comment about force_pte
 - Fixed spelling mistake in comment

Changelog[v2]:
 - THP handling moved into separate patch.
 - Minor changes and clarified comment in transparent_hugepage_adjust
   from Marc Z's review.
---
 arch/arm/kvm/mmu.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 56 insertions(+), 2 deletions(-)

diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 745d8b1..3719583 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -42,7 +42,7 @@ static unsigned long hyp_idmap_start;
 static unsigned long hyp_idmap_end;
 static phys_addr_t hyp_idmap_vector;
 
-#define kvm_pmd_huge(_x)	(pmd_huge(_x))
+#define kvm_pmd_huge(_x)	(pmd_huge(_x) || pmd_trans_huge(_x))
 
 static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
 {
@@ -576,12 +576,53 @@ out:
 	return ret;
 }
 
+static bool transparent_hugepage_adjust(pfn_t *pfnp, phys_addr_t *ipap)
+{
+	pfn_t pfn = *pfnp;
+	gfn_t gfn = *ipap >> PAGE_SHIFT;
+
+	if (PageTransCompound(pfn_to_page(pfn))) {
+		unsigned long mask;
+		/*
+		 * The address we faulted on is backed by a transparent huge
+		 * page.  However, because we map the compound huge page and
+		 * not the individual tail page, we need to transfer the
+		 * refcount to the head page.  We have to be careful that the
+		 * THP doesn't start to split while we are adjusting the
+		 * refcounts.
+		 *
+		 * We are sure this doesn't happen, because mmu_notifier_retry
+		 * was successful and we are holding the mmu_lock, so if this
+		 * THP is trying to split, it will be blocked in the mmu
+		 * notifier before touching any of the pages, specifically
+		 * before being able to call __split_huge_page_refcount().
+		 *
+		 * We can therefore safely transfer the refcount from PG_tail
+		 * to PG_head and switch the pfn from a tail page to the head
+		 * page accordingly.
+		 */
+		mask = PTRS_PER_PMD - 1;
+		VM_BUG_ON((gfn & mask) != (pfn & mask));
+		if (pfn & mask) {
+			*ipap &= PMD_MASK;
+			kvm_release_pfn_clean(pfn);
+			pfn &= ~mask;
+			kvm_get_pfn(pfn);
+			*pfnp = pfn;
+		}
+
+		return true;
+	}
+
+	return false;
+}
+
 static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 			  struct kvm_memory_slot *memslot,
 			  unsigned long fault_status)
 {
 	int ret;
-	bool write_fault, writable, hugetlb = false;
+	bool write_fault, writable, hugetlb = false, force_pte = false;
 	unsigned long mmu_seq;
 	gfn_t gfn = fault_ipa >> PAGE_SHIFT;
 	unsigned long hva = gfn_to_hva(vcpu->kvm, gfn);
@@ -602,6 +643,17 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	if (is_vm_hugetlb_page(vma)) {
 		hugetlb = true;
 		gfn = (fault_ipa & PMD_MASK) >> PAGE_SHIFT;
+	} else {
+		/*
+		 * Pages belonging to VMAs not aligned to the PMD mapping
+		 * granularity cannot be mapped using block descriptors even
+		 * if the pages belong to a THP for the process, because the
+		 * stage-2 block descriptor will cover more than a single THP
+		 * and we lose atomicity for unmapping, updates, and splits
+		 * of the THP or other pages in the stage-2 block range.
+		 */
+		if (vma->vm_start & ~PMD_MASK)
+			force_pte = true;
 	}
 	up_read(&current->mm->mmap_sem);
 
@@ -629,6 +681,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	spin_lock(&kvm->mmu_lock);
 	if (mmu_notifier_retry(kvm, mmu_seq))
 		goto out_unlock;
+	if (!hugetlb && !force_pte)
+		hugetlb = transparent_hugepage_adjust(&pfn, &fault_ipa);
 
 	if (hugetlb) {
 		pmd_t new_pmd = pfn_pmd(pfn, PAGE_S2);
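
For readers following the address arithmetic in transparent_hugepage_adjust() and the force_pte check, here is a minimal user-space sketch. It is not kernel code and leaves out the refcount transfer entirely. The SK_* constants and sk_* function names are hypothetical, and the values assume 4 KiB pages with 2 MiB PMD-level blocks, which may not match every configuration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed for illustration only: 4 KiB pages, 2 MiB PMD blocks. */
#define SK_PAGE_SHIFT	12
#define SK_PMD_SHIFT	21
#define SK_PMD_SIZE	(UINT64_C(1) << SK_PMD_SHIFT)
#define SK_PMD_MASK	(~(SK_PMD_SIZE - 1))
#define SK_PTRS_PER_PMD	(UINT64_C(1) << (SK_PMD_SHIFT - SK_PAGE_SHIFT))

/* Same test as the force_pte check: true if the VMA start is not PMD-aligned. */
static bool sk_force_pte(uint64_t vm_start)
{
	return (vm_start & ~SK_PMD_MASK) != 0;
}

/*
 * Address arithmetic only: round the pfn and the IPA down to the start of
 * their 2 MiB block.  Returns true if the pfn pointed into the middle of a
 * block (a tail page in the patch) and was moved, false if it was already
 * at the block start.
 */
static bool sk_align_to_head(uint64_t *pfnp, uint64_t *ipap)
{
	uint64_t mask = SK_PTRS_PER_PMD - 1;
	uint64_t gfn = *ipap >> SK_PAGE_SHIFT;

	/* Same invariant the patch checks with VM_BUG_ON(). */
	assert((gfn & mask) == (*pfnp & mask));

	if (!(*pfnp & mask))
		return false;

	*ipap &= SK_PMD_MASK;
	*pfnp &= ~mask;
	return true;
}
```

With these assumed constants, a fault at IPA 0x40203000 backed by pfn 0x80203 is moved to IPA 0x40200000 and pfn 0x80200, while a VMA starting at 0x40201000 would take the force_pte path because it is not 2 MiB aligned.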