From patchwork Wed Mar 10 17:57:45 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Quentin Perret X-Patchwork-Id: 12128953 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-17.0 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_ADSP_CUSTOM_MED,DKIM_SIGNED,DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 31F62C433DB for ; Wed, 10 Mar 2021 18:12:57 +0000 (UTC) Received: from desiato.infradead.org (desiato.infradead.org [90.155.92.199]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 2B3FC64E77 for ; Wed, 10 Mar 2021 18:12:56 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 2B3FC64E77 Authentication-Results: mail.kernel.org; dmarc=fail (p=reject dis=none) header.from=google.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=desiato.20200630; h=Sender:Content-Transfer-Encoding :Content-Type:List-Subscribe:List-Help:List-Post:List-Archive: List-Unsubscribe:List-Id:Cc:To:From:Subject:References:Mime-Version: Message-Id:In-Reply-To:Date:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=azgca4pySfO1lwNAU3gJDXhE3EpLmj0EnVQZp2Cph3w=; b=rCJF41IynJi7eI OtuEAJIG6VysgIC60amQqonEWp2vyxfPbYfzMTNkNBsqvrGBJRU0w2rzbF6Ox2ybyK2ms9kvhNHV7 6RX9/+ChnYFBzVWUFGXgYkjGPWhuK6GutuBBP8KuaBnjY0Tu9t2HBN9eGqYvVqorTkxgnAbJgRXSI 5VqEaadyc8lfuURAaxrhIXDrG56K8YGRkXUC3TVKL4h7F16vBtV/HF7W7R1BK0sabD+mmPBFYPtOE 2RpX9UXUV9yQUtCcXPcujAFjJaUaiVNrnGL7PejVbQnXt4G4ga4pkWI2dJb4rrhCfvhKyjTc6APeq wb/J6W4g63LlC6Q93k4Q==; Received: from localhost ([::1] helo=desiato.infradead.org) by desiato.infradead.org with esmtp (Exim 4.94 #2 (Red Hat Linux)) id 1lK3HT-007Sj9-75; Wed, 10 Mar 2021 18:09:59 +0000 Received: from mail-qt1-x849.google.com ([2607:f8b0:4864:20::849]) by desiato.infradead.org with esmtps (Exim 4.94 #2 (Red Hat Linux)) id 1lK36o-007OTz-C5 for linux-arm-kernel@lists.infradead.org; Wed, 10 Mar 2021 17:59:00 +0000 Received: by mail-qt1-x849.google.com with SMTP id v19so13522779qtw.19 for ; Wed, 10 Mar 2021 09:58:57 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=oJAEWGythqEowVh0n8HMKtwcniboC/S1aRtheNM7IbE=; b=cJzFnkrAT9fGjpNBXYhuuJPuJr8xf3G3JewcI+WPvURd6Io9t0i5rlaNRjeWqYPx/Y eka8vAXibXBnvp1mXYExjSyRNZKktSJI4VnhbJcjj7XalIQPeivVnmJRgECvhF7yYktc QH+jTNqGz17MxFaT24apKPJgDPJ84YCqDw0AQKQoJBDrUW4FUxuafKRVaqUASQCJgt4W zDXjXTbKU655Hi06aEooIs+AEcvNKClEJAQFk4k07xLpfajULWPSGUyKHd3q96HcDkfC OiBee+YVNUnM9DkSFpdLAcxla0asRJhapKsob7jGQ++zkQptlutrfxjQG2DUIKPAUHlR +Avw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=oJAEWGythqEowVh0n8HMKtwcniboC/S1aRtheNM7IbE=; b=f9F08UbauTVQarnBG4ny6B7+I3WzvyiWkYeJKVMGMQH7XEt/l3Q6uoICEeVq8mv2DQ RL6YBqKN3CQH221itAN+l6snFnjOyF2299VF4sVCnyKlfoUWZ1ii3baGhCIIMc1g9182 c81Lo+87ZIMJ3pCNf4m2hrwFAsRSoqakgxUoIPGhGCGMx9vwaUBdLoTHpRt9+45D1FW9 DN9nFMrMktDYUG3RBJ0ESsX7ZAh5vjhoOaUi1jxhv+rTgLA4V5hqCYm2a/q2oy3z1321 mTb4CYLXLRRYOPGlqdOINzihJ3SpsMIxvUhPcLIWZCNOLcBgbdtg2bwmaS5EeVzb6jBG eK6w== X-Gm-Message-State: AOAM5300Pz1VDcFJW9X1HplM4AtVz9fYq8uDLvN3Ek05gcmGM9bzr33K KjzI4v0k4RfR5OUZQFxT2VN/ig8RE9ag X-Google-Smtp-Source: ABdhPJzEM5vQ7ta5jwY/6r7uIre+Ch2NIuo2JzdxZFwKwwT7Uiy3AfJO64LToZY/HChaNQ0THAG1P5zpU4GU X-Received: from r2d2-qp.c.googlers.com ([fda3:e722:ac3:10:28:9cb1:c0a8:1652]) (user=qperret job=sendgmr) by 2002:a0c:b92f:: with SMTP id u47mr4092501qvf.8.1615399136610; Wed, 10 Mar 2021 09:58:56 -0800 (PST) Date: Wed, 10 Mar 2021 17:57:45 +0000 In-Reply-To: <20210310175751.3320106-1-qperret@google.com> Message-Id: <20210310175751.3320106-29-qperret@google.com> Mime-Version: 1.0 References: <20210310175751.3320106-1-qperret@google.com> X-Mailer: git-send-email 2.30.1.766.gb4fecdf3b7-goog Subject: [PATCH v4 28/34] KVM: arm64: Use page-table to track page ownership From: Quentin Perret To: catalin.marinas@arm.com, will@kernel.org, maz@kernel.org, james.morse@arm.com, julien.thierry.kdev@gmail.com, suzuki.poulose@arm.com Cc: android-kvm@google.com, linux-kernel@vger.kernel.org, kernel-team@android.com, kvmarm@lists.cs.columbia.edu, linux-arm-kernel@lists.infradead.org, tabba@google.com, mark.rutland@arm.com, dbrazdil@google.com, mate.toth-pal@arm.com, seanjc@google.com, qperret@google.com, robh+dt@kernel.org, ardb@kernel.org X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20210310_175858_785725_C734FE28 X-CRM114-Status: GOOD ( 26.75 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org As the host stage 2 will be identity mapped, all the .hyp memory regions and/or memory pages donated to protected guestis will have to marked invalid in the host stage 2 page-table. At the same time, the hypervisor will need a way to track the ownership of each physical page to ensure memory sharing or donation between entities (host, guests, hypervisor) is legal. In order to enable this tracking at EL2, let's use the host stage 2 page-table itself. The idea is to use the top bits of invalid mappings to store the unique identifier of the page owner. The page-table owner (the host) gets identifier 0 such that, at boot time, it owns the entire IPA space as the pgd starts zeroed. Provide kvm_pgtable_stage2_set_owner() which allows to modify the ownership of pages in the host stage 2. It re-uses most of the map() logic, but ends up creating invalid mappings instead. This impacts how we do refcount as we now need to count invalid mappings when they are used for ownership tracking. Signed-off-by: Quentin Perret --- arch/arm64/include/asm/kvm_pgtable.h | 21 +++++++ arch/arm64/kvm/hyp/pgtable.c | 92 ++++++++++++++++++++++++---- 2 files changed, 101 insertions(+), 12 deletions(-) diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h index 4ae19247837b..b09af4612656 100644 --- a/arch/arm64/include/asm/kvm_pgtable.h +++ b/arch/arm64/include/asm/kvm_pgtable.h @@ -238,6 +238,27 @@ int kvm_pgtable_stage2_map(struct kvm_pgtable *pgt, u64 addr, u64 size, u64 phys, enum kvm_pgtable_prot prot, void *mc); +/** + * kvm_pgtable_stage2_set_owner() - Annotate invalid mappings with metadata + * encoding the ownership of a page in the + * IPA space. + * @pgt: Page-table structure initialised by kvm_pgtable_stage2_init(). + * @addr: Intermediate physical address at which to place the annotation. + * @size: Size of the IPA range to annotate. + * @mc: Cache of pre-allocated and zeroed memory from which to allocate + * page-table pages. + * @owner_id: Unique identifier for the owner of the page. + * + * The page-table owner has identifier 0. This function can be used to mark + * portions of the IPA space as owned by other entities. When a stage 2 is used + * with identity-mappings, these annotations allow to use the page-table data + * structure as a simple rmap. + * + * Return: 0 on success, negative error code on failure. + */ +int kvm_pgtable_stage2_set_owner(struct kvm_pgtable *pgt, u64 addr, u64 size, + void *mc, u32 owner_id); + /** * kvm_pgtable_stage2_unmap() - Remove a mapping from a guest stage-2 page-table. * @pgt: Page-table structure initialised by kvm_pgtable_stage2_init(). diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c index f37b4179b880..e4670b639726 100644 --- a/arch/arm64/kvm/hyp/pgtable.c +++ b/arch/arm64/kvm/hyp/pgtable.c @@ -48,6 +48,8 @@ KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W | \ KVM_PTE_LEAF_ATTR_HI_S2_XN) +#define KVM_INVALID_PTE_OWNER_MASK GENMASK(63, 32) + struct kvm_pgtable_walk_data { struct kvm_pgtable *pgt; struct kvm_pgtable_walker *walker; @@ -186,6 +188,11 @@ static kvm_pte_t kvm_init_valid_leaf_pte(u64 pa, kvm_pte_t attr, u32 level) return pte; } +static kvm_pte_t kvm_init_invalid_leaf_owner(u32 owner_id) +{ + return FIELD_PREP(KVM_INVALID_PTE_OWNER_MASK, owner_id); +} + static int kvm_pgtable_visitor_cb(struct kvm_pgtable_walk_data *data, u64 addr, u32 level, kvm_pte_t *ptep, enum kvm_pgtable_walk_flags flag) @@ -440,6 +447,7 @@ void kvm_pgtable_hyp_destroy(struct kvm_pgtable *pgt) struct stage2_map_data { u64 phys; kvm_pte_t attr; + u32 owner_id; kvm_pte_t *anchor; kvm_pte_t *childp; @@ -506,6 +514,24 @@ static int stage2_map_set_prot_attr(enum kvm_pgtable_prot prot, return 0; } +static bool stage2_is_permission_change(kvm_pte_t old, kvm_pte_t new) +{ + if (!kvm_pte_valid(old) || !kvm_pte_valid(new)) + return false; + + return !((old ^ new) & (~KVM_PTE_LEAF_ATTR_S2_PERMS)); +} + +static bool stage2_pte_is_counted(kvm_pte_t pte) +{ + /* + * The refcount tracks valid entries as well as invalid entries if they + * encode ownership of a page to another entity than the page-table + * owner, whose id is 0. + */ + return !!pte; +} + static int stage2_map_walker_try_leaf(u64 addr, u64 end, u32 level, kvm_pte_t *ptep, struct stage2_map_data *data) @@ -517,28 +543,36 @@ static int stage2_map_walker_try_leaf(u64 addr, u64 end, u32 level, if (!kvm_block_mapping_supported(addr, end, phys, level)) return -E2BIG; - new = kvm_init_valid_leaf_pte(phys, data->attr, level); - if (kvm_pte_valid(old)) { + if (kvm_pte_valid(data->attr)) + new = kvm_init_valid_leaf_pte(phys, data->attr, level); + else + new = kvm_init_invalid_leaf_owner(data->owner_id); + + if (stage2_pte_is_counted(old)) { /* * Skip updating the PTE if we are trying to recreate the exact * same mapping or only change the access permissions. Instead, * the vCPU will exit one more time from guest if still needed * and then go through the path of relaxing permissions. */ - if (!((old ^ new) & (~KVM_PTE_LEAF_ATTR_S2_PERMS))) + if (stage2_is_permission_change(old, new)) return -EAGAIN; /* - * There's an existing different valid leaf entry, so perform - * break-before-make. + * Clear the existing PTE, and perform break-before-make with + * TLB maintenance if it was valid. */ kvm_clear_pte(ptep); - kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, data->mmu, addr, level); + if (kvm_pte_valid(old)) { + kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, data->mmu, addr, + level); + } mm_ops->put_page(ptep); } smp_store_release(ptep, new); - mm_ops->get_page(ptep); + if (stage2_pte_is_counted(new)) + mm_ops->get_page(ptep); data->phys += granule; return 0; } @@ -574,7 +608,7 @@ static int stage2_map_walk_leaf(u64 addr, u64 end, u32 level, kvm_pte_t *ptep, int ret; if (data->anchor) { - if (kvm_pte_valid(pte)) + if (stage2_pte_is_counted(pte)) mm_ops->put_page(ptep); return 0; @@ -599,9 +633,10 @@ static int stage2_map_walk_leaf(u64 addr, u64 end, u32 level, kvm_pte_t *ptep, * a table. Accesses beyond 'end' that fall within the new table * will be mapped lazily. */ - if (kvm_pte_valid(pte)) { + if (stage2_pte_is_counted(pte)) { kvm_clear_pte(ptep); - kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, data->mmu, addr, level); + if (kvm_pte_valid(pte)) + kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, data->mmu, addr, level); mm_ops->put_page(ptep); } @@ -683,6 +718,7 @@ int kvm_pgtable_stage2_map(struct kvm_pgtable *pgt, u64 addr, u64 size, .mmu = pgt->mmu, .memcache = mc, .mm_ops = pgt->mm_ops, + .owner_id = 0, }; struct kvm_pgtable_walker walker = { .cb = stage2_map_walker, @@ -696,6 +732,33 @@ int kvm_pgtable_stage2_map(struct kvm_pgtable *pgt, u64 addr, u64 size, if (ret) return ret; + /* Set the valid flag to distinguish with the set_owner() path. */ + map_data.attr |= KVM_PTE_VALID; + + ret = kvm_pgtable_walk(pgt, addr, size, &walker); + dsb(ishst); + return ret; +} + +int kvm_pgtable_stage2_set_owner(struct kvm_pgtable *pgt, u64 addr, u64 size, + void *mc, u32 owner_id) +{ + int ret; + struct stage2_map_data map_data = { + .mmu = pgt->mmu, + .memcache = mc, + .mm_ops = pgt->mm_ops, + .owner_id = owner_id, + .attr = 0, + }; + struct kvm_pgtable_walker walker = { + .cb = stage2_map_walker, + .flags = KVM_PGTABLE_WALK_TABLE_PRE | + KVM_PGTABLE_WALK_LEAF | + KVM_PGTABLE_WALK_TABLE_POST, + .arg = &map_data, + }; + ret = kvm_pgtable_walk(pgt, addr, size, &walker); dsb(ishst); return ret; @@ -725,8 +788,13 @@ static int stage2_unmap_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep, kvm_pte_t pte = *ptep, *childp = NULL; bool need_flush = false; - if (!kvm_pte_valid(pte)) + if (!kvm_pte_valid(pte)) { + if (stage2_pte_is_counted(pte)) { + kvm_clear_pte(ptep); + mm_ops->put_page(ptep); + } return 0; + } if (kvm_pte_table(pte, level)) { childp = kvm_pte_follow(pte, mm_ops); @@ -948,7 +1016,7 @@ static int stage2_free_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep, struct kvm_pgtable_mm_ops *mm_ops = arg; kvm_pte_t pte = *ptep; - if (!kvm_pte_valid(pte)) + if (!stage2_pte_is_counted(pte)) return 0; mm_ops->put_page(ptep);