From patchwork Mon Nov 4 13:31:50 2024
X-Patchwork-Submitter: Quentin Perret <qperret@google.com>
X-Patchwork-Id: 13861439
Date: Mon, 4 Nov 2024 13:31:50 +0000
In-Reply-To: <20241104133204.85208-1-qperret@google.com>
References: <20241104133204.85208-1-qperret@google.com>
Message-ID: <20241104133204.85208-5-qperret@google.com>
Subject: [PATCH 04/18] KVM: arm64: Move host page ownership tracking to the
 hyp vmemmap
From: Quentin Perret <qperret@google.com>
To: Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
 Catalin Marinas, Will Deacon
Cc: Fuad Tabba, Vincent Donnefort, Sebastian Ene,
 linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
 linux-kernel@vger.kernel.org

We currently store part of the page-tracking state in PTE software bits
for the host, guests and the hypervisor. This is sub-optimal when e.g.
sharing pages, as it forces us to break block mappings purely to support
this software tracking. It leaves the host with an unnecessarily
fragmented stage-2 page-table, in particular when it shares pages with
Secure, which can lead to measurable regressions. Moreover, storing this
state in the page-table forces us to do multiple costly walks on the
page transition path, adding overhead.

To work around these problems, move the host-side page-tracking logic
from SW bits in its stage-2 PTEs to the hypervisor's vmemmap.

Signed-off-by: Quentin Perret <qperret@google.com>
---
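A note for reviewers unfamiliar with the approach: the essence of the
change is to replace per-PTE software bits with one byte of state per
physical page in a flat array (the hyp vmemmap), so that checking or
updating the state of a range becomes a linear scan over that array
rather than a stage-2 page-table walk, and block mappings never have to
be broken just to record state. The sketch below is purely illustrative
and standalone; every name in it (SKETCH_*, page_state,
check_range_state, set_range_state) is hypothetical rather than the
kernel's actual API.

/*
 * Illustrative sketch only -- not part of this patch. The real
 * implementation uses the new host_state field of struct hyp_page
 * in the hyp vmemmap.
 */
#include <stdint.h>

enum page_state {
	PG_OWNED,
	PG_SHARED_OWNED,
	PG_SHARED_BORROWED,
	PG_NOPAGE,
};

#define SKETCH_PAGE_SHIFT	12		/* assume 4KiB pages */
#define SKETCH_NR_PAGES		(1UL << 20)	/* assume 4GiB of memory */

/* One byte of ownership state per physical page, indexed by PFN. */
static uint8_t page_state[SKETCH_NR_PAGES];

/*
 * Check that every page in [addr, addr + size) is in 'state', with
 * addr and size assumed page-aligned as in the kernel code. This is
 * a simple linear scan: no page-table walk, and no need to fragment
 * block mappings just to store the state.
 */
static int check_range_state(uint64_t addr, uint64_t size, enum page_state state)
{
	uint64_t pfn = addr >> SKETCH_PAGE_SHIFT;
	uint64_t end_pfn = (addr + size) >> SKETCH_PAGE_SHIFT;

	for (; pfn < end_pfn; pfn++) {
		if (page_state[pfn] != state)
			return -1;	/* the real code returns -EPERM */
	}

	return 0;
}

/* Update every page in [addr, addr + size) to 'state'. */
static void set_range_state(uint64_t addr, uint64_t size, enum page_state state)
{
	uint64_t pfn = addr >> SKETCH_PAGE_SHIFT;
	uint64_t end_pfn = (addr + size) >> SKETCH_PAGE_SHIFT;

	for (; pfn < end_pfn; pfn++)
		page_state[pfn] = (uint8_t)state;
}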
 arch/arm64/kvm/hyp/include/nvhe/memory.h |  6 +-
 arch/arm64/kvm/hyp/nvhe/mem_protect.c    | 94 ++++++++++++++++--------
 arch/arm64/kvm/hyp/nvhe/setup.c          |  7 +-
 3 files changed, 71 insertions(+), 36 deletions(-)

diff --git a/arch/arm64/kvm/hyp/include/nvhe/memory.h b/arch/arm64/kvm/hyp/include/nvhe/memory.h
index 88cb8ff9e769..08f3a0416d4c 100644
--- a/arch/arm64/kvm/hyp/include/nvhe/memory.h
+++ b/arch/arm64/kvm/hyp/include/nvhe/memory.h
@@ -8,7 +8,7 @@
 #include <linux/types.h>
 
 /*
- * SW bits 0-1 are reserved to track the memory ownership state of each page:
+ * Bits 0-1 are reserved to track the memory ownership state of each page:
  *   00: The page is owned exclusively by the page-table owner.
  *   01: The page is owned by the page-table owner, but is shared
  *       with another entity.
@@ -44,7 +44,9 @@ static inline enum pkvm_page_state pkvm_getstate(enum kvm_pgtable_prot prot)
 struct hyp_page {
 	u16 refcount;
 	u8 order;
-	u8 reserved;
+
+	/* Host (non-meta) state. Guarded by the host stage-2 lock. */
+	enum pkvm_page_state host_state : 8;
 };
 
 extern u64 __hyp_vmemmap;
diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
index caba3e4bd09e..1595081c4f6b 100644
--- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c
+++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
@@ -201,8 +201,8 @@ static void *guest_s2_zalloc_page(void *mc)
 
 	memset(addr, 0, PAGE_SIZE);
 	p = hyp_virt_to_page(addr);
-	memset(p, 0, sizeof(*p));
 	p->refcount = 1;
+	p->order = 0;
 
 	return addr;
 }
@@ -268,6 +268,7 @@ int kvm_guest_prepare_stage2(struct pkvm_hyp_vm *vm, void *pgd)
 
 void reclaim_guest_pages(struct pkvm_hyp_vm *vm, struct kvm_hyp_memcache *mc)
 {
+	struct hyp_page *page;
 	void *addr;
 
 	/* Dump all pgtable pages in the hyp_pool */
@@ -279,7 +280,9 @@ void reclaim_guest_pages(struct pkvm_hyp_vm *vm, struct kvm_hyp_memcache *mc)
 	/* Drain the hyp_pool into the memcache */
 	addr = hyp_alloc_pages(&vm->pool, 0);
 	while (addr) {
-		memset(hyp_virt_to_page(addr), 0, sizeof(struct hyp_page));
+		page = hyp_virt_to_page(addr);
+		page->refcount = 0;
+		page->order = 0;
 		push_hyp_memcache(mc, addr, hyp_virt_to_phys);
 		WARN_ON(__pkvm_hyp_donate_host(hyp_virt_to_pfn(addr), 1));
 		addr = hyp_alloc_pages(&vm->pool, 0);
@@ -382,19 +385,25 @@ bool addr_is_memory(phys_addr_t phys)
 	return !!find_mem_range(phys, &range);
 }
 
-static bool addr_is_allowed_memory(phys_addr_t phys)
+static bool is_in_mem_range(u64 addr, struct kvm_mem_range *range)
+{
+	return range->start <= addr && addr < range->end;
+}
+
+static int range_is_allowed_memory(u64 start, u64 end)
 {
 	struct memblock_region *reg;
 	struct kvm_mem_range range;
 
-	reg = find_mem_range(phys, &range);
+	/* Can't check the state of both MMIO and memory regions at once */
+	reg = find_mem_range(start, &range);
+	if (!is_in_mem_range(end - 1, &range))
+		return -EINVAL;
 
-	return reg && !(reg->flags & MEMBLOCK_NOMAP);
-}
+	if (!reg || reg->flags & MEMBLOCK_NOMAP)
+		return -EPERM;
 
-static bool is_in_mem_range(u64 addr, struct kvm_mem_range *range)
-{
-	return range->start <= addr && addr < range->end;
+	return 0;
 }
 
 static bool range_is_memory(u64 start, u64 end)
@@ -454,8 +463,11 @@ static int host_stage2_adjust_range(u64 addr, struct kvm_mem_range *range)
 	if (kvm_pte_valid(pte))
 		return -EAGAIN;
 
-	if (pte)
+	if (pte) {
+		WARN_ON(addr_is_memory(addr) &&
+			!(hyp_phys_to_page(addr)->host_state & PKVM_NOPAGE));
 		return -EPERM;
+	}
 
 	do {
 		u64 granule = kvm_granule_size(level);
@@ -477,10 +489,29 @@ int host_stage2_idmap_locked(phys_addr_t addr, u64 size,
 	return host_stage2_try(__host_stage2_idmap, addr, addr + size, prot);
 }
 
+static void __host_update_page_state(phys_addr_t addr, u64 size, enum pkvm_page_state state)
+{
+	phys_addr_t end = addr + size;
+	for (; addr < end; addr += PAGE_SIZE)
+		hyp_phys_to_page(addr)->host_state = state;
+}
+
 int host_stage2_set_owner_locked(phys_addr_t addr, u64 size, u8 owner_id)
 {
-	return host_stage2_try(kvm_pgtable_stage2_set_owner, &host_mmu.pgt,
-			       addr, size, &host_s2_pool, owner_id);
+	int ret;
+
+	ret = host_stage2_try(kvm_pgtable_stage2_set_owner, &host_mmu.pgt,
+			      addr, size, &host_s2_pool, owner_id);
+	if (ret || !addr_is_memory(addr))
+		return ret;
+
+	/* Don't forget to update the vmemmap tracking for the host */
+	if (owner_id == PKVM_ID_HOST)
+		__host_update_page_state(addr, size, PKVM_PAGE_OWNED);
+	else
+		__host_update_page_state(addr, size, PKVM_NOPAGE);
+
+	return 0;
 }
 
 static bool host_stage2_force_pte_cb(u64 addr, u64 end, enum kvm_pgtable_prot prot)
@@ -604,35 +635,38 @@ static int check_page_state_range(struct kvm_pgtable *pgt, u64 addr, u64 size,
 	return kvm_pgtable_walk(pgt, addr, size, &walker);
 }
 
-static enum pkvm_page_state host_get_page_state(kvm_pte_t pte, u64 addr)
-{
-	if (!addr_is_allowed_memory(addr))
-		return PKVM_NOPAGE;
-
-	if (!kvm_pte_valid(pte) && pte)
-		return PKVM_NOPAGE;
-
-	return pkvm_getstate(kvm_pgtable_stage2_pte_prot(pte));
-}
-
 static int __host_check_page_state_range(u64 addr, u64 size,
 					 enum pkvm_page_state state)
 {
-	struct check_walk_data d = {
-		.desired = state,
-		.get_page_state = host_get_page_state,
-	};
+	u64 end = addr + size;
+	int ret;
+
+	ret = range_is_allowed_memory(addr, end);
+	if (ret)
+		return ret;
 
 	hyp_assert_lock_held(&host_mmu.lock);
-	return check_page_state_range(&host_mmu.pgt, addr, size, &d);
+	for (; addr < end; addr += PAGE_SIZE) {
+		if (hyp_phys_to_page(addr)->host_state != state)
+			return -EPERM;
+	}
+
+	return 0;
 }
 
 static int __host_set_page_state_range(u64 addr, u64 size,
 				       enum pkvm_page_state state)
 {
-	enum kvm_pgtable_prot prot = pkvm_mkstate(PKVM_HOST_MEM_PROT, state);
+	if (hyp_phys_to_page(addr)->host_state & PKVM_NOPAGE) {
+		int ret = host_stage2_idmap_locked(addr, size, PKVM_HOST_MEM_PROT);
 
-	return host_stage2_idmap_locked(addr, size, prot);
+		if (ret)
+			return ret;
+	}
+
+	__host_update_page_state(addr, size, state);
+
+	return 0;
 }
 
 static int host_request_owned_transition(u64 *completer_addr,
diff --git a/arch/arm64/kvm/hyp/nvhe/setup.c b/arch/arm64/kvm/hyp/nvhe/setup.c
index 174007f3fadd..c315710f57ad 100644
--- a/arch/arm64/kvm/hyp/nvhe/setup.c
+++ b/arch/arm64/kvm/hyp/nvhe/setup.c
@@ -198,7 +198,6 @@ static void hpool_put_page(void *addr)
 static int fix_host_ownership_walker(const struct kvm_pgtable_visit_ctx *ctx,
 				     enum kvm_pgtable_walk_flags visit)
 {
-	enum kvm_pgtable_prot prot;
 	enum pkvm_page_state state;
 	phys_addr_t phys;
 
@@ -221,16 +220,16 @@ static int fix_host_ownership_walker(const struct kvm_pgtable_visit_ctx *ctx,
 	case PKVM_PAGE_OWNED:
 		return host_stage2_set_owner_locked(phys, PAGE_SIZE, PKVM_ID_HYP);
 	case PKVM_PAGE_SHARED_OWNED:
-		prot = pkvm_mkstate(PKVM_HOST_MEM_PROT, PKVM_PAGE_SHARED_BORROWED);
+		hyp_phys_to_page(phys)->host_state = PKVM_PAGE_SHARED_BORROWED;
 		break;
 	case PKVM_PAGE_SHARED_BORROWED:
-		prot = pkvm_mkstate(PKVM_HOST_MEM_PROT, PKVM_PAGE_SHARED_OWNED);
+		hyp_phys_to_page(phys)->host_state = PKVM_PAGE_SHARED_OWNED;
 		break;
 	default:
 		return -EINVAL;
 	}
 
-	return host_stage2_idmap_locked(phys, PAGE_SIZE, prot);
+	return 0;
 }
 
 static int fix_hyp_pgtable_refcnt_walker(const struct kvm_pgtable_visit_ctx *ctx,