From patchwork Wed Feb 1 12:52:59 2023
From: Jean-Philippe Brucker <jean-philippe@linaro.org>
To: maz@kernel.org, catalin.marinas@arm.com, will@kernel.org, joro@8bytes.org
Cc: robin.murphy@arm.com, james.morse@arm.com, suzuki.poulose@arm.com,
 oliver.upton@linux.dev, yuzenghui@huawei.com, smostafa@google.com,
 dbrazdil@google.com, ryan.roberts@arm.com,
 linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
 iommu@lists.linux.dev, Jean-Philippe Brucker <jean-philippe@linaro.org>
Subject: [RFC PATCH 15/45] KVM: arm64: pkvm: Add __pkvm_host_share/unshare_dma()
Date: Wed, 1 Feb 2023 12:52:59 +0000
Message-Id: <20230201125328.2186498-16-jean-philippe@linaro.org>
In-Reply-To: <20230201125328.2186498-1-jean-philippe@linaro.org>
References: <20230201125328.2186498-1-jean-philippe@linaro.org>

Host pages mapped in the SMMU must not be donated to the guest or
hypervisor, since the host could then use DMA to break confidentiality.
Mark them shared in the host stage-2 page tables, and keep a refcount in
the hyp vmemmap.
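For illustration, here is roughly how a hypervisor-side IOMMU driver is
expected to use these helpers around its map path. This is only a sketch:
struct smmu_domain and smmu_map_pages() are hypothetical placeholder
names, not interfaces added by this series.

/*
 * Illustrative sketch only: share the pages with PKVM_ID_IOMMU before
 * installing the SMMU mapping, and roll back the share if the map
 * fails. struct smmu_domain and smmu_map_pages() are placeholders for
 * whatever the hyp IOMMU driver provides.
 */
static int smmu_domain_map(struct smmu_domain *domain, unsigned long iova,
			   phys_addr_t paddr, size_t size)
{
	int ret;

	/* Marks the pages shared in the host stage-2, takes one ref each */
	ret = __pkvm_host_share_dma(paddr, size, true);
	if (ret)
		return ret;

	ret = smmu_map_pages(domain, iova, paddr, size);
	if (ret)
		/* Drops the refs; the last unshare restores host ownership */
		__pkvm_host_unshare_dma(paddr, size);

	return ret;
}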
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
---
 arch/arm64/kvm/hyp/include/nvhe/mem_protect.h |   3 +
 arch/arm64/kvm/hyp/nvhe/mem_protect.c         | 185 ++++++++++++++++++
 2 files changed, 188 insertions(+)

diff --git a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
index 021825aee854..a363d58a998b 100644
--- a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
+++ b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
@@ -58,6 +58,7 @@ enum pkvm_component_id {
 	PKVM_ID_HOST,
 	PKVM_ID_HYP,
 	PKVM_ID_GUEST,
+	PKVM_ID_IOMMU,
 };
 
 extern unsigned long hyp_nr_cpus;
@@ -72,6 +73,8 @@ int __pkvm_host_share_guest(u64 pfn, u64 gfn, struct pkvm_hyp_vcpu *vcpu);
 int __pkvm_host_donate_guest(u64 pfn, u64 gfn, struct pkvm_hyp_vcpu *vcpu);
 int __pkvm_guest_share_host(struct pkvm_hyp_vcpu *hyp_vcpu, u64 ipa);
 int __pkvm_guest_unshare_host(struct pkvm_hyp_vcpu *hyp_vcpu, u64 ipa);
+int __pkvm_host_share_dma(u64 phys_addr, size_t size, bool is_ram);
+int __pkvm_host_unshare_dma(u64 phys_addr, size_t size);
 
 bool addr_is_memory(phys_addr_t phys);
 int host_stage2_idmap_locked(phys_addr_t addr, u64 size, enum kvm_pgtable_prot prot);
diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
index 856673291d70..dcf08ce03790 100644
--- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c
+++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
@@ -1148,6 +1148,9 @@ static int check_share(struct pkvm_mem_share *share)
 	case PKVM_ID_GUEST:
 		ret = guest_ack_share(completer_addr, tx, share->completer_prot);
 		break;
+	case PKVM_ID_IOMMU:
+		ret = 0;
+		break;
 	default:
 		ret = -EINVAL;
 	}
@@ -1185,6 +1188,9 @@ static int __do_share(struct pkvm_mem_share *share)
 	case PKVM_ID_GUEST:
 		ret = guest_complete_share(completer_addr, tx, share->completer_prot);
 		break;
+	case PKVM_ID_IOMMU:
+		ret = 0;
+		break;
 	default:
 		ret = -EINVAL;
 	}
@@ -1239,6 +1245,9 @@ static int check_unshare(struct pkvm_mem_share *share)
 	case PKVM_ID_HYP:
 		ret = hyp_ack_unshare(completer_addr, tx);
 		break;
+	case PKVM_ID_IOMMU:
+		ret = 0;
+		break;
 	default:
 		ret = -EINVAL;
 	}
@@ -1273,6 +1282,9 @@ static int __do_unshare(struct pkvm_mem_share *share)
 	case PKVM_ID_HYP:
 		ret = hyp_complete_unshare(completer_addr, tx);
 		break;
+	case PKVM_ID_IOMMU:
+		ret = 0;
+		break;
 	default:
 		ret = -EINVAL;
 	}
@@ -1633,6 +1645,179 @@ void hyp_unpin_shared_mem(void *from, void *to)
 	host_unlock_component();
 }
 
+static int __host_check_page_dma_shared(phys_addr_t phys_addr)
+{
+	int ret;
+	u64 hyp_addr;
+
+	/*
+	 * The page is already refcounted. Make sure it's owned by the host,
+	 * and not part of the hyp pool.
+	 */
+	ret = __host_check_page_state_range(phys_addr, PAGE_SIZE,
+					    PKVM_PAGE_SHARED_OWNED);
+	if (ret)
+		return ret;
+
+	/*
+	 * Refcounted and owned by the host means it's either mapped in the
+	 * SMMU, or it's some VM/VCPU state shared with the hypervisor. The
+	 * host has no reason to use a page for both.
+	 */
+	hyp_addr = (u64)hyp_phys_to_virt(phys_addr);
+	return __hyp_check_page_state_range(hyp_addr, PAGE_SIZE, PKVM_NOPAGE);
+}
+
+static int __pkvm_host_share_dma_page(phys_addr_t phys_addr, bool is_ram)
+{
+	int ret;
+	struct hyp_page *p = hyp_phys_to_page(phys_addr);
+	struct pkvm_mem_share share = {
+		.tx	= {
+			.nr_pages	= 1,
+			.initiator	= {
+				.id	= PKVM_ID_HOST,
+				.addr	= phys_addr,
+			},
+			.completer	= {
+				.id	= PKVM_ID_IOMMU,
+			},
+		},
+	};
+
+	hyp_assert_lock_held(&host_mmu.lock);
+	hyp_assert_lock_held(&pkvm_pgd_lock);
+
+	/*
+	 * Some differences between handling of RAM and device memory:
+	 * - The hyp vmemmap area for device memory is not backed by physical
+	 *   pages in the hyp page tables.
+	 * - Device memory is unmapped automatically under memory pressure
+	 *   (host_stage2_try()) and the ownership information would be
+	 *   discarded.
+	 * We don't need to deal with that at the moment, because the host
+	 * cannot share or donate device memory, only RAM.
+	 *
+	 * Since 'is_ram' is only a hint provided by the host, we still need
+	 * to verify it.
+	 */
+	if (!is_ram)
+		return addr_is_memory(phys_addr) ? -EINVAL : 0;
+
+	ret = hyp_page_ref_inc_return(p);
+	BUG_ON(ret == 0);
+	if (ret < 0)
+		return ret;
+	else if (ret == 1)
+		ret = do_share(&share);
+	else
+		ret = __host_check_page_dma_shared(phys_addr);
+
+	if (ret)
+		hyp_page_ref_dec(p);
+
+	return ret;
+}
+
+static int __pkvm_host_unshare_dma_page(phys_addr_t phys_addr)
+{
+	struct hyp_page *p = hyp_phys_to_page(phys_addr);
+	struct pkvm_mem_share share = {
+		.tx	= {
+			.nr_pages	= 1,
+			.initiator	= {
+				.id	= PKVM_ID_HOST,
+				.addr	= phys_addr,
+			},
+			.completer	= {
+				.id	= PKVM_ID_IOMMU,
+			},
+		},
+	};
+
+	hyp_assert_lock_held(&host_mmu.lock);
+	hyp_assert_lock_held(&pkvm_pgd_lock);
+
+	if (!addr_is_memory(phys_addr))
+		return 0;
+
+	if (!hyp_page_ref_dec_and_test(p))
+		return 0;
+
+	return do_unshare(&share);
+}
+
+/*
+ * __pkvm_host_share_dma - Mark host memory as used for DMA
+ * @phys_addr:	physical address of the DMA region
+ * @size:	size of the DMA region
+ * @is_ram:	whether it is RAM or device memory
+ *
+ * We must not allow the host to donate pages that are mapped in the IOMMU for
+ * DMA. So:
+ * 1. Mark the host stage-2 entry as shared with the IOMMU.
+ * 2. Refcount it, since a page may be mapped in multiple device address
+ *    spaces.
+ *
+ * At some point we may end up needing more than the current 16 bits for
+ * refcounting, for example if all devices and sub-devices map the same MSI
+ * doorbell page. It will do for now.
+ */
+int __pkvm_host_share_dma(phys_addr_t phys_addr, size_t size, bool is_ram)
+{
+	int i;
+	int ret = 0;
+	size_t nr_pages = size >> PAGE_SHIFT;
+
+	if (WARN_ON(!PAGE_ALIGNED(phys_addr | size)))
+		return -EINVAL;
+
+	host_lock_component();
+	hyp_lock_component();
+
+	for (i = 0; i < nr_pages; i++) {
+		ret = __pkvm_host_share_dma_page(phys_addr + i * PAGE_SIZE,
+						 is_ram);
+		if (ret)
+			break;
+	}
+
+	if (ret) {
+		for (--i; i >= 0; --i)
+			__pkvm_host_unshare_dma_page(phys_addr + i * PAGE_SIZE);
+	}
+
+	hyp_unlock_component();
+	host_unlock_component();
+
+	return ret;
+}
+
+int __pkvm_host_unshare_dma(phys_addr_t phys_addr, size_t size)
+{
+	int i;
+	int ret = 0;
+	size_t nr_pages = size >> PAGE_SHIFT;
+
+	host_lock_component();
+	hyp_lock_component();
+
+	/*
+	 * We get here after the caller successfully unmapped the page from
+	 * the IOMMU table, which means a reference is held and the page is
+	 * shared in the host stage-2, so this should not fail.
+	 */
+	for (i = 0; i < nr_pages; i++) {
+		ret = __pkvm_host_unshare_dma_page(phys_addr + i * PAGE_SIZE);
+		if (ret)
+			break;
+	}
+
+	hyp_unlock_component();
+	host_unlock_component();
+
+	return ret;
+}
+
 int __pkvm_host_share_guest(u64 pfn, u64 gfn, struct pkvm_hyp_vcpu *vcpu)
 {
 	int ret;
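
To make the refcounting rules above concrete: only the first share of a
page performs the host stage-2 transition, later shares just validate the
state and take another reference, and only the last unshare reverts the
transition. Below is a minimal standalone model of that state machine;
every name in it is invented for illustration and none of it is kernel
code.

#include <assert.h>
#include <stdbool.h>

/* Simplified model of one page's DMA-sharing state. */
struct page_model {
	int refcount;		/* models the hyp vmemmap refcount */
	bool shared_dma;	/* models PKVM_PAGE_SHARED_OWNED toward the IOMMU */
};

static void share_dma(struct page_model *p)
{
	if (p->refcount++ == 0)
		p->shared_dma = true;	/* first share: stage-2 transition */
	else
		assert(p->shared_dma);	/* later shares: state must already match */
}

static void unshare_dma(struct page_model *p)
{
	assert(p->refcount > 0 && p->shared_dma);
	if (--p->refcount == 0)
		p->shared_dma = false;	/* last unshare: host may donate again */
}

int main(void)
{
	struct page_model page = { 0 };

	/* The same page mapped in two device address spaces. */
	share_dma(&page);
	share_dma(&page);
	unshare_dma(&page);
	assert(page.shared_dma);	/* still mapped in one domain */
	unshare_dma(&page);
	assert(!page.shared_dma);	/* fully unshared */
	return 0;
}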