[v5,06/16] x86/tdx: Make DMA pages shared

Message ID	20211009003711.1390019-7-sathyanarayanan.kuppuswamy@linux.intel.com (mailing list archive)
State	Not Applicable
Delegated to:	Bjorn Helgaas
Headers	show Return-Path: <linux-pci-owner@kernel.org> X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D3234C433EF for <linux-pci@archiver.kernel.org>; Sat, 9 Oct 2021 00:37:51 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id BD1EB60F90 for <linux-pci@archiver.kernel.org>; Sat, 9 Oct 2021 00:37:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S244178AbhJIAjp (ORCPT <rfc822;linux-pci@archiver.kernel.org>); Fri, 8 Oct 2021 20:39:45 -0400 Received: from mga02.intel.com ([134.134.136.20]:5254 "EHLO mga02.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S244087AbhJIAjg (ORCPT <rfc822;linux-pci@vger.kernel.org>); Fri, 8 Oct 2021 20:39:36 -0400 X-IronPort-AV: E=McAfee;i="6200,9189,10131"; a="213756506" X-IronPort-AV: E=Sophos;i="5.85,358,1624345200"; d="scan'208";a="213756506" Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Oct 2021 17:37:39 -0700 X-IronPort-AV: E=Sophos;i="5.85,358,1624345200"; d="scan'208";a="624905358" Received: from dmsojoza-mobl3.amr.corp.intel.com (HELO skuppusw-desk1.amr.corp.intel.com) ([10.251.135.62]) by fmsmga001-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Oct 2021 17:37:37 -0700 From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> To: Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>, Peter Zijlstra <peterz@infradead.org>, Andy Lutomirski <luto@kernel.org>, Bjorn Helgaas <bhelgaas@google.com>, Richard Henderson <rth@twiddle.net>, Thomas Bogendoerfer <tsbogend@alpha.franken.de>, James E J Bottomley <James.Bottomley@HansenPartnership.com>, Helge Deller <deller@gmx.de>, "David S . Miller" <davem@davemloft.net>, Arnd Bergmann <arnd@arndb.de>, Jonathan Corbet <corbet@lwn.net>, "Michael S . Tsirkin" <mst@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>, David Hildenbrand <david@redhat.com>, Andrea Arcangeli <aarcange@redhat.com>, Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter H Anvin <hpa@zytor.com>, Dave Hansen <dave.hansen@intel.com>, Tony Luck <tony.luck@intel.com>, Dan Williams <dan.j.williams@intel.com>, Andi Kleen <ak@linux.intel.com>, Kirill Shutemov <kirill.shutemov@linux.intel.com>, Sean Christopherson <seanjc@google.com>, Kuppuswamy Sathyanarayanan <knsathya@kernel.org>, Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>, x86@kernel.org, linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org, linux-alpha@vger.kernel.org, linux-mips@vger.kernel.org, linux-parisc@vger.kernel.org, sparclinux@vger.kernel.org, linux-arch@vger.kernel.org, linux-doc@vger.kernel.org, virtualization@lists.linux-foundation.org Subject: [PATCH v5 06/16] x86/tdx: Make DMA pages shared Date: Fri, 8 Oct 2021 17:37:01 -0700 Message-Id: <20211009003711.1390019-7-sathyanarayanan.kuppuswamy@linux.intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20211009003711.1390019-1-sathyanarayanan.kuppuswamy@linux.intel.com> References: <20211009003711.1390019-1-sathyanarayanan.kuppuswamy@linux.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: <linux-pci.vger.kernel.org> X-Mailing-List: linux-pci@vger.kernel.org
Series	Add TDX Guest Support (shared-mm support) \| expand [v5,00/16] Add TDX Guest Support (shared-mm support) [v5,01/16] x86/mm: Move force_dma_unencrypted() to common code [v5,02/16] x86/tdx: Get TD execution environment information via TDINFO [v5,03/16] x86/tdx: Exclude Shared bit from physical_mask [v5,04/16] x86/tdx: Make pages shared in ioremap() [v5,05/16] x86/tdx: Add helper to do MapGPA hypercall [v5,06/16] x86/tdx: Make DMA pages shared [v5,07/16] x86/kvm: Use bounce buffers for TD guest [v5,08/16] x86/tdx: ioapic: Add shared bit for IOAPIC base address [v5,09/16] x86/tdx: Enable shared memory confidential guest flags for TDX guest [v5,10/16] PCI: Consolidate pci_iomap_range(), pci_iomap_wc_range() [v5,11/16] asm/io.h: Add ioremap_host_shared fallback [v5,12/16] PCI: Add pci_iomap_host_shared(), pci_iomap_host_shared_range() [v5,13/16] PCI: Mark MSI data shared [v5,14/16] virtio: Use shared mappings for virtio PCI devices [v5,15/16] x86/tdx: Implement ioremap_host_shared for x86 [v5,16/16] x86/tdx: Add cmdline option to force use of ioremap_host_shared

Message ID

20211009003711.1390019-7-sathyanarayanan.kuppuswamy@linux.intel.com (mailing list archive)

State

Not Applicable

Delegated to:

Bjorn Helgaas

Headers

From: Kuppuswamy Sathyanarayanan
        <sathyanarayanan.kuppuswamy@linux.intel.com>
To: Thomas Gleixner <tglx@linutronix.de>,
        Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
        Peter Zijlstra <peterz@infradead.org>,
        Andy Lutomirski <luto@kernel.org>,
        Bjorn Helgaas <bhelgaas@google.com>,
        Richard Henderson <rth@twiddle.net>,
        Thomas Bogendoerfer <tsbogend@alpha.franken.de>,
        James E J Bottomley <James.Bottomley@HansenPartnership.com>,
        Helge Deller <deller@gmx.de>,
        "David S . Miller" <davem@davemloft.net>,
        Arnd Bergmann <arnd@arndb.de>,
        Jonathan Corbet <corbet@lwn.net>,
        "Michael S . Tsirkin" <mst@redhat.com>,
        Paolo Bonzini <pbonzini@redhat.com>,
        David Hildenbrand <david@redhat.com>,
        Andrea Arcangeli <aarcange@redhat.com>,
        Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter H Anvin <hpa@zytor.com>, Dave Hansen <dave.hansen@intel.com>,
        Tony Luck <tony.luck@intel.com>,
        Dan Williams <dan.j.williams@intel.com>,
        Andi Kleen <ak@linux.intel.com>,
        Kirill Shutemov <kirill.shutemov@linux.intel.com>,
        Sean Christopherson <seanjc@google.com>,
        Kuppuswamy Sathyanarayanan <knsathya@kernel.org>,
        Kuppuswamy Sathyanarayanan
        <sathyanarayanan.kuppuswamy@linux.intel.com>, x86@kernel.org,
        linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org,
        linux-alpha@vger.kernel.org, linux-mips@vger.kernel.org,
        linux-parisc@vger.kernel.org, sparclinux@vger.kernel.org,
        linux-arch@vger.kernel.org, linux-doc@vger.kernel.org,
        virtualization@lists.linux-foundation.org
Subject: [PATCH v5 06/16] x86/tdx: Make DMA pages shared
Date: Fri,  8 Oct 2021 17:37:01 -0700
Message-Id: 
 <20211009003711.1390019-7-sathyanarayanan.kuppuswamy@linux.intel.com>
In-Reply-To: 
 <20211009003711.1390019-1-sathyanarayanan.kuppuswamy@linux.intel.com>
References: 
 <20211009003711.1390019-1-sathyanarayanan.kuppuswamy@linux.intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Precedence: bulk

Series

Add TDX Guest Support (shared-mm support) | expand

Commit Message

Kuppuswamy Sathyanarayanan Oct. 9, 2021, 12:37 a.m. UTC

From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>

Just like MKTME, TDX reassigns bits of the physical address for
metadata.  MKTME used several bits for an encryption KeyID. TDX
uses a single bit in guests to communicate whether a physical page
should be protected by TDX as private memory (bit set to 0) or
unprotected and shared with the VMM (bit set to 1).

__set_memory_enc_dec() is now aware about TDX and sets Shared bit
accordingly following with relevant TDX hypercall.

Also, Do TDX_ACCEPT_PAGE on every 4k page after mapping the GPA range
when converting memory to private. Using 4k page size limit is due
to current TDX spec restriction. Also, If the GPA (range) was
already mapped as an active, private page, the host VMM may remove
the private page from the TD by following the “Removing TD Private
Pages” sequence in the Intel TDX-module specification [1] to safely
block the mapping(s), flush the TLB and cache, and remove the
mapping(s).

BUG() if TDX_ACCEPT_PAGE fails (except "previously accepted page" case)
, as the guest is completely hosed if it can't access memory. 

[1] https://software.intel.com/content/dam/develop/external/us/en/documents/tdx-module-1eas-v0.85.039.pdf

Tested-by: Kai Huang <kai.huang@linux.intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
---

Changes since v4:
 * Renamed tdg_accept_page() to tdx_accept_page().
 * Added required comments to tdx_accept_page().
 * Replaced prot_guest_has() to cc_guest_has().

Changes since v3:
 * Rebased on top of Tom Lendacky's protected guest
   changes (https://lore.kernel.org/patchwork/cover/1468760/)
 * Fixed TDX_PAGE_ALREADY_ACCEPTED error code as per latest
   spec update.

Changes since v1:
 * Removed "we" or "I" usages in comment section.
 * Replaced is_tdx_guest() checks with prot_guest_has() checks.

 arch/x86/include/asm/pgtable.h   |  1 +
 arch/x86/kernel/tdx.c            | 45 ++++++++++++++++++++++++++++----
 arch/x86/mm/mem_encrypt_common.c | 11 +++++++-
 arch/x86/mm/pat/set_memory.c     | 45 +++++++++++++++++++++++++++-----
 4 files changed, 89 insertions(+), 13 deletions(-)

Comments

Tom Lendacky Oct. 20, 2021, 4:33 p.m. UTC | #1

On 10/8/21 7:37 PM, Kuppuswamy Sathyanarayanan wrote:
> From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> 
> Just like MKTME, TDX reassigns bits of the physical address for
> metadata.  MKTME used several bits for an encryption KeyID. TDX
> uses a single bit in guests to communicate whether a physical page
> should be protected by TDX as private memory (bit set to 0) or
> unprotected and shared with the VMM (bit set to 1).
> 
> __set_memory_enc_dec() is now aware about TDX and sets Shared bit
> accordingly following with relevant TDX hypercall.
> 
> Also, Do TDX_ACCEPT_PAGE on every 4k page after mapping the GPA range
> when converting memory to private. Using 4k page size limit is due
> to current TDX spec restriction. Also, If the GPA (range) was
> already mapped as an active, private page, the host VMM may remove
> the private page from the TD by following the “Removing TD Private
> Pages” sequence in the Intel TDX-module specification [1] to safely
> block the mapping(s), flush the TLB and cache, and remove the
> mapping(s).
> 
> BUG() if TDX_ACCEPT_PAGE fails (except "previously accepted page" case)
> , as the guest is completely hosed if it can't access memory.
> 
> [1] https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fsoftware.intel.com%2Fcontent%2Fdam%2Fdevelop%2Fexternal%2Fus%2Fen%2Fdocuments%2Ftdx-module-1eas-v0.85.039.pdf&amp;data=04%7C01%7Cthomas.lendacky%40amd.com%7C0e667adf5a4042abce3908d98abd07a8%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637693367201703893%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=UGxQ9xBjWsmev7PetX%2BuS0RChkAXyaH7q6JHO9ZiUtY%3D&amp;reserved=0
> 
> Tested-by: Kai Huang <kai.huang@linux.intel.com>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> Reviewed-by: Andi Kleen <ak@linux.intel.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>

...

> diff --git a/arch/x86/mm/mem_encrypt_common.c b/arch/x86/mm/mem_encrypt_common.c
> index f063c885b0a5..119a9056efbb 100644
> --- a/arch/x86/mm/mem_encrypt_common.c
> +++ b/arch/x86/mm/mem_encrypt_common.c
> @@ -9,9 +9,18 @@
>   
>   #include <asm/mem_encrypt_common.h>
>   #include <linux/dma-mapping.h>
> +#include <linux/cc_platform.h>
>   
>   /* Override for DMA direct allocation check - ARCH_HAS_FORCE_DMA_UNENCRYPTED */
>   bool force_dma_unencrypted(struct device *dev)
>   {
> -	return amd_force_dma_unencrypted(dev);
> +	if (cc_platform_has(CC_ATTR_GUEST_TDX) &&
> +	    cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT))
> +		return true;
> +
> +	if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT) ||
> +	    cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT))
> +		return amd_force_dma_unencrypted(dev);
> +
> +	return false;

Assuming the original force_dma_unencrypted() function was moved here or 
cc_platform.c, then you shouldn't need any changes. Both SEV and TDX 
require true be returned if CC_ATTR_GUEST_MEM_ENCRYPT returns true. And 
then TDX should never return true for CC_ATTR_HOST_MEM_ENCRYPT.

>   }
> diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
> index 527957586f3c..6c531d5cb5fd 100644
> --- a/arch/x86/mm/pat/set_memory.c
> +++ b/arch/x86/mm/pat/set_memory.c
> @@ -30,6 +30,7 @@
>   #include <asm/proto.h>
>   #include <asm/memtype.h>
>   #include <asm/set_memory.h>
> +#include <asm/tdx.h>
>   
>   #include "../mm_internal.h"
>   
> @@ -1981,8 +1982,10 @@ int set_memory_global(unsigned long addr, int numpages)
>   				    __pgprot(_PAGE_GLOBAL), 0);
>   }
>   
> -static int __set_memory_enc_dec(unsigned long addr, int numpages, bool enc)
> +static int __set_memory_protect(unsigned long addr, int numpages, bool protect)
>   {
> +	pgprot_t mem_protected_bits, mem_plain_bits;
> +	enum tdx_map_type map_type;
>   	struct cpa_data cpa;
>   	int ret;
>   
> @@ -1997,8 +2000,25 @@ static int __set_memory_enc_dec(unsigned long addr, int numpages, bool enc)
>   	memset(&cpa, 0, sizeof(cpa));
>   	cpa.vaddr = &addr;
>   	cpa.numpages = numpages;
> -	cpa.mask_set = enc ? __pgprot(_PAGE_ENC) : __pgprot(0);
> -	cpa.mask_clr = enc ? __pgprot(0) : __pgprot(_PAGE_ENC);
> +
> +	if (cc_platform_has(CC_ATTR_GUEST_SHARED_MAPPING_INIT)) {
> +		mem_protected_bits = __pgprot(0);
> +		mem_plain_bits = pgprot_cc_shared_mask();

How about having generic versions for both shared and private that return 
the proper value for SEV or TDX. Then this remains looking similar to as 
it does now, just replacing the __pgprot() calls with the appropriate 
pgprot_cc_{shared,private}_mask().

Thanks,
Tom

> +	} else {
> +		mem_protected_bits = __pgprot(_PAGE_ENC);
> +		mem_plain_bits = __pgprot(0);
> +	}
> +
> +	if (protect) {
> +		cpa.mask_set = mem_protected_bits;
> +		cpa.mask_clr = mem_plain_bits;
> +		map_type = TDX_MAP_PRIVATE;
> +	} else {
> +		cpa.mask_set = mem_plain_bits;
> +		cpa.mask_clr = mem_protected_bits;
> +		map_type = TDX_MAP_SHARED;
> +	}
> +
>   	cpa.pgd = init_mm.pgd;
>   
>   	/* Must avoid aliasing mappings in the highmem code */
> @@ -2006,9 +2026,17 @@ static int __set_memory_enc_dec(unsigned long addr, int numpages, bool enc)
>   	vm_unmap_aliases();
>   
>   	/*
> -	 * Before changing the encryption attribute, we need to flush caches.
> +	 * Before changing the encryption attribute, flush caches.
> +	 *
> +	 * For TDX, guest is responsible for flushing caches on private->shared
> +	 * transition. VMM is responsible for flushing on shared->private.
>   	 */
> -	cpa_flush(&cpa, !this_cpu_has(X86_FEATURE_SME_COHERENT));
> +	if (cc_platform_has(CC_ATTR_GUEST_TDX)) {
> +		if (map_type == TDX_MAP_SHARED)
> +			cpa_flush(&cpa, 1);
> +	} else {
> +		cpa_flush(&cpa, !this_cpu_has(X86_FEATURE_SME_COHERENT));
> +	}
>   
>   	ret = __change_page_attr_set_clr(&cpa, 1);
>   
> @@ -2021,18 +2049,21 @@ static int __set_memory_enc_dec(unsigned long addr, int numpages, bool enc)
>   	 */
>   	cpa_flush(&cpa, 0);
>   
> +	if (!ret && cc_platform_has(CC_ATTR_GUEST_SHARED_MAPPING_INIT))
> +		ret = tdx_hcall_gpa_intent(__pa(addr), numpages, map_type);
> +
>   	return ret;
>   }
>   
>   int set_memory_encrypted(unsigned long addr, int numpages)
>   {
> -	return __set_memory_enc_dec(addr, numpages, true);
> +	return __set_memory_protect(addr, numpages, true);
>   }
>   EXPORT_SYMBOL_GPL(set_memory_encrypted);
>   
>   int set_memory_decrypted(unsigned long addr, int numpages)
>   {
> -	return __set_memory_enc_dec(addr, numpages, false);
> +	return __set_memory_protect(addr, numpages, false);
>   }
>   EXPORT_SYMBOL_GPL(set_memory_decrypted);
>   
>

Kuppuswamy Sathyanarayanan Oct. 20, 2021, 4:45 p.m. UTC | #2

On 10/20/21 9:33 AM, Tom Lendacky wrote:
> On 10/8/21 7:37 PM, Kuppuswamy Sathyanarayanan wrote:
>> From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
>>
>> Just like MKTME, TDX reassigns bits of the physical address for
>> metadata.  MKTME used several bits for an encryption KeyID. TDX
>> uses a single bit in guests to communicate whether a physical page
>> should be protected by TDX as private memory (bit set to 0) or
>> unprotected and shared with the VMM (bit set to 1).
>>
>> __set_memory_enc_dec() is now aware about TDX and sets Shared bit
>> accordingly following with relevant TDX hypercall.
>>
>> Also, Do TDX_ACCEPT_PAGE on every 4k page after mapping the GPA range
>> when converting memory to private. Using 4k page size limit is due
>> to current TDX spec restriction. Also, If the GPA (range) was
>> already mapped as an active, private page, the host VMM may remove
>> the private page from the TD by following the “Removing TD Private
>> Pages” sequence in the Intel TDX-module specification [1] to safely
>> block the mapping(s), flush the TLB and cache, and remove the
>> mapping(s).
>>
>> BUG() if TDX_ACCEPT_PAGE fails (except "previously accepted page" case)
>> , as the guest is completely hosed if it can't access memory.
>>
>> [1] 
>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fsoftware.intel.com%2Fcontent%2Fdam%2Fdevelop%2Fexternal%2Fus%2Fen%2Fdocuments%2Ftdx-module-1eas-v0.85.039.pdf&amp;data=04%7C01%7Cthomas.lendacky%40amd.com%7C0e667adf5a4042abce3908d98abd07a8%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637693367201703893%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=UGxQ9xBjWsmev7PetX%2BuS0RChkAXyaH7q6JHO9ZiUtY%3D&amp;reserved=0 
>>
>>
>> Tested-by: Kai Huang <kai.huang@linux.intel.com>
>> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
>> Reviewed-by: Andi Kleen <ak@linux.intel.com>
>> Reviewed-by: Tony Luck <tony.luck@intel.com>
>> Signed-off-by: Kuppuswamy Sathyanarayanan 
>> <sathyanarayanan.kuppuswamy@linux.intel.com>
> 
> ...
> 
>> diff --git a/arch/x86/mm/mem_encrypt_common.c 
>> b/arch/x86/mm/mem_encrypt_common.c
>> index f063c885b0a5..119a9056efbb 100644
>> --- a/arch/x86/mm/mem_encrypt_common.c
>> +++ b/arch/x86/mm/mem_encrypt_common.c
>> @@ -9,9 +9,18 @@
>>   #include <asm/mem_encrypt_common.h>
>>   #include <linux/dma-mapping.h>
>> +#include <linux/cc_platform.h>
>>   /* Override for DMA direct allocation check - 
>> ARCH_HAS_FORCE_DMA_UNENCRYPTED */
>>   bool force_dma_unencrypted(struct device *dev)
>>   {
>> -    return amd_force_dma_unencrypted(dev);
>> +    if (cc_platform_has(CC_ATTR_GUEST_TDX) &&
>> +        cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT))
>> +        return true;
>> +
>> +    if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT) ||
>> +        cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT))
>> +        return amd_force_dma_unencrypted(dev);
>> +
>> +    return false;
> 
> Assuming the original force_dma_unencrypted() function was moved here or 
> cc_platform.c, then you shouldn't need any changes. Both SEV and TDX 
> require true be returned if CC_ATTR_GUEST_MEM_ENCRYPT returns true. And 
> then TDX should never return true for CC_ATTR_HOST_MEM_ENCRYPT.


For non TDX case, in CC_ATTR_HOST_MEM_ENCRYPT, we should still call
amd_force_dma_unencrypted() right?

> 
>>   }
>> diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
>> index 527957586f3c..6c531d5cb5fd 100644
>> --- a/arch/x86/mm/pat/set_memory.c
>> +++ b/arch/x86/mm/pat/set_memory.c
>> @@ -30,6 +30,7 @@
>>   #include <asm/proto.h>
>>   #include <asm/memtype.h>
>>   #include <asm/set_memory.h>
>> +#include <asm/tdx.h>
>>   #include "../mm_internal.h"
>> @@ -1981,8 +1982,10 @@ int set_memory_global(unsigned long addr, int 
>> numpages)
>>                       __pgprot(_PAGE_GLOBAL), 0);
>>   }
>> -static int __set_memory_enc_dec(unsigned long addr, int numpages, 
>> bool enc)
>> +static int __set_memory_protect(unsigned long addr, int numpages, 
>> bool protect)
>>   {
>> +    pgprot_t mem_protected_bits, mem_plain_bits;
>> +    enum tdx_map_type map_type;
>>       struct cpa_data cpa;
>>       int ret;
>> @@ -1997,8 +2000,25 @@ static int __set_memory_enc_dec(unsigned long 
>> addr, int numpages, bool enc)
>>       memset(&cpa, 0, sizeof(cpa));
>>       cpa.vaddr = &addr;
>>       cpa.numpages = numpages;
>> -    cpa.mask_set = enc ? __pgprot(_PAGE_ENC) : __pgprot(0);
>> -    cpa.mask_clr = enc ? __pgprot(0) : __pgprot(_PAGE_ENC);
>> +
>> +    if (cc_platform_has(CC_ATTR_GUEST_SHARED_MAPPING_INIT)) {
>> +        mem_protected_bits = __pgprot(0);
>> +        mem_plain_bits = pgprot_cc_shared_mask();
> 
> How about having generic versions for both shared and private that 
> return the proper value for SEV or TDX. Then this remains looking 
> similar to as it does now, just replacing the __pgprot() calls with the 
> appropriate pgprot_cc_{shared,private}_mask().

Makes sense.

> 
> Thanks,
> Tom
> 
>> +    } else {
>> +        mem_protected_bits = __pgprot(_PAGE_ENC);
>> +        mem_plain_bits = __pgprot(0);
>> +    }
>> +
>> +    if (protect) {
>> +        cpa.mask_set = mem_protected_bits;
>> +        cpa.mask_clr = mem_plain_bits;
>> +        map_type = TDX_MAP_PRIVATE;
>> +    } else {
>> +        cpa.mask_set = mem_plain_bits;
>> +        cpa.mask_clr = mem_protected_bits;
>> +        map_type = TDX_MAP_SHARED;
>> +    }
>> +
>>       cpa.pgd = init_mm.pgd;
>>       /* Must avoid aliasing mappings in the highmem code */
>> @@ -2006,9 +2026,17 @@ static int __set_memory_enc_dec(unsigned long 
>> addr, int numpages, bool enc)
>>       vm_unmap_aliases();
>>       /*
>> -     * Before changing the encryption attribute, we need to flush 
>> caches.
>> +     * Before changing the encryption attribute, flush caches.
>> +     *
>> +     * For TDX, guest is responsible for flushing caches on 
>> private->shared
>> +     * transition. VMM is responsible for flushing on shared->private.
>>        */
>> -    cpa_flush(&cpa, !this_cpu_has(X86_FEATURE_SME_COHERENT));
>> +    if (cc_platform_has(CC_ATTR_GUEST_TDX)) {
>> +        if (map_type == TDX_MAP_SHARED)
>> +            cpa_flush(&cpa, 1);
>> +    } else {
>> +        cpa_flush(&cpa, !this_cpu_has(X86_FEATURE_SME_COHERENT));
>> +    }
>>       ret = __change_page_attr_set_clr(&cpa, 1);
>> @@ -2021,18 +2049,21 @@ static int __set_memory_enc_dec(unsigned long 
>> addr, int numpages, bool enc)
>>        */
>>       cpa_flush(&cpa, 0);
>> +    if (!ret && cc_platform_has(CC_ATTR_GUEST_SHARED_MAPPING_INIT))
>> +        ret = tdx_hcall_gpa_intent(__pa(addr), numpages, map_type);
>> +
>>       return ret;
>>   }
>>   int set_memory_encrypted(unsigned long addr, int numpages)
>>   {
>> -    return __set_memory_enc_dec(addr, numpages, true);
>> +    return __set_memory_protect(addr, numpages, true);
>>   }
>>   EXPORT_SYMBOL_GPL(set_memory_encrypted);
>>   int set_memory_decrypted(unsigned long addr, int numpages)
>>   {
>> -    return __set_memory_enc_dec(addr, numpages, false);
>> +    return __set_memory_protect(addr, numpages, false);
>>   }
>>   EXPORT_SYMBOL_GPL(set_memory_decrypted);
>>

Tom Lendacky Oct. 20, 2021, 5:22 p.m. UTC | #3

On 10/20/21 11:45 AM, Sathyanarayanan Kuppuswamy wrote:
> On 10/20/21 9:33 AM, Tom Lendacky wrote:
>> On 10/8/21 7:37 PM, Kuppuswamy Sathyanarayanan wrote:

...

>>>   bool force_dma_unencrypted(struct device *dev)
>>>   {
>>> -    return amd_force_dma_unencrypted(dev);
>>> +    if (cc_platform_has(CC_ATTR_GUEST_TDX) &&
>>> +        cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT))
>>> +        return true;
>>> +
>>> +    if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT) ||
>>> +        cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT))
>>> +        return amd_force_dma_unencrypted(dev);
>>> +
>>> +    return false;
>>
>> Assuming the original force_dma_unencrypted() function was moved here or 
>> cc_platform.c, then you shouldn't need any changes. Both SEV and TDX 
>> require true be returned if CC_ATTR_GUEST_MEM_ENCRYPT returns true. And 
>> then TDX should never return true for CC_ATTR_HOST_MEM_ENCRYPT.
> 
> 
> For non TDX case, in CC_ATTR_HOST_MEM_ENCRYPT, we should still call
> amd_force_dma_unencrypted() right?

What I'm saying is that you wouldn't have amd_force_dma_unencrypted(). I 
think the whole force_dma_unencrypted() can exist as-is in a different 
file, whether that's cc_platform.c or mem_encrypt_common.c.

It will return true for an SEV or TDX guest, true for an SME host based on 
the DMA mask or else false. That should work just fine for TDX.

Thanks,
Tom

> 
>>
>>>   }
>>> diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
>>> index 527957586f3c..6c531d5cb5fd 100644
>>> --- a/arch/x86/mm/pat/set_memory.c
>>> +++ b/arch/x86/mm/pat/set_memory.c
>>> @@ -30,6 +30,7 @@
>>>   #include <asm/proto.h>
>>>   #include <asm/memtype.h>
>>>   #include <asm/set_memory.h>
>>> +#include <asm/tdx.h>
>>>   #include "../mm_internal.h"
>>> @@ -1981,8 +1982,10 @@ int set_memory_global(unsigned long addr, int 
>>> numpages)
>>>                       __pgprot(_PAGE_GLOBAL), 0);
>>>   }
>>> -static int __set_memory_enc_dec(unsigned long addr, int numpages, bool 
>>> enc)
>>> +static int __set_memory_protect(unsigned long addr, int numpages, bool 
>>> protect)
>>>   {
>>> +    pgprot_t mem_protected_bits, mem_plain_bits;
>>> +    enum tdx_map_type map_type;
>>>       struct cpa_data cpa;
>>>       int ret;
>>> @@ -1997,8 +2000,25 @@ static int __set_memory_enc_dec(unsigned long 
>>> addr, int numpages, bool enc)
>>>       memset(&cpa, 0, sizeof(cpa));
>>>       cpa.vaddr = &addr;
>>>       cpa.numpages = numpages;
>>> -    cpa.mask_set = enc ? __pgprot(_PAGE_ENC) : __pgprot(0);
>>> -    cpa.mask_clr = enc ? __pgprot(0) : __pgprot(_PAGE_ENC);
>>> +
>>> +    if (cc_platform_has(CC_ATTR_GUEST_SHARED_MAPPING_INIT)) {
>>> +        mem_protected_bits = __pgprot(0);
>>> +        mem_plain_bits = pgprot_cc_shared_mask();
>>
>> How about having generic versions for both shared and private that 
>> return the proper value for SEV or TDX. Then this remains looking 
>> similar to as it does now, just replacing the __pgprot() calls with the 
>> appropriate pgprot_cc_{shared,private}_mask().
> 
> Makes sense.
> 
>>
>> Thanks,
>> Tom
>>
>>> +    } else {
>>> +        mem_protected_bits = __pgprot(_PAGE_ENC);
>>> +        mem_plain_bits = __pgprot(0);
>>> +    }
>>> +
>>> +    if (protect) {
>>> +        cpa.mask_set = mem_protected_bits;
>>> +        cpa.mask_clr = mem_plain_bits;
>>> +        map_type = TDX_MAP_PRIVATE;
>>> +    } else {
>>> +        cpa.mask_set = mem_plain_bits;
>>> +        cpa.mask_clr = mem_protected_bits;
>>> +        map_type = TDX_MAP_SHARED;
>>> +    }
>>> +
>>>       cpa.pgd = init_mm.pgd;
>>>       /* Must avoid aliasing mappings in the highmem code */
>>> @@ -2006,9 +2026,17 @@ static int __set_memory_enc_dec(unsigned long 
>>> addr, int numpages, bool enc)
>>>       vm_unmap_aliases();
>>>       /*
>>> -     * Before changing the encryption attribute, we need to flush caches.
>>> +     * Before changing the encryption attribute, flush caches.
>>> +     *
>>> +     * For TDX, guest is responsible for flushing caches on 
>>> private->shared
>>> +     * transition. VMM is responsible for flushing on shared->private.
>>>        */
>>> -    cpa_flush(&cpa, !this_cpu_has(X86_FEATURE_SME_COHERENT));
>>> +    if (cc_platform_has(CC_ATTR_GUEST_TDX)) {
>>> +        if (map_type == TDX_MAP_SHARED)
>>> +            cpa_flush(&cpa, 1);
>>> +    } else {
>>> +        cpa_flush(&cpa, !this_cpu_has(X86_FEATURE_SME_COHERENT));
>>> +    }
>>>       ret = __change_page_attr_set_clr(&cpa, 1);
>>> @@ -2021,18 +2049,21 @@ static int __set_memory_enc_dec(unsigned long 
>>> addr, int numpages, bool enc)
>>>        */
>>>       cpa_flush(&cpa, 0);
>>> +    if (!ret && cc_platform_has(CC_ATTR_GUEST_SHARED_MAPPING_INIT))
>>> +        ret = tdx_hcall_gpa_intent(__pa(addr), numpages, map_type);
>>> +
>>>       return ret;
>>>   }
>>>   int set_memory_encrypted(unsigned long addr, int numpages)
>>>   {
>>> -    return __set_memory_enc_dec(addr, numpages, true);
>>> +    return __set_memory_protect(addr, numpages, true);
>>>   }
>>>   EXPORT_SYMBOL_GPL(set_memory_encrypted);
>>>   int set_memory_decrypted(unsigned long addr, int numpages)
>>>   {
>>> -    return __set_memory_enc_dec(addr, numpages, false);
>>> +    return __set_memory_protect(addr, numpages, false);
>>>   }
>>>   EXPORT_SYMBOL_GPL(set_memory_decrypted);
>>>
>

Kuppuswamy Sathyanarayanan Oct. 20, 2021, 5:26 p.m. UTC | #4

On 10/20/21 10:22 AM, Tom Lendacky wrote:
>>
>> For non TDX case, in CC_ATTR_HOST_MEM_ENCRYPT, we should still call
>> amd_force_dma_unencrypted() right?
> 
> What I'm saying is that you wouldn't have amd_force_dma_unencrypted(). I 
> think the whole force_dma_unencrypted() can exist as-is in a different 
> file, whether that's cc_platform.c or mem_encrypt_common.c.
> 
> It will return true for an SEV or TDX guest, true for an SME host based 
> on the DMA mask or else false. That should work just fine for TDX.

Got it. Thanks for clarifying it.

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index ecefccbdf2e3..2de4d6e34b84 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -24,6 +24,7 @@ 
 /* Make the page accesable by VMM for confidential guests */
 #define pgprot_cc_guest(prot) __pgprot(pgprot_val(prot) |	\
 					      tdx_shared_mask())
+#define pgprot_cc_shared_mask() __pgprot(tdx_shared_mask())
 
 #ifndef __ASSEMBLY__
 #include <asm/x86_init.h>
diff --git a/arch/x86/kernel/tdx.c b/arch/x86/kernel/tdx.c
index c3e4cc5d631b..433f366ca25c 100644
--- a/arch/x86/kernel/tdx.c
+++ b/arch/x86/kernel/tdx.c
@@ -16,10 +16,16 @@ 
 /* TDX Module call Leaf IDs */
 #define TDX_GET_INFO			1
 #define TDX_GET_VEINFO			3
+#define TDX_ACCEPT_PAGE			6
 
 /* TDX hypercall Leaf IDs */
 #define TDVMCALL_MAP_GPA		0x10001
 
+/* TDX Module call error codes */
+#define TDX_PAGE_ALREADY_ACCEPTED	0x00000b0a00000000
+#define TDCALL_RETURN_CODE_MASK		0xFFFFFFFF00000000
+#define TDCALL_RETURN_CODE(a)		((a) & TDCALL_RETURN_CODE_MASK)
+
 #define VE_IS_IO_OUT(exit_qual)		(((exit_qual) & 8) ? 0 : 1)
 #define VE_GET_IO_SIZE(exit_qual)	(((exit_qual) & 7) + 1)
 #define VE_GET_PORT_NUM(exit_qual)	((exit_qual) >> 16)
@@ -108,18 +114,35 @@  static void tdx_get_info(void)
 	physical_mask &= ~tdx_shared_mask();
 }
 
+static void tdx_accept_page(phys_addr_t gpa)
+{
+	u64 ret;
+
+	/*
+	 * Pass the page physical address and size (0-4KB) to the
+	 * TDX module to accept the pending, private page. More info
+	 * about ABI can be found in TDX Guest-Host-Communication
+	 * Interface (GHCI), sec 2.4.7.
+	 */
+	ret = __tdx_module_call(TDX_ACCEPT_PAGE, gpa, 0, 0, 0, NULL);
+
+	/*
+	 * Non zero return value means buggy TDX module (which is
+	 * fatal for TDX guest). So panic here.
+	 */
+	BUG_ON(ret && TDCALL_RETURN_CODE(ret) != TDX_PAGE_ALREADY_ACCEPTED);
+}
+
 /*
  * Inform the VMM of the guest's intent for this physical page:
  * shared with the VMM or private to the guest.  The VMM is
  * expected to change its mapping of the page in response.
- *
- * Note: shared->private conversions require further guest
- * action to accept the page.
  */
 int tdx_hcall_gpa_intent(phys_addr_t gpa, int numpages,
 			 enum tdx_map_type map_type)
 {
-	u64 ret;
+	u64 ret = 0;
+	int i;
 
 	if (map_type == TDX_MAP_SHARED)
 		gpa |= tdx_shared_mask();
@@ -131,8 +154,20 @@  int tdx_hcall_gpa_intent(phys_addr_t gpa, int numpages,
 	 */
 	ret = _tdx_hypercall(TDVMCALL_MAP_GPA, gpa, PAGE_SIZE * numpages, 0, 0,
 			     NULL);
+	if (ret)
+		ret = -EIO;
+
+	if (ret || map_type == TDX_MAP_SHARED)
+		return ret;
+
+	/*
+	 * For shared->private conversion, accept the page using
+	 * TDX_ACCEPT_PAGE TDX module call.
+	 */
+	for (i = 0; i < numpages; i++)
+		tdx_accept_page(gpa + i * PAGE_SIZE);
 
-	return ret ? -EIO : 0;
+	return 0;
 }
 
 static __cpuidle void _tdx_halt(const bool irq_disabled, const bool do_sti)
diff --git a/arch/x86/mm/mem_encrypt_common.c b/arch/x86/mm/mem_encrypt_common.c
index f063c885b0a5..119a9056efbb 100644
--- a/arch/x86/mm/mem_encrypt_common.c
+++ b/arch/x86/mm/mem_encrypt_common.c
@@ -9,9 +9,18 @@ 
 
 #include <asm/mem_encrypt_common.h>
 #include <linux/dma-mapping.h>
+#include <linux/cc_platform.h>
 
 /* Override for DMA direct allocation check - ARCH_HAS_FORCE_DMA_UNENCRYPTED */
 bool force_dma_unencrypted(struct device *dev)
 {
-	return amd_force_dma_unencrypted(dev);
+	if (cc_platform_has(CC_ATTR_GUEST_TDX) &&
+	    cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT))
+		return true;
+
+	if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT) ||
+	    cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT))
+		return amd_force_dma_unencrypted(dev);
+
+	return false;
 }
diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index 527957586f3c..6c531d5cb5fd 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -30,6 +30,7 @@ 
 #include <asm/proto.h>
 #include <asm/memtype.h>
 #include <asm/set_memory.h>
+#include <asm/tdx.h>
 
 #include "../mm_internal.h"
 
@@ -1981,8 +1982,10 @@  int set_memory_global(unsigned long addr, int numpages)
 				    __pgprot(_PAGE_GLOBAL), 0);
 }
 
-static int __set_memory_enc_dec(unsigned long addr, int numpages, bool enc)
+static int __set_memory_protect(unsigned long addr, int numpages, bool protect)
 {
+	pgprot_t mem_protected_bits, mem_plain_bits;
+	enum tdx_map_type map_type;
 	struct cpa_data cpa;
 	int ret;
 
@@ -1997,8 +2000,25 @@  static int __set_memory_enc_dec(unsigned long addr, int numpages, bool enc)
 	memset(&cpa, 0, sizeof(cpa));
 	cpa.vaddr = &addr;
 	cpa.numpages = numpages;
-	cpa.mask_set = enc ? __pgprot(_PAGE_ENC) : __pgprot(0);
-	cpa.mask_clr = enc ? __pgprot(0) : __pgprot(_PAGE_ENC);
+
+	if (cc_platform_has(CC_ATTR_GUEST_SHARED_MAPPING_INIT)) {
+		mem_protected_bits = __pgprot(0);
+		mem_plain_bits = pgprot_cc_shared_mask();
+	} else {
+		mem_protected_bits = __pgprot(_PAGE_ENC);
+		mem_plain_bits = __pgprot(0);
+	}
+
+	if (protect) {
+		cpa.mask_set = mem_protected_bits;
+		cpa.mask_clr = mem_plain_bits;
+		map_type = TDX_MAP_PRIVATE;
+	} else {
+		cpa.mask_set = mem_plain_bits;
+		cpa.mask_clr = mem_protected_bits;
+		map_type = TDX_MAP_SHARED;
+	}
+
 	cpa.pgd = init_mm.pgd;
 
 	/* Must avoid aliasing mappings in the highmem code */
@@ -2006,9 +2026,17 @@  static int __set_memory_enc_dec(unsigned long addr, int numpages, bool enc)
 	vm_unmap_aliases();
 
 	/*
-	 * Before changing the encryption attribute, we need to flush caches.
+	 * Before changing the encryption attribute, flush caches.
+	 *
+	 * For TDX, guest is responsible for flushing caches on private->shared
+	 * transition. VMM is responsible for flushing on shared->private.
 	 */
-	cpa_flush(&cpa, !this_cpu_has(X86_FEATURE_SME_COHERENT));
+	if (cc_platform_has(CC_ATTR_GUEST_TDX)) {
+		if (map_type == TDX_MAP_SHARED)
+			cpa_flush(&cpa, 1);
+	} else {
+		cpa_flush(&cpa, !this_cpu_has(X86_FEATURE_SME_COHERENT));
+	}
 
 	ret = __change_page_attr_set_clr(&cpa, 1);
 
@@ -2021,18 +2049,21 @@  static int __set_memory_enc_dec(unsigned long addr, int numpages, bool enc)
 	 */
 	cpa_flush(&cpa, 0);
 
+	if (!ret && cc_platform_has(CC_ATTR_GUEST_SHARED_MAPPING_INIT))
+		ret = tdx_hcall_gpa_intent(__pa(addr), numpages, map_type);
+
 	return ret;
 }
 
 int set_memory_encrypted(unsigned long addr, int numpages)
 {
-	return __set_memory_enc_dec(addr, numpages, true);
+	return __set_memory_protect(addr, numpages, true);
 }
 EXPORT_SYMBOL_GPL(set_memory_encrypted);
 
 int set_memory_decrypted(unsigned long addr, int numpages)
 {
-	return __set_memory_enc_dec(addr, numpages, false);
+	return __set_memory_protect(addr, numpages, false);
 }
 EXPORT_SYMBOL_GPL(set_memory_decrypted);

[v5,06/16] x86/tdx: Make DMA pages shared

Commit Message

Comments

Patch