From patchwork Fri Jul 26 15:21:45 2024
X-Patchwork-Submitter: Roger Pau Monné
X-Patchwork-Id: 13742896
From: Roger Pau Monne
To: xen-devel@lists.xenproject.org
Cc: alejandro.vallejo@cloud.com, Roger Pau Monne, Jan Beulich, Andrew Cooper
Subject: [PATCH 01/22] x86/mm: drop l{1,2,3,4}e_write_atomic()
Date: Fri, 26 Jul 2024 17:21:45 +0200
Message-ID: <20240726152206.28411-2-roger.pau@citrix.com>
In-Reply-To: <20240726152206.28411-1-roger.pau@citrix.com>

The l{1,2,3,4}e_write_atomic() and non _atomic suffixed helpers share the
same implementation, so it seems pointless and possibly confusing to have
both.

Remove the l{1,2,3,4}e_write_atomic() helpers and switch their users to
l{1,2,3,4}e_write(), as that's also atomic.  While there, also remove
pte_write{,_atomic}() and just use write_atomic() in the wrappers.

No functional change intended.

Signed-off-by: Roger Pau Monné
Reviewed-by: Jan Beulich
---
 xen/arch/x86/include/asm/page.h        | 21 +++-----------
 xen/arch/x86/include/asm/x86_64/page.h |  2 --
 xen/arch/x86/mm.c                      | 39 +++++++++++---------------
 3 files changed, 20 insertions(+), 42 deletions(-)

diff --git a/xen/arch/x86/include/asm/page.h b/xen/arch/x86/include/asm/page.h
index 350d1fb1100f..3d20ee507a33 100644
--- a/xen/arch/x86/include/asm/page.h
+++ b/xen/arch/x86/include/asm/page.h
@@ -26,27 +26,14 @@
     l4e_from_intpte(pte_read_atomic(&l4e_get_intpte(*(l4ep))))
 
 /* Write a pte atomically to memory. */
-#define l1e_write_atomic(l1ep, l1e) \
-    pte_write_atomic(&l1e_get_intpte(*(l1ep)), l1e_get_intpte(l1e))
-#define l2e_write_atomic(l2ep, l2e) \
-    pte_write_atomic(&l2e_get_intpte(*(l2ep)), l2e_get_intpte(l2e))
-#define l3e_write_atomic(l3ep, l3e) \
-    pte_write_atomic(&l3e_get_intpte(*(l3ep)), l3e_get_intpte(l3e))
-#define l4e_write_atomic(l4ep, l4e) \
-    pte_write_atomic(&l4e_get_intpte(*(l4ep)), l4e_get_intpte(l4e))
-
-/*
- * Write a pte safely but non-atomically to memory.
- * The PTE may become temporarily not-present during the update.
- */
 #define l1e_write(l1ep, l1e) \
-    pte_write(&l1e_get_intpte(*(l1ep)), l1e_get_intpte(l1e))
+    write_atomic(&l1e_get_intpte(*(l1ep)), l1e_get_intpte(l1e))
 #define l2e_write(l2ep, l2e) \
-    pte_write(&l2e_get_intpte(*(l2ep)), l2e_get_intpte(l2e))
+    write_atomic(&l2e_get_intpte(*(l2ep)), l2e_get_intpte(l2e))
 #define l3e_write(l3ep, l3e) \
-    pte_write(&l3e_get_intpte(*(l3ep)), l3e_get_intpte(l3e))
+    write_atomic(&l3e_get_intpte(*(l3ep)), l3e_get_intpte(l3e))
 #define l4e_write(l4ep, l4e) \
-    pte_write(&l4e_get_intpte(*(l4ep)), l4e_get_intpte(l4e))
+    write_atomic(&l4e_get_intpte(*(l4ep)), l4e_get_intpte(l4e))
 
 /* Get direct integer representation of a pte's contents (intpte_t). */
 #define l1e_get_intpte(x) ((x).l1)
diff --git a/xen/arch/x86/include/asm/x86_64/page.h b/xen/arch/x86/include/asm/x86_64/page.h
index 19ca64d79223..03fcce61c052 100644
--- a/xen/arch/x86/include/asm/x86_64/page.h
+++ b/xen/arch/x86/include/asm/x86_64/page.h
@@ -70,8 +70,6 @@ typedef l4_pgentry_t root_pgentry_t;
 #endif /* !__ASSEMBLY__ */
 
 #define pte_read_atomic(ptep) read_atomic(ptep)
-#define pte_write_atomic(ptep, pte) write_atomic(ptep, pte)
-#define pte_write(ptep, pte) write_atomic(ptep, pte)
 
 /* Given a virtual address, get an entry offset into a linear page table. */
 #define l1_linear_offset(_a) (((_a) & VADDR_MASK) >> L1_PAGETABLE_SHIFT)
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 95795567f2a5..fab2de5fae27 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -5253,7 +5253,7 @@ int map_pages_to_xen(
              !(flags & (_PAGE_PAT | MAP_SMALL_PAGES)) )
         {
             /* 1GB-page mapping. */
-            l3e_write_atomic(pl3e, l3e_from_mfn(mfn, l1f_to_lNf(flags)));
+            l3e_write(pl3e, l3e_from_mfn(mfn, l1f_to_lNf(flags)));
 
             if ( (l3e_get_flags(ol3e) & _PAGE_PRESENT) )
             {
@@ -5353,8 +5353,7 @@ int map_pages_to_xen(
             if ( (l3e_get_flags(*pl3e) & _PAGE_PRESENT) &&
                  (l3e_get_flags(*pl3e) & _PAGE_PSE) )
             {
-                l3e_write_atomic(pl3e,
-                                 l3e_from_mfn(l2mfn, __PAGE_HYPERVISOR));
+                l3e_write(pl3e, l3e_from_mfn(l2mfn, __PAGE_HYPERVISOR));
                 l2mfn = INVALID_MFN;
             }
             if ( locking )
@@ -5375,7 +5374,7 @@ int map_pages_to_xen(
         {
             /* Super-page mapping. */
             ol2e = *pl2e;
-            l2e_write_atomic(pl2e, l2e_from_mfn(mfn, l1f_to_lNf(flags)));
+            l2e_write(pl2e, l2e_from_mfn(mfn, l1f_to_lNf(flags)));
 
             if ( (l2e_get_flags(ol2e) & _PAGE_PRESENT) )
             {
@@ -5457,8 +5456,7 @@ int map_pages_to_xen(
                 if ( (l2e_get_flags(*pl2e) & _PAGE_PRESENT) &&
                      (l2e_get_flags(*pl2e) & _PAGE_PSE) )
                 {
-                    l2e_write_atomic(pl2e, l2e_from_mfn(l1mfn,
-                                                        __PAGE_HYPERVISOR));
+                    l2e_write(pl2e, l2e_from_mfn(l1mfn, __PAGE_HYPERVISOR));
                     l1mfn = INVALID_MFN;
                 }
                 if ( locking )
@@ -5471,7 +5469,7 @@ int map_pages_to_xen(
             if ( !pl1e )
                 pl1e = map_l1t_from_l2e(*pl2e) + l1_table_offset(virt);
             ol1e = *pl1e;
-            l1e_write_atomic(pl1e, l1e_from_mfn(mfn, flags));
+            l1e_write(pl1e, l1e_from_mfn(mfn, flags));
             UNMAP_DOMAIN_PAGE(pl1e);
             if ( (l1e_get_flags(ol1e) & _PAGE_PRESENT) )
             {
@@ -5524,8 +5522,7 @@ int map_pages_to_xen(
                 UNMAP_DOMAIN_PAGE(l1t);
                 if ( i == L1_PAGETABLE_ENTRIES )
                 {
-                    l2e_write_atomic(pl2e, l2e_from_pfn(base_mfn,
-                                                        l1f_to_lNf(flags)));
+                    l2e_write(pl2e, l2e_from_pfn(base_mfn, l1f_to_lNf(flags)));
                     if ( locking )
                         spin_unlock(&map_pgdir_lock);
                     flush_area(virt - PAGE_SIZE,
@@ -5574,8 +5571,7 @@ int map_pages_to_xen(
                 UNMAP_DOMAIN_PAGE(l2t);
                 if ( i == L2_PAGETABLE_ENTRIES )
                 {
-                    l3e_write_atomic(pl3e, l3e_from_pfn(base_mfn,
-                                                        l1f_to_lNf(flags)));
+                    l3e_write(pl3e, l3e_from_pfn(base_mfn, l1f_to_lNf(flags)));
                     if ( locking )
                         spin_unlock(&map_pgdir_lock);
                     flush_area(virt - PAGE_SIZE,
@@ -5674,7 +5670,7 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
                   : l3e_from_pfn(l3e_get_pfn(*pl3e),
                                  (l3e_get_flags(*pl3e) & ~FLAGS_MASK) | nf);
 
-            l3e_write_atomic(pl3e, nl3e);
+            l3e_write(pl3e, nl3e);
             v += 1UL << L3_PAGETABLE_SHIFT;
             continue;
         }
@@ -5696,8 +5692,7 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
             if ( (l3e_get_flags(*pl3e) & _PAGE_PRESENT) &&
                  (l3e_get_flags(*pl3e) & _PAGE_PSE) )
             {
-                l3e_write_atomic(pl3e,
-                                 l3e_from_mfn(l2mfn, __PAGE_HYPERVISOR));
+                l3e_write(pl3e, l3e_from_mfn(l2mfn, __PAGE_HYPERVISOR));
                 l2mfn = INVALID_MFN;
             }
             if ( locking )
@@ -5732,7 +5727,7 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
                   : l2e_from_pfn(l2e_get_pfn(*pl2e),
                                  (l2e_get_flags(*pl2e) & ~FLAGS_MASK) | nf);
 
-            l2e_write_atomic(pl2e, nl2e);
+            l2e_write(pl2e, nl2e);
             v += 1UL << L2_PAGETABLE_SHIFT;
         }
         else
@@ -5755,8 +5750,7 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
                 if ( (l2e_get_flags(*pl2e) & _PAGE_PRESENT) &&
                      (l2e_get_flags(*pl2e) & _PAGE_PSE) )
                 {
-                    l2e_write_atomic(pl2e, l2e_from_mfn(l1mfn,
-                                                        __PAGE_HYPERVISOR));
+                    l2e_write(pl2e, l2e_from_mfn(l1mfn, __PAGE_HYPERVISOR));
                     l1mfn = INVALID_MFN;
                 }
                 if ( locking )
@@ -5785,7 +5779,7 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
                   : l1e_from_pfn(l1e_get_pfn(*pl1e),
                                  (l1e_get_flags(*pl1e) & ~FLAGS_MASK) | nf);
 
-            l1e_write_atomic(pl1e, nl1e);
+            l1e_write(pl1e, nl1e);
             UNMAP_DOMAIN_PAGE(pl1e);
 
             v += PAGE_SIZE;
@@ -5824,7 +5818,7 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
                 if ( i == L1_PAGETABLE_ENTRIES )
                 {
                     /* Empty: zap the L2E and free the L1 page. */
-                    l2e_write_atomic(pl2e, l2e_empty());
+                    l2e_write(pl2e, l2e_empty());
                     if ( locking )
                         spin_unlock(&map_pgdir_lock);
                     flush_area(NULL, FLUSH_TLB_GLOBAL); /* flush before free */
@@ -5868,7 +5862,7 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
                 if ( i == L2_PAGETABLE_ENTRIES )
                 {
                     /* Empty: zap the L3E and free the L2 page. */
-                    l3e_write_atomic(pl3e, l3e_empty());
+                    l3e_write(pl3e, l3e_empty());
                     if ( locking )
                         spin_unlock(&map_pgdir_lock);
                     flush_area(NULL, FLUSH_TLB_GLOBAL); /* flush before free */
@@ -5940,7 +5934,7 @@ void init_or_livepatch modify_xen_mappings_lite(
         {
             ASSERT(IS_ALIGNED(v, 1UL << L2_PAGETABLE_SHIFT));
 
-            l2e_write_atomic(pl2e, l2e_from_intpte((l2e.l2 & ~fm) | flags));
+            l2e_write(pl2e, l2e_from_intpte((l2e.l2 & ~fm) | flags));
 
             v += 1UL << L2_PAGETABLE_SHIFT;
             continue;
@@ -5958,8 +5952,7 @@ void init_or_livepatch modify_xen_mappings_lite(
 
                 ASSERT(l1f & _PAGE_PRESENT);
 
-                l1e_write_atomic(pl1e,
-                                 l1e_from_intpte((l1e.l1 & ~fm) | flags));
+                l1e_write(pl1e, l1e_from_intpte((l1e.l1 & ~fm) | flags));
 
                 v += 1UL << L1_PAGETABLE_SHIFT;
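Both helper families compiled to the same single atomic store, which is what
makes the merge safe.  A rough sketch of the resulting shape (standalone C,
not Xen code; write_atomic_sketch is a hypothetical stand-in for Xen's
write_atomic() on x86-64):

    #include <stdint.h>

    typedef uint64_t intpte_t;
    typedef struct { intpte_t l1; } l1_pgentry_t;

    /*
     * Hypothetical stand-in for write_atomic(): a single aligned 8-byte
     * store, which a concurrent hardware page walk can never observe
     * half-written.
     */
    static inline void write_atomic_sketch(volatile intpte_t *ptep, intpte_t pte)
    {
        __atomic_store_n(ptep, pte, __ATOMIC_RELAXED);
    }

    /* Post-patch shape: l1e_write() expands straight to the atomic store. */
    #define l1e_write_sketch(l1ep, l1e) \
        write_atomic_sketch(&(l1ep)->l1, (l1e).l1)

Since the removed pte_write() was also defined as write_atomic(), no caller
could have relied on the "may become temporarily not-present" behaviour that
the deleted comment described.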
From patchwork Fri Jul 26 15:21:46 2024
X-Patchwork-Submitter: Roger Pau Monné
X-Patchwork-Id: 13742899
From: Roger Pau Monne
To: xen-devel@lists.xenproject.org
Cc: alejandro.vallejo@cloud.com, Roger Pau Monne, Jan Beulich, Andrew Cooper
Subject: [PATCH 02/22] x86/mm: rename l{1,2,3,4}e_read_atomic()
Date: Fri, 26 Jul 2024 17:21:46 +0200
Message-ID: <20240726152206.28411-3-roger.pau@citrix.com>
In-Reply-To: <20240726152206.28411-1-roger.pau@citrix.com>

There's no l{1,2,3,4}e_read() implementation, so drop the _atomic suffix
from the read helpers.  This unifies the naming with the write helpers,
which are also atomic but already lack the suffix: l{1,2,3,4}e_write().

No functional change intended.

Signed-off-by: Roger Pau Monné
Reviewed-by: Jan Beulich
---
 xen/arch/x86/include/asm/page.h        | 16 ++++++++--------
 xen/arch/x86/include/asm/x86_64/page.h |  2 --
 xen/arch/x86/mm.c                      | 12 ++++++------
 xen/arch/x86/traps.c                   |  8 ++++----
 4 files changed, 18 insertions(+), 20 deletions(-)

diff --git a/xen/arch/x86/include/asm/page.h b/xen/arch/x86/include/asm/page.h
index 3d20ee507a33..e48571de9332 100644
--- a/xen/arch/x86/include/asm/page.h
+++ b/xen/arch/x86/include/asm/page.h
@@ -16,14 +16,14 @@
 #include
 
 /* Read a pte atomically from memory. */
-#define l1e_read_atomic(l1ep) \
-    l1e_from_intpte(pte_read_atomic(&l1e_get_intpte(*(l1ep))))
-#define l2e_read_atomic(l2ep) \
-    l2e_from_intpte(pte_read_atomic(&l2e_get_intpte(*(l2ep))))
-#define l3e_read_atomic(l3ep) \
-    l3e_from_intpte(pte_read_atomic(&l3e_get_intpte(*(l3ep))))
-#define l4e_read_atomic(l4ep) \
-    l4e_from_intpte(pte_read_atomic(&l4e_get_intpte(*(l4ep))))
+#define l1e_read(l1ep) \
+    l1e_from_intpte(read_atomic(&l1e_get_intpte(*(l1ep))))
+#define l2e_read(l2ep) \
+    l2e_from_intpte(read_atomic(&l2e_get_intpte(*(l2ep))))
+#define l3e_read(l3ep) \
+    l3e_from_intpte(read_atomic(&l3e_get_intpte(*(l3ep))))
+#define l4e_read(l4ep) \
+    l4e_from_intpte(read_atomic(&l4e_get_intpte(*(l4ep))))
 
 /* Write a pte atomically to memory. */
 #define l1e_write(l1ep, l1e) \
diff --git a/xen/arch/x86/include/asm/x86_64/page.h b/xen/arch/x86/include/asm/x86_64/page.h
index 03fcce61c052..465a70731214 100644
--- a/xen/arch/x86/include/asm/x86_64/page.h
+++ b/xen/arch/x86/include/asm/x86_64/page.h
@@ -69,8 +69,6 @@ typedef l4_pgentry_t root_pgentry_t;
 
 #endif /* !__ASSEMBLY__ */
 
-#define pte_read_atomic(ptep) read_atomic(ptep)
-
 /* Given a virtual address, get an entry offset into a linear page table. */
 #define l1_linear_offset(_a) (((_a) & VADDR_MASK) >> L1_PAGETABLE_SHIFT)
 #define l2_linear_offset(_a) (((_a) & VADDR_MASK) >> L2_PAGETABLE_SHIFT)
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index fab2de5fae27..6ffacab341ad 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -2147,7 +2147,7 @@ static int mod_l1_entry(l1_pgentry_t *pl1e, l1_pgentry_t nl1e,
                         struct vcpu *pt_vcpu, struct domain *pg_dom)
 {
     bool preserve_ad = (cmd == MMU_PT_UPDATE_PRESERVE_AD);
-    l1_pgentry_t ol1e = l1e_read_atomic(pl1e);
+    l1_pgentry_t ol1e = l1e_read(pl1e);
     struct domain *pt_dom = pt_vcpu->domain;
     int rc = 0;
 
@@ -2270,7 +2270,7 @@ static int mod_l2_entry(l2_pgentry_t *pl2e,
         return -EPERM;
     }
 
-    ol2e = l2e_read_atomic(pl2e);
+    ol2e = l2e_read(pl2e);
 
     if ( l2e_get_flags(nl2e) & _PAGE_PRESENT )
     {
@@ -2332,7 +2332,7 @@ static int mod_l3_entry(l3_pgentry_t *pl3e,
     if ( pgentry_ptr_to_slot(pl3e) >= 3 && is_pv_32bit_domain(d) )
         return -EINVAL;
 
-    ol3e = l3e_read_atomic(pl3e);
+    ol3e = l3e_read(pl3e);
 
     if ( l3e_get_flags(nl3e) & _PAGE_PRESENT )
     {
@@ -2394,7 +2394,7 @@ static int mod_l4_entry(l4_pgentry_t *pl4e,
         return -EINVAL;
     }
 
-    ol4e = l4e_read_atomic(pl4e);
+    ol4e = l4e_read(pl4e);
 
     if ( l4e_get_flags(nl4e) & _PAGE_PRESENT )
     {
@@ -5925,7 +5925,7 @@ void init_or_livepatch modify_xen_mappings_lite(
     while ( v < e )
     {
         l2_pgentry_t *pl2e = &l2_xenmap[l2_table_offset(v)];
-        l2_pgentry_t l2e = l2e_read_atomic(pl2e);
+        l2_pgentry_t l2e = l2e_read(pl2e);
         unsigned int l2f = l2e_get_flags(l2e);
 
         ASSERT(l2f & _PAGE_PRESENT);
@@ -5947,7 +5947,7 @@ void init_or_livepatch modify_xen_mappings_lite(
             while ( v < e )
             {
                 l1_pgentry_t *pl1e = &pl1t[l1_table_offset(v)];
-                l1_pgentry_t l1e = l1e_read_atomic(pl1e);
+                l1_pgentry_t l1e = l1e_read(pl1e);
                 unsigned int l1f = l1e_get_flags(l1e);
 
                 ASSERT(l1f & _PAGE_PRESENT);
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index ee91fc56b125..b4fb95917023 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1450,7 +1450,7 @@ static enum pf_type __page_fault_type(unsigned long addr,
     mfn = cr3 >> PAGE_SHIFT;
 
     l4t = map_domain_page(_mfn(mfn));
-    l4e = l4e_read_atomic(&l4t[l4_table_offset(addr)]);
+    l4e = l4e_read(&l4t[l4_table_offset(addr)]);
     mfn = l4e_get_pfn(l4e);
     unmap_domain_page(l4t);
     if ( ((l4e_get_flags(l4e) & required_flags) != required_flags) ||
@@ -1459,7 +1459,7 @@ static enum pf_type __page_fault_type(unsigned long addr,
     page_user &= l4e_get_flags(l4e);
 
     l3t = map_domain_page(_mfn(mfn));
-    l3e = l3e_read_atomic(&l3t[l3_table_offset(addr)]);
+    l3e = l3e_read(&l3t[l3_table_offset(addr)]);
     mfn = l3e_get_pfn(l3e);
     unmap_domain_page(l3t);
     if ( ((l3e_get_flags(l3e) & required_flags) != required_flags) ||
@@ -1470,7 +1470,7 @@ static enum pf_type __page_fault_type(unsigned long addr,
         goto leaf;
 
     l2t = map_domain_page(_mfn(mfn));
-    l2e = l2e_read_atomic(&l2t[l2_table_offset(addr)]);
+    l2e = l2e_read(&l2t[l2_table_offset(addr)]);
     mfn = l2e_get_pfn(l2e);
     unmap_domain_page(l2t);
     if ( ((l2e_get_flags(l2e) & required_flags) != required_flags) ||
@@ -1481,7 +1481,7 @@ static enum pf_type __page_fault_type(unsigned long addr,
         goto leaf;
 
     l1t = map_domain_page(_mfn(mfn));
-    l1e = l1e_read_atomic(&l1t[l1_table_offset(addr)]);
+    l1e = l1e_read(&l1t[l1_table_offset(addr)]);
     mfn = l1e_get_pfn(l1e);
     unmap_domain_page(l1t);
     if ( ((l1e_get_flags(l1e) & required_flags) != required_flags) ||
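After the rename, reads and writes of live PTEs form a symmetric pair,
l{1,2,3,4}e_read() / l{1,2,3,4}e_write().  A minimal sketch of the read side
and the resulting read-modify-write pattern (hypothetical stand-ins, not the
Xen definitions):

    #include <stdint.h>

    typedef uint64_t intpte_t;
    typedef struct { intpte_t l1; } l1_pgentry_t;

    /* Hypothetical stand-in for read_atomic(): one aligned 8-byte load. */
    static inline intpte_t read_atomic_sketch(const volatile intpte_t *p)
    {
        return __atomic_load_n(p, __ATOMIC_RELAXED);
    }

    static inline l1_pgentry_t l1e_read_sketch(const l1_pgentry_t *l1ep)
    {
        return (l1_pgentry_t){ .l1 = read_atomic_sketch(&l1ep->l1) };
    }

    /* Pattern used by e.g. mod_l1_entry(): snapshot, validate, publish. */
    static void update_entry(l1_pgentry_t *pl1e, l1_pgentry_t nl1e)
    {
        l1_pgentry_t ol1e = l1e_read_sketch(pl1e); /* no torn reads */

        (void)ol1e; /* ... validation of the old entry would go here ... */
        __atomic_store_n(&pl1e->l1, nl1e.l1, __ATOMIC_RELAXED);
    }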
From patchwork Fri Jul 26 15:21:47 2024
X-Patchwork-Submitter: Roger Pau Monné
X-Patchwork-Id: 13742895
From: Roger Pau Monne
To: xen-devel@lists.xenproject.org
Cc: alejandro.vallejo@cloud.com, Roger Pau Monne, Jan Beulich, Andrew Cooper
Subject: [PATCH 03/22] x86/dom0: only disable SMAP for the PV dom0 build
Date: Fri, 26 Jul 2024 17:21:47 +0200
Message-ID: <20240726152206.28411-4-roger.pau@citrix.com>
In-Reply-To: <20240726152206.28411-1-roger.pau@citrix.com>

The PVH dom0 builder doesn't switch page tables and has no need to run with
SMAP disabled.

Put the SMAP disabling close to the code region where it's necessary, as it
then becomes obvious why switch_cr3_cr4() is required instead of
write_ptbase().

Note that removing SMAP from cr4_pv32_mask is not required: we never jump
into guest context, and hence updating the value of cr4_pv32_mask is not
relevant.

Signed-off-by: Roger Pau Monné
Reviewed-by: Jan Beulich
---
 xen/arch/x86/pv/dom0_build.c | 13 ++++++++++---
 xen/arch/x86/setup.c         | 17 -----------------
 2 files changed, 10 insertions(+), 20 deletions(-)

diff --git a/xen/arch/x86/pv/dom0_build.c b/xen/arch/x86/pv/dom0_build.c
index d8043fa58a27..41772dbe80bf 100644
--- a/xen/arch/x86/pv/dom0_build.c
+++ b/xen/arch/x86/pv/dom0_build.c
@@ -370,6 +370,7 @@ int __init dom0_construct_pv(struct domain *d,
     unsigned long alloc_epfn;
     unsigned long initrd_pfn = -1, initrd_mfn = 0;
     unsigned long count;
+    unsigned long cr4;
     struct page_info *page = NULL;
     unsigned int flush_flags = 0;
     start_info_t *si;
@@ -814,8 +815,14 @@ int __init dom0_construct_pv(struct domain *d,
     /* Set up CR3 value for switch_cr3_cr4(). */
     update_cr3(v);
 
+    /*
+     * Temporarily clear SMAP in CR4 to allow user-accesses when running with
+     * the dom0 page-tables.  Cache the value of CR4 so it can be restored.
+     */
+    cr4 = read_cr4();
+
     /* We run on dom0's page tables for the final part of the build process. */
-    switch_cr3_cr4(cr3_pa(v->arch.cr3), read_cr4());
+    switch_cr3_cr4(cr3_pa(v->arch.cr3), cr4 & ~X86_CR4_SMAP);
     mapcache_override_current(v);
 
     /* Copy the OS image and free temporary buffer. */
@@ -836,7 +843,7 @@ int __init dom0_construct_pv(struct domain *d,
          (parms.virt_hypercall >= v_end) )
     {
         mapcache_override_current(NULL);
-        switch_cr3_cr4(current->arch.cr3, read_cr4());
+        switch_cr3_cr4(current->arch.cr3, cr4);
         printk("Invalid HYPERCALL_PAGE field in ELF notes.\n");
         return -EINVAL;
     }
@@ -978,7 +985,7 @@ int __init dom0_construct_pv(struct domain *d,
 
     /* Return to idle domain's page tables. */
     mapcache_override_current(NULL);
-    switch_cr3_cr4(current->arch.cr3, read_cr4());
+    switch_cr3_cr4(current->arch.cr3, cr4);
 
     update_domain_wallclock_time(d);
 
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index eee20bb1753c..bc387d96b519 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -955,26 +955,9 @@ static struct domain *__init create_dom0(const module_t *image,
         }
     }
 
-    /*
-     * Temporarily clear SMAP in CR4 to allow user-accesses in construct_dom0().
-     * This saves a large number of corner cases interactions with
-     * copy_from_user().
-     */
-    if ( cpu_has_smap )
-    {
-        cr4_pv32_mask &= ~X86_CR4_SMAP;
-        write_cr4(read_cr4() & ~X86_CR4_SMAP);
-    }
-
     if ( construct_dom0(d, image, headroom, initrd, cmdline) != 0 )
         panic("Could not construct domain 0\n");
 
-    if ( cpu_has_smap )
-    {
-        write_cr4(read_cr4() | X86_CR4_SMAP);
-        cr4_pv32_mask |= X86_CR4_SMAP;
-    }
-
     return d;
 }
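The net effect is that the SMAP-off window now coincides exactly with the
span that runs on dom0's page tables, instead of covering all of
create_dom0().  A condensed sketch of the pattern (the Xen interfaces are
declared here only so the fragment is self-contained; CR4.SMAP is the
architectural bit 21):

    unsigned long read_cr4(void);                              /* Xen internal */
    void switch_cr3_cr4(unsigned long cr3, unsigned long cr4); /* Xen internal */

    #define X86_CR4_SMAP (1UL << 21)

    static void build_window_sketch(unsigned long dom0_cr3, unsigned long idle_cr3)
    {
        unsigned long cr4 = read_cr4();   /* sample once, restore verbatim */

        /* Enter dom0's page tables with SMAP cleared: the user-accessible
         * copies (kernel image, start_info, ...) happen in here. */
        switch_cr3_cr4(dom0_cr3, cr4 & ~X86_CR4_SMAP);

        /* ... final part of the dom0 build ... */

        /* Back to the idle page tables, SMAP set again. */
        switch_cr3_cr4(idle_cr3, cr4);
    }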
From patchwork Fri Jul 26 15:21:48 2024
X-Patchwork-Submitter: Roger Pau Monné
X-Patchwork-Id: 13742897
From: Roger Pau Monne
To: xen-devel@lists.xenproject.org
Cc: alejandro.vallejo@cloud.com, Roger Pau Monne, Jan Beulich, Andrew Cooper
Subject: [PATCH 04/22] x86/mm: ensure L4 idle_pg_table is not modified past boot
Date: Fri, 26 Jul 2024 17:21:48 +0200
Message-ID: <20240726152206.28411-5-roger.pau@citrix.com>
In-Reply-To: <20240726152206.28411-1-roger.pau@citrix.com>

The idle_pg_table L4 is cloned to create all the other L4s Xen uses, and
hence it shouldn't be modified once further L4s have been created.

Signed-off-by: Roger Pau Monné
---
 xen/arch/x86/mm.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 6ffacab341ad..01380fd82c9d 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -5023,6 +5023,12 @@ static l3_pgentry_t *virt_to_xen_l3e(unsigned long v)
         mfn_t l3mfn;
         l3_pgentry_t *l3t = alloc_mapped_pagetable(&l3mfn);
 
+        /*
+         * dom0 is built at smp_boot, at which point we have already created
+         * new L4s based on idle_pg_table.
+         */
+        BUG_ON(system_state >= SYS_STATE_smp_boot);
+
         if ( !l3t )
             return NULL;
         UNMAP_DOMAIN_PAGE(l3t);
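The BUG_ON() turns an implicit assumption into a checked invariant: once
clones of idle_pg_table exist, growing the idle L4 would leave those clones
silently out of sync.  The same clone-then-freeze pattern in isolation
(hypothetical names, not Xen code):

    #include <assert.h>
    #include <stdbool.h>
    #include <string.h>

    static unsigned long template_l4[512]; /* analogue of idle_pg_table */
    static bool template_cloned;           /* analogue of the system_state check */

    static void clone_l4(unsigned long *dst)
    {
        memcpy(dst, template_l4, sizeof(template_l4));
        template_cloned = true; /* the template must not grow past this point */
    }

    static void template_set_slot(unsigned int slot, unsigned long e)
    {
        /* Analogue of BUG_ON(system_state >= SYS_STATE_smp_boot): an
         * earlier clone would miss this new entry. */
        assert(!template_cloned);
        template_l4[slot] = e;
    }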
From patchwork Fri Jul 26 15:21:49 2024
X-Patchwork-Submitter: Roger Pau Monné
X-Patchwork-Id: 13742901
From: Roger Pau Monne
To: xen-devel@lists.xenproject.org
Cc: alejandro.vallejo@cloud.com, Roger Pau Monne, Jan Beulich, Andrew Cooper
Subject: [PATCH 05/22] x86/mm: make virt_to_xen_l1e() static
Date: Fri, 26 Jul 2024 17:21:49 +0200
Message-ID: <20240726152206.28411-6-roger.pau@citrix.com>
In-Reply-To: <20240726152206.28411-1-roger.pau@citrix.com>

There are no callers outside the translation unit where it's defined, so
make the function static.

No functional change intended.

Signed-off-by: Roger Pau Monné
Acked-by: Andrew Cooper
---
 xen/arch/x86/include/asm/mm.h | 2 --
 xen/arch/x86/mm.c             | 2 +-
 2 files changed, 1 insertion(+), 3 deletions(-)

diff --git a/xen/arch/x86/include/asm/mm.h b/xen/arch/x86/include/asm/mm.h
index 98b66edaca5e..b3853ae734fa 100644
--- a/xen/arch/x86/include/asm/mm.h
+++ b/xen/arch/x86/include/asm/mm.h
@@ -567,8 +567,6 @@ mfn_t alloc_xen_pagetable(void);
 void free_xen_pagetable(mfn_t mfn);
 void *alloc_mapped_pagetable(mfn_t *pmfn);
 
-l1_pgentry_t *virt_to_xen_l1e(unsigned long v);
-
 int __sync_local_execstate(void);
 
 /* Arch-specific portion of memory_op hypercall. */
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 01380fd82c9d..ca3d116b0e05 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -5087,7 +5087,7 @@ static l2_pgentry_t *virt_to_xen_l2e(unsigned long v)
     return map_l2t_from_l3e(l3e) + l2_table_offset(v);
 }
 
-l1_pgentry_t *virt_to_xen_l1e(unsigned long v)
+static l1_pgentry_t *virt_to_xen_l1e(unsigned long v)
 {
     l2_pgentry_t *pl2e, l2e;
From patchwork Fri Jul 26 15:21:50 2024
X-Patchwork-Submitter: Roger Pau Monné
X-Patchwork-Id: 13742900
From: Roger Pau Monne
To: xen-devel@lists.xenproject.org
Cc: alejandro.vallejo@cloud.com, Roger Pau Monne, Jan Beulich, Andrew Cooper
Subject: [PATCH 06/22] x86/mm: introduce a local domain variable to write_ptbase()
Date: Fri, 26 Jul 2024 17:21:50 +0200
Message-ID: <20240726152206.28411-7-roger.pau@citrix.com>
In-Reply-To: <20240726152206.28411-1-roger.pau@citrix.com>

This reduces the repeated accessing of v->domain.

No functional change intended.

Signed-off-by: Roger Pau Monné
Acked-by: Andrew Cooper
---
 xen/arch/x86/mm.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index ca3d116b0e05..a792a300a866 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -517,13 +517,14 @@ void make_cr3(struct vcpu *v, mfn_t mfn)
 
 void write_ptbase(struct vcpu *v)
 {
+    const struct domain *d = v->domain;
     struct cpu_info *cpu_info = get_cpu_info();
     unsigned long new_cr4;
 
-    new_cr4 = (is_pv_vcpu(v) && !is_idle_vcpu(v))
+    new_cr4 = (is_pv_domain(d) && !is_idle_domain(d))
               ? pv_make_cr4(v) : mmu_cr4_features;
 
-    if ( is_pv_vcpu(v) && v->domain->arch.pv.xpti )
+    if ( is_pv_domain(d) && d->arch.pv.xpti )
     {
         cpu_info->root_pgt_changed = true;
         cpu_info->pv_cr3 = __pa(this_cpu(root_pgt));
From patchwork Fri Jul 26 15:21:51 2024
X-Patchwork-Submitter: Roger Pau Monné
X-Patchwork-Id: 13742902
From: Roger Pau Monne
To: xen-devel@lists.xenproject.org
Cc: alejandro.vallejo@cloud.com, Roger Pau Monne, Jan Beulich, Andrew Cooper
Subject: [PATCH 07/22] x86/spec-ctrl: initialize per-domain XPTI in spec_ctrl_init_domain()
Date: Fri, 26 Jul 2024 17:21:51 +0200
Message-ID: <20240726152206.28411-8-roger.pau@citrix.com>
In-Reply-To: <20240726152206.28411-1-roger.pau@citrix.com>

XPTI being a speculation mitigation, it's better initialized in
spec_ctrl_init_domain().

No functional change intended, although the call to spec_ctrl_init_domain()
in arch_domain_create() needs to be moved ahead of pv_domain_initialise()
for d->arch.pv.xpti to be correctly set.  Move it ahead of most of the
initialization functions, since spec_ctrl_init_domain() doesn't depend on
any member of struct domain being set.

Signed-off-by: Roger Pau Monné
Reviewed-by: Jan Beulich
---
 xen/arch/x86/domain.c    | 4 ++--
 xen/arch/x86/pv/domain.c | 2 --
 xen/arch/x86/spec_ctrl.c | 4 ++++
 3 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index ccadfe0c9e70..3d3c14dbb5ae 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -842,6 +842,8 @@ int arch_domain_create(struct domain *d,
         is_pv_domain(d) ? __HYPERVISOR_COMPAT_VIRT_START : ~0u;
 #endif
 
+    spec_ctrl_init_domain(d);
+
     if ( (rc = paging_domain_init(d)) != 0 )
         goto fail;
     paging_initialised = true;
@@ -908,8 +910,6 @@ int arch_domain_create(struct domain *d,
 
     d->arch.msr_relaxed = config->arch.misc_flags & XEN_X86_MSR_RELAXED;
 
-    spec_ctrl_init_domain(d);
-
     return 0;
 
  fail:
diff --git a/xen/arch/x86/pv/domain.c b/xen/arch/x86/pv/domain.c
index 2a445bb17b99..86b74fb372d5 100644
--- a/xen/arch/x86/pv/domain.c
+++ b/xen/arch/x86/pv/domain.c
@@ -383,8 +383,6 @@ int pv_domain_initialise(struct domain *d)
 
     d->arch.ctxt_switch = &pv_csw;
 
-    d->arch.pv.xpti = is_hardware_domain(d) ? opt_xpti_hwdom : opt_xpti_domu;
-
     if ( !is_pv_32bit_domain(d) && use_invpcid && cpu_has_pcid )
         switch ( ACCESS_ONCE(opt_pcid) )
         {
diff --git a/xen/arch/x86/spec_ctrl.c b/xen/arch/x86/spec_ctrl.c
index 40f6ae017010..5dc7a17b9354 100644
--- a/xen/arch/x86/spec_ctrl.c
+++ b/xen/arch/x86/spec_ctrl.c
@@ -1769,6 +1769,10 @@ void spec_ctrl_init_domain(struct domain *d)
                    (ibpb ? SCF_entry_ibpb : 0) |
                    (bhb  ? SCF_entry_bhb  : 0) |
                    0;
+
+    if ( pv )
+        d->arch.pv.xpti = is_hardware_domain(d) ? opt_xpti_hwdom
+                                                : opt_xpti_domu;
 }
 
 void __init init_speculation_mitigations(void)
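The ordering requirement from the commit message is easiest to see in the
resulting call sequence; a condensed sketch of arch_domain_create() after the
move (Xen-internal calls shown as declarations only; error handling elided):

    struct domain; /* opaque here */

    void spec_ctrl_init_domain(struct domain *d); /* now also sets PV XPTI */
    int paging_domain_init(struct domain *d);
    int pv_domain_initialise(struct domain *d);

    static int arch_domain_create_sketch(struct domain *d)
    {
        int rc;

        /* First: depends on no other struct domain state ... */
        spec_ctrl_init_domain(d);

        /* ... so everything below may consume d->arch.pv.xpti. */
        rc = paging_domain_init(d);
        if ( rc )
            return rc;

        return pv_domain_initialise(d);
    }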
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=XVLmi/fkvu4CTYAI+yfb6OgrqWm3rUKjjqk2EK0FSak=; b=A4SQbz0kW8PmOkn0TbL5DKmSkoUuM/WDdttwu7w0F6YE8JRsB0czuohCb6C04WADT1 tXIfFB4T3NWaUEGvpxpaLsJ4nHKZYBYimbEveXh1FCc5yWKJpzM0WD1p9yAo0rja0770 XnTuT0/9igfJKQn5FXw+MMa9+wa2LlqPRyGlz3Ubvjc9Fw2TpEpzrGyuDVod63Fyt5Xq NMQnNVSgazpY7zXvHaTNEMvicyOAhuOMdENW+3xsJaTEyiAl6pajfeEWKNSAoWoKmOhY x9FJmpMjO4RLO3EDVe7+SUCMg/A+POzn0Nm6DkGj0+oKk8KJdQia5x7XNFobxrMfuCWS 0QzA== X-Gm-Message-State: AOJu0YwJQmni7jSEhIXUz2qmLIsqdm3XEMSHMfj21B+NoF3RWcM6EIzt gk6fw2H5EHRgzhOkq4en/DAn0QTooKNU8Hgcq07cJdLIVexXPT8XK/kWP/RzIQYVEPSa/QHxldC 7 X-Google-Smtp-Source: AGHT+IEcufzqkmPcezP4vOETUa2u5a43wCguxoFnd/xXU0ayAqpb2EzoCg4TNkabrtG+7Xer6mXs8Q== X-Received: by 2002:a05:6214:d6a:b0:6b7:9b14:627b with SMTP id 6a1803df08f44-6bb55ace5bemr1746786d6.40.1722007903998; Fri, 26 Jul 2024 08:31:43 -0700 (PDT) From: Roger Pau Monne To: xen-devel@lists.xenproject.org Cc: alejandro.vallejo@cloud.com, Roger Pau Monne , Jan Beulich , Andrew Cooper , Tim Deegan Subject: [PATCH 08/22] x86/mm: avoid passing a domain parameter to L4 init function Date: Fri, 26 Jul 2024 17:21:52 +0200 Message-ID: <20240726152206.28411-9-roger.pau@citrix.com> X-Mailer: git-send-email 2.45.2 In-Reply-To: <20240726152206.28411-1-roger.pau@citrix.com> References: <20240726152206.28411-1-roger.pau@citrix.com> MIME-Version: 1.0 In preparation for the function being called from contexts where no domain is present. No functional change intended. Signed-off-by: Roger Pau Monné --- xen/arch/x86/include/asm/mm.h | 4 +++- xen/arch/x86/mm.c | 24 +++++++++++++----------- xen/arch/x86/mm/hap/hap.c | 3 ++- xen/arch/x86/mm/shadow/hvm.c | 3 ++- xen/arch/x86/mm/shadow/multi.c | 7 +++++-- xen/arch/x86/pv/dom0_build.c | 3 ++- xen/arch/x86/pv/domain.c | 3 ++- 7 files changed, 29 insertions(+), 18 deletions(-) diff --git a/xen/arch/x86/include/asm/mm.h b/xen/arch/x86/include/asm/mm.h index b3853ae734fa..076e7009dc99 100644 --- a/xen/arch/x86/include/asm/mm.h +++ b/xen/arch/x86/include/asm/mm.h @@ -375,7 +375,9 @@ int devalidate_page(struct page_info *page, unsigned long type, void init_xen_pae_l2_slots(l2_pgentry_t *l2t, const struct domain *d); void init_xen_l4_slots(l4_pgentry_t *l4t, mfn_t l4mfn, - const struct domain *d, mfn_t sl4mfn, bool ro_mpt); + mfn_t sl4mfn, const struct page_info *perdomain_l3, + bool ro_mpt, bool maybe_compat, bool short_directmap); + bool fill_ro_mpt(mfn_t mfn); void zap_ro_mpt(mfn_t mfn); diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c index a792a300a866..c01b6712143e 100644 --- a/xen/arch/x86/mm.c +++ b/xen/arch/x86/mm.c @@ -1645,14 +1645,9 @@ static int promote_l3_table(struct page_info *page) * extended directmap. */ void init_xen_l4_slots(l4_pgentry_t *l4t, mfn_t l4mfn, - const struct domain *d, mfn_t sl4mfn, bool ro_mpt) + mfn_t sl4mfn, const struct page_info *perdomain_l3, + bool ro_mpt, bool maybe_compat, bool short_directmap) { - /* - * PV vcpus need a shortened directmap. HVM and Idle vcpus get the full - * directmap. - */ - bool short_directmap = !paging_mode_external(d); - /* Slot 256: RO M2P (if applicable). */ l4t[l4_table_offset(RO_MPT_VIRT_START)] = ro_mpt ? idle_pg_table[l4_table_offset(RO_MPT_VIRT_START)] @@ -1673,13 +1668,14 @@ void init_xen_l4_slots(l4_pgentry_t *l4t, mfn_t l4mfn, l4e_from_mfn(sl4mfn, __PAGE_HYPERVISOR_RW); /* Slot 260: Per-domain mappings. 
-    l4t[l4_table_offset(PERDOMAIN_VIRT_START)] =
-        l4e_from_page(d->arch.perdomain_l3_pg, __PAGE_HYPERVISOR_RW);
+    if ( perdomain_l3 )
+        l4t[l4_table_offset(PERDOMAIN_VIRT_START)] =
+            l4e_from_page(perdomain_l3, __PAGE_HYPERVISOR_RW);

     /* Slot 4: Per-domain mappings mirror. */
     BUILD_BUG_ON(IS_ENABLED(CONFIG_PV32) &&
                  !l4_table_offset(PERDOMAIN_ALT_VIRT_START));
-    if ( !is_pv_64bit_domain(d) )
+    if ( perdomain_l3 && maybe_compat )
         l4t[l4_table_offset(PERDOMAIN_ALT_VIRT_START)] =
             l4t[l4_table_offset(PERDOMAIN_VIRT_START)];

@@ -1710,6 +1706,10 @@ void init_xen_l4_slots(l4_pgentry_t *l4t, mfn_t l4mfn,
     else
 #endif
     {
+        /*
+         * PV vcpus need a shortened directmap.  HVM and Idle vcpus get the
+         * full directmap.
+         */
         unsigned int slots = (short_directmap
                               ? ROOT_PAGETABLE_PV_XEN_SLOTS
                               : ROOT_PAGETABLE_XEN_SLOTS);
@@ -1830,7 +1830,9 @@ static int promote_l4_table(struct page_info *page)
     if ( !rc )
     {
         init_xen_l4_slots(pl4e, l4mfn,
-                          d, INVALID_MFN, VM_ASSIST(d, m2p_strict));
+                          INVALID_MFN, d->arch.perdomain_l3_pg,
+                          VM_ASSIST(d, m2p_strict), !is_pv_64bit_domain(d),
+                          true);
         atomic_inc(&d->arch.pv.nr_l4_pages);
     }
     unmap_domain_page(pl4e);
diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
index d2011fde2462..c8514ca0e917 100644
--- a/xen/arch/x86/mm/hap/hap.c
+++ b/xen/arch/x86/mm/hap/hap.c
@@ -402,7 +402,8 @@ static mfn_t hap_make_monitor_table(struct vcpu *v)
     m4mfn = page_to_mfn(pg);
     l4e = map_domain_page(m4mfn);

-    init_xen_l4_slots(l4e, m4mfn, d, INVALID_MFN, false);
+    init_xen_l4_slots(l4e, m4mfn, INVALID_MFN, d->arch.perdomain_l3_pg,
+                      false, true, false);
     unmap_domain_page(l4e);

     return m4mfn;
diff --git a/xen/arch/x86/mm/shadow/hvm.c b/xen/arch/x86/mm/shadow/hvm.c
index c16f3b3adf32..93922a71e511 100644
--- a/xen/arch/x86/mm/shadow/hvm.c
+++ b/xen/arch/x86/mm/shadow/hvm.c
@@ -758,7 +758,8 @@ mfn_t sh_make_monitor_table(const struct vcpu *v, unsigned int shadow_levels)
      * shadow-linear mapping will either be inserted below when creating
      * lower level monitor tables, or later in sh_update_cr3().
      */
-    init_xen_l4_slots(l4e, m4mfn, d, INVALID_MFN, false);
+    init_xen_l4_slots(l4e, m4mfn, INVALID_MFN, d->arch.perdomain_l3_pg,
+                      false, true, false);

     if ( shadow_levels < 4 )
     {
diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index 376f6823cd44..0def0c073ca8 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -973,8 +973,11 @@ sh_make_shadow(struct vcpu *v, mfn_t gmfn, u32 shadow_type)

             BUILD_BUG_ON(sizeof(l4_pgentry_t) != sizeof(shadow_l4e_t));

-            init_xen_l4_slots(l4t, gmfn, d, smfn, (!is_pv_32bit_domain(d) &&
-                                                   VM_ASSIST(d, m2p_strict)));
+            init_xen_l4_slots(l4t, gmfn, smfn,
+                              d->arch.perdomain_l3_pg,
+                              (!is_pv_32bit_domain(d) &&
+                               VM_ASSIST(d, m2p_strict)),
+                              !is_pv_64bit_domain(d), true);
             unmap_domain_page(l4t);
         }
         break;
diff --git a/xen/arch/x86/pv/dom0_build.c b/xen/arch/x86/pv/dom0_build.c
index 41772dbe80bf..6a6689f402bb 100644
--- a/xen/arch/x86/pv/dom0_build.c
+++ b/xen/arch/x86/pv/dom0_build.c
@@ -711,7 +711,8 @@ int __init dom0_construct_pv(struct domain *d,
         l4start = l4tab = __va(mpt_alloc); mpt_alloc += PAGE_SIZE;
         clear_page(l4tab);
         init_xen_l4_slots(l4tab, _mfn(virt_to_mfn(l4start)),
-                          d, INVALID_MFN, true);
+                          INVALID_MFN, d->arch.perdomain_l3_pg,
+                          true, !is_pv_64bit_domain(d), true);
         v->arch.guest_table = pagetable_from_paddr(__pa(l4start));
     }
     else
diff --git a/xen/arch/x86/pv/domain.c b/xen/arch/x86/pv/domain.c
index 86b74fb372d5..6ff71f14a2f2 100644
--- a/xen/arch/x86/pv/domain.c
+++ b/xen/arch/x86/pv/domain.c
@@ -124,7 +124,8 @@ static int setup_compat_l4(struct vcpu *v)
     mfn = page_to_mfn(pg);
     l4tab = map_domain_page(mfn);
     clear_page(l4tab);
-    init_xen_l4_slots(l4tab, mfn, v->domain, INVALID_MFN, false);
+    init_xen_l4_slots(l4tab, mfn, INVALID_MFN, v->domain->arch.perdomain_l3_pg,
+                      false, true, true);
     unmap_domain_page(l4tab);

     /* This page needs to look like a pagetable so that it can be shadowed */
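For callers added later in the series the important property is that
init_xen_l4_slots() no longer dereferences a domain.  As a sketch (not a
hunk of this patch) of a domain-less invocation, with the policy bits
spelled out; note that maybe_compat only matters when a per-domain L3 is
supplied, since the mirror slot is gated on both:

    /* Hypothetical domain-less caller of the reworked helper. */
    l4_pgentry_t *l4t = map_domain_page(l4mfn);      /* l4mfn: assumed */

    init_xen_l4_slots(l4t, l4mfn,
                      INVALID_MFN, /* sl4mfn: no shadow-linear slot */
                      NULL,        /* perdomain_l3: no domain around */
                      false,       /* ro_mpt: no RO M2P */
                      true,        /* maybe_compat: ignored, L3 is NULL */
                      false);      /* short_directmap: full directmap */

    unmap_domain_page(l4t);

Patch 13 of this series uses exactly this shape when populating the
per-pCPU monitor tables.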
From patchwork Fri Jul 26 15:21:53 2024
From: Roger Pau Monne
To: xen-devel@lists.xenproject.org
Cc: alejandro.vallejo@cloud.com, Roger Pau Monne, Jan Beulich, Andrew Cooper
Subject: [PATCH 09/22] x86/pv: untie issuing FLUSH_ROOT_PGTBL from XPTI
Date: Fri, 26 Jul 2024 17:21:53 +0200
Message-ID: <20240726152206.28411-10-roger.pau@citrix.com>
In-Reply-To: <20240726152206.28411-1-roger.pau@citrix.com>
References: <20240726152206.28411-1-roger.pau@citrix.com>

The current logic gates the issuing of TLB flush requests carrying the
FLUSH_ROOT_PGTBL flag on XPTI being enabled.

In preparation for FLUSH_ROOT_PGTBL also being needed when not using
XPTI, untie it from the xpti domain boolean and instead introduce a new
flush_root_pt field.

No functional change intended, as flush_root_pt == xpti.
Signed-off-by: Roger Pau Monné
---
 xen/arch/x86/include/asm/domain.h   | 2 ++
 xen/arch/x86/include/asm/flushtlb.h | 2 +-
 xen/arch/x86/mm.c                   | 2 +-
 xen/arch/x86/pv/domain.c            | 2 ++
 4 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/include/asm/domain.h b/xen/arch/x86/include/asm/domain.h
index f5daeb182baa..9dd2e047f4de 100644
--- a/xen/arch/x86/include/asm/domain.h
+++ b/xen/arch/x86/include/asm/domain.h
@@ -283,6 +283,8 @@ struct pv_domain
     bool pcid;
     /* Mitigate L1TF with shadow/crashing? */
     bool check_l1tf;
+    /* Issue FLUSH_ROOT_PGTBL for root page-table changes. */
+    bool flush_root_pt;

     /* map_domain_page() mapping cache. */
     struct mapcache_domain mapcache;
diff --git a/xen/arch/x86/include/asm/flushtlb.h b/xen/arch/x86/include/asm/flushtlb.h
index bb0ad58db49b..1b98d03decdc 100644
--- a/xen/arch/x86/include/asm/flushtlb.h
+++ b/xen/arch/x86/include/asm/flushtlb.h
@@ -177,7 +177,7 @@ void flush_area_mask(const cpumask_t *mask, const void *va,
 #define flush_root_pgtbl_domain(d)                                      \
 {                                                                       \
-    if ( is_pv_domain(d) && (d)->arch.pv.xpti )                         \
+    if ( is_pv_domain(d) && (d)->arch.pv.flush_root_pt )                \
         flush_mask((d)->dirty_cpumask, FLUSH_ROOT_PGTBL);               \
 }
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index c01b6712143e..a1ac7bdc5b44 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -4167,7 +4167,7 @@ long do_mmu_update(
                                       cmd == MMU_PT_UPDATE_PRESERVE_AD, v);
                 if ( !rc )
                     flush_linear_pt = true;
-                if ( !rc && pt_owner->arch.pv.xpti )
+                if ( !rc && pt_owner->arch.pv.flush_root_pt )
                 {
                     bool local_in_use = false;
diff --git a/xen/arch/x86/pv/domain.c b/xen/arch/x86/pv/domain.c
index 6ff71f14a2f2..46ee10a8a4c2 100644
--- a/xen/arch/x86/pv/domain.c
+++ b/xen/arch/x86/pv/domain.c
@@ -384,6 +384,8 @@ int pv_domain_initialise(struct domain *d)

     d->arch.ctxt_switch = &pv_csw;

+    d->arch.pv.flush_root_pt = d->arch.pv.xpti;
+
     if ( !is_pv_32bit_domain(d) && use_invpcid && cpu_has_pcid )
         switch ( ACCESS_ONCE(opt_pcid) )
         {
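To illustrate what the decoupling buys, a sketch (not part of this
patch) of how a non-XPTI user could later request root page-table
flushes; the reference to the ASI control added in patch 12 is purely
illustrative:

    /* Hypothetical follow-up: flush root page-tables for ASI, not just XPTI. */
    d->arch.pv.flush_root_pt = d->arch.pv.xpti || d->arch.asi;

With that in place, flush_root_pgtbl_domain(d) and the do_mmu_update()
path above would issue FLUSH_ROOT_PGTBL for the domain even with XPTI
disabled.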
From patchwork Fri Jul 26 15:21:54 2024
From: Roger Pau Monne
To: xen-devel@lists.xenproject.org
Cc: alejandro.vallejo@cloud.com, Roger Pau Monne, Jan Beulich, Andrew Cooper
Subject: [PATCH 10/22] x86/mm: move FLUSH_ROOT_PGTBL handling before TLB flush
Date: Fri, 26 Jul 2024 17:21:54 +0200
Message-ID: <20240726152206.28411-11-roger.pau@citrix.com>
In-Reply-To: <20240726152206.28411-1-roger.pau@citrix.com>
References: <20240726152206.28411-1-roger.pau@citrix.com>

Move the handling of FLUSH_ROOT_PGTBL in flush_area_local() ahead of the
logic that does the TLB flushing, in preparation for further changes
that require the TLB flush to be done strictly after FLUSH_ROOT_PGTBL
has been handled.

No functional change intended.
Signed-off-by: Roger Pau Monné
---
 xen/arch/x86/flushtlb.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/flushtlb.c b/xen/arch/x86/flushtlb.c
index 18748b2bc805..fd5ed16ffb57 100644
--- a/xen/arch/x86/flushtlb.c
+++ b/xen/arch/x86/flushtlb.c
@@ -191,6 +191,9 @@ unsigned int flush_area_local(const void *va, unsigned int flags)
 {
     unsigned int order = (flags - 1) & FLUSH_ORDER_MASK;

+    if ( flags & FLUSH_ROOT_PGTBL )
+        get_cpu_info()->root_pgt_changed = true;
+
     if ( flags & (FLUSH_TLB|FLUSH_TLB_GLOBAL) )
     {
         if ( order == 0 )
@@ -254,9 +257,6 @@ unsigned int flush_area_local(const void *va, unsigned int flags)
         }
     }

-    if ( flags & FLUSH_ROOT_PGTBL )
-        get_cpu_info()->root_pgt_changed = true;
-
     return flags;
 }
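The ordering this establishes can be stated as an invariant: any state
derived from the root page-table must be updated before the TLB
invalidation it is paired with.  A condensed sketch of the resulting
shape of flush_area_local(), with the body of the flush elided:

    unsigned int flush_area_local(const void *va, unsigned int flags)
    {
        unsigned int order = (flags - 1) & FLUSH_ORDER_MASK;

        /* Handle root page-table bookkeeping first... */
        if ( flags & FLUSH_ROOT_PGTBL )
            get_cpu_info()->root_pgt_changed = true;

        /* ...so the TLB flush below observes the updated state. */
        if ( flags & (FLUSH_TLB | FLUSH_TLB_GLOBAL) )
        {
            /* TLB invalidation proper (unchanged by this patch). */
        }

        return flags;
    }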
From patchwork Fri Jul 26 15:21:55 2024
From: Roger Pau Monne
To: xen-devel@lists.xenproject.org
Cc: alejandro.vallejo@cloud.com, Roger Pau Monne, Jan Beulich, Andrew Cooper
Subject: [PATCH 11/22] x86/mm: split setup of the per-domain slot on context switch
Date: Fri, 26 Jul 2024 17:21:55 +0200
Message-ID: <20240726152206.28411-12-roger.pau@citrix.com>
In-Reply-To: <20240726152206.28411-1-roger.pau@citrix.com>
References: <20240726152206.28411-1-roger.pau@citrix.com>

It's currently only used for XPTI.  Move the code to a separate helper
in preparation for it gaining more logic.

While there, switch to using l4e_write(): in the current context the L4
is not active when modified, but that could change.

No functional change intended.

Signed-off-by: Roger Pau Monné
---
 xen/arch/x86/domain.c         | 4 +---
 xen/arch/x86/include/asm/mm.h | 3 +++
 xen/arch/x86/mm.c             | 7 +++++++
 3 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 3d3c14dbb5ae..9cfcf0dc63f3 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -1918,9 +1918,7 @@ void cf_check paravirt_ctxt_switch_to(struct vcpu *v)
     root_pgentry_t *root_pgt = this_cpu(root_pgt);

     if ( root_pgt )
-        root_pgt[root_table_offset(PERDOMAIN_VIRT_START)] =
-            l4e_from_page(v->domain->arch.perdomain_l3_pg,
-                          __PAGE_HYPERVISOR_RW);
+        setup_perdomain_slot(v, root_pgt);

     if ( unlikely(v->arch.dr7 & DR7_ACTIVE_MASK) )
         activate_debugregs(v);
diff --git a/xen/arch/x86/include/asm/mm.h b/xen/arch/x86/include/asm/mm.h
index 076e7009dc99..2c309f7b1444 100644
--- a/xen/arch/x86/include/asm/mm.h
+++ b/xen/arch/x86/include/asm/mm.h
@@ -630,4 +630,7 @@ static inline bool arch_mfns_in_directmap(unsigned long mfn, unsigned long nr)
     return (mfn + nr) <= (virt_to_mfn(eva - 1) + 1);
 }

+/* Setup the per-domain slot in the root page table pointer. */
+void setup_perdomain_slot(const struct vcpu *v, root_pgentry_t *root_pgt);
+
 #endif /* __ASM_X86_MM_H__ */
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index a1ac7bdc5b44..35e929057d21 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -6362,6 +6362,13 @@ unsigned long get_upper_mfn_bound(void)
     return min(max_mfn, 1UL << (paddr_bits - PAGE_SHIFT)) - 1;
 }

+void setup_perdomain_slot(const struct vcpu *v, root_pgentry_t *root_pgt)
+{
+    l4e_write(&root_pgt[root_table_offset(PERDOMAIN_VIRT_START)],
+              l4e_from_page(v->domain->arch.perdomain_l3_pg,
+                            __PAGE_HYPERVISOR_RW));
+}
+
 static void __init __maybe_unused build_assertions(void)
 {
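The helper is intentionally trivial at this point; the value of the
split is having a single place to grow.  A sketch of the kind of logic
it is expected to gain (the compat-mirror handling shown here is added
for real in patch 13 of the series):

    void setup_perdomain_slot(const struct vcpu *v, root_pgentry_t *root_pgt)
    {
        l4e_write(&root_pgt[root_table_offset(PERDOMAIN_VIRT_START)],
                  l4e_from_page(v->domain->arch.perdomain_l3_pg,
                                __PAGE_HYPERVISOR_RW));

        /* Anticipated extension: mirror the slot for compat guests. */
        if ( !is_pv_64bit_vcpu(v) )
            l4e_write(&root_pgt[root_table_offset(PERDOMAIN_ALT_VIRT_START)],
                      root_pgt[root_table_offset(PERDOMAIN_VIRT_START)]);
    }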
From patchwork Fri Jul 26 15:21:56 2024
From: Roger Pau Monne
To: xen-devel@lists.xenproject.org
Cc: alejandro.vallejo@cloud.com, Roger Pau Monne, Andrew Cooper, Jan Beulich, Julien Grall, Stefano Stabellini
Subject: [PATCH 12/22] x86/spec-ctrl: introduce Address Space Isolation command line option
Date: Fri, 26 Jul 2024 17:21:56 +0200
Message-ID: <20240726152206.28411-13-roger.pau@citrix.com>
In-Reply-To: <20240726152206.28411-1-roger.pau@citrix.com>
References: <20240726152206.28411-1-roger.pau@citrix.com>

No functional change, as the option is not used.  It is introduced now
so that newly added functionality can be keyed on the option being
enabled, even while the feature itself is non-functional.

Signed-off-by: Roger Pau Monné
---
 docs/misc/xen-command-line.pandoc    | 15 ++++--
 xen/arch/x86/include/asm/domain.h    |  3 ++
 xen/arch/x86/include/asm/spec_ctrl.h |  2 +
 xen/arch/x86/spec_ctrl.c             | 74 +++++++++++++++++++++++++---
 4 files changed, 81 insertions(+), 13 deletions(-)

diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
index 98a45211556b..0ddc330428d9 100644
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -2387,7 +2387,7 @@ By default SSBD will be mitigated at runtime (i.e `ssbd=runtime`).

 ### spec-ctrl (x86)
 > `= List of [ <bool>, xen=<bool>, {pv,hvm}=<bool>,
->   {msr-sc,rsb,verw,{ibpb,bhb}-entry}=<bool>|{pv,hvm}=<bool>,
+>   {msr-sc,rsb,verw,{ibpb,bhb}-entry,asi}=<bool>|{pv,hvm}=<bool>,
 >   bti-thunk=retpoline|lfence|jmp,bhb-seq=short|tsx|long,
 >   {ibrs,ibpb,ssbd,psfd,
 >   eager-fpu,l1d-flush,branch-harden,srb-lock,
@@ -2414,10 +2414,10 @@ in place for guests to use.

 Use of a positive boolean value for either of these options is invalid.

-The `pv=`, `hvm=`, `msr-sc=`, `rsb=`, `verw=`, `ibpb-entry=` and `bhb-entry=`
-options offer fine grained control over the primitives by Xen.  These impact
-Xen's ability to protect itself, and/or Xen's ability to virtualise support
-for guests to use.
+The `pv=`, `hvm=`, `msr-sc=`, `rsb=`, `verw=`, `ibpb-entry=`, `bhb-entry=` and
+`asi=` options offer fine grained control over the primitives by Xen.  These
+impact Xen's ability to protect itself, and/or Xen's ability to virtualise
+support for guests to use.

 * `pv=` and `hvm=` offer control over all suboptions for PV and HVM guests
   respectively.
@@ -2449,6 +2449,11 @@ for guests to use.
   is not available (see `bhi-dis-s`).  The choice of scrubbing sequence can be
   selected using the `bhb-seq=` option.
  If it is necessary to protect dom0 too, boot with `spec-ctrl=bhb-entry`.

+* `asi=` offers control over whether the hypervisor will engage in Address
+  Space Isolation, by not having sensitive information mapped in the VMM
+  page-tables.  Not having sensitive information in the page-tables avoids
+  having to perform some mitigations for speculative attacks when
+  context-switching to the hypervisor.

 If Xen was compiled with `CONFIG_INDIRECT_THUNK` support, `bti-thunk=` can
 be used to select which of the thunks gets patched into the
diff --git a/xen/arch/x86/include/asm/domain.h b/xen/arch/x86/include/asm/domain.h
index 9dd2e047f4de..8c366be8c75f 100644
--- a/xen/arch/x86/include/asm/domain.h
+++ b/xen/arch/x86/include/asm/domain.h
@@ -458,6 +458,9 @@ struct arch_domain
     /* Don't unconditionally inject #GP for unhandled MSRs. */
     bool msr_relaxed;

+    /* Run the guest without sensitive information in the VMM page-tables. */
+    bool asi;
+
     /* Emulated devices enabled bitmap. */
     uint32_t emulation_flags;
 } __cacheline_aligned;
diff --git a/xen/arch/x86/include/asm/spec_ctrl.h b/xen/arch/x86/include/asm/spec_ctrl.h
index 72347ef2b959..39963c004312 100644
--- a/xen/arch/x86/include/asm/spec_ctrl.h
+++ b/xen/arch/x86/include/asm/spec_ctrl.h
@@ -88,6 +88,8 @@ extern uint8_t default_scf;

 extern int8_t opt_xpti_hwdom, opt_xpti_domu;

+extern int8_t opt_asi_pv, opt_asi_hwdom, opt_asi_hvm;
+
 extern bool cpu_has_bug_l1tf;
 extern int8_t opt_pv_l1tf_hwdom, opt_pv_l1tf_domu;
diff --git a/xen/arch/x86/spec_ctrl.c b/xen/arch/x86/spec_ctrl.c
index 5dc7a17b9354..2e403aad791c 100644
--- a/xen/arch/x86/spec_ctrl.c
+++ b/xen/arch/x86/spec_ctrl.c
@@ -84,6 +84,11 @@ static bool __ro_after_init opt_verw_mmio;
 static int8_t __initdata opt_gds_mit = -1;
 static int8_t __initdata opt_div_scrub = -1;

+/* Address Space Isolation for PV/HVM. */
+int8_t __ro_after_init opt_asi_pv = -1;
+int8_t __ro_after_init opt_asi_hwdom = -1;
+int8_t __ro_after_init opt_asi_hvm = -1;
+
 static int __init cf_check parse_spec_ctrl(const char *s)
 {
     const char *ss;
@@ -143,6 +148,10 @@ static int __init cf_check parse_spec_ctrl(const char *s)
             opt_unpriv_mmio = false;
             opt_gds_mit = 0;
             opt_div_scrub = 0;
+
+            opt_asi_pv = 0;
+            opt_asi_hwdom = 0;
+            opt_asi_hvm = 0;
         }
         else if ( val > 0 )
             rc = -EINVAL;
@@ -162,6 +171,7 @@ static int __init cf_check parse_spec_ctrl(const char *s)
             opt_verw_pv = val;
             opt_ibpb_entry_pv = val;
             opt_bhb_entry_pv = val;
+            opt_asi_pv = val;
         }
         else if ( (val = parse_boolean("hvm", s, ss)) >= 0 )
         {
@@ -170,6 +180,7 @@ static int __init cf_check parse_spec_ctrl(const char *s)
             opt_verw_hvm = val;
             opt_ibpb_entry_hvm = val;
             opt_bhb_entry_hvm = val;
+            opt_asi_hvm = val;
         }
         else if ( (val = parse_boolean("msr-sc", s, ss)) != -1 )
         {
@@ -279,6 +290,27 @@ static int __init cf_check parse_spec_ctrl(const char *s)
                 break;
             }
         }
+        else if ( (val = parse_boolean("asi", s, ss)) != -1 )
+        {
+            switch ( val )
+            {
+            case 0:
+            case 1:
+                opt_asi_pv = opt_asi_hwdom = opt_asi_hvm = val;
+                break;
+
+            case -2:
+                s += strlen("asi=");
+                if ( (val = parse_boolean("pv", s, ss)) >= 0 )
+                    opt_asi_pv = val;
+                else if ( (val = parse_boolean("hvm", s, ss)) >= 0 )
+                    opt_asi_hvm = val;
+                else
+            default:
+                    rc = -EINVAL;
+                break;
+            }
+        }

         /* Xen's speculative sidechannel mitigation settings. */
        else if ( !strncmp(s, "bti-thunk=", 10) )
@@ -378,6 +410,13 @@ int8_t __ro_after_init opt_xpti_domu = -1;

 static __init void xpti_init_default(void)
 {
+    ASSERT(opt_asi_pv >= 0 && opt_asi_hwdom >= 0);
+    if ( (opt_xpti_hwdom == 1 || opt_xpti_domu == 1) && opt_asi_pv == 1 )
+    {
+        printk(XENLOG_ERR
+               "XPTI is incompatible with Address Space Isolation - disabling ASI\n");
+        opt_asi_pv = 0;
+    }
     if ( (boot_cpu_data.x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) ||
          cpu_has_rdcl_no )
     {
@@ -389,9 +428,9 @@ static __init void xpti_init_default(void)
     else
     {
         if ( opt_xpti_hwdom < 0 )
-            opt_xpti_hwdom = 1;
+            opt_xpti_hwdom = !opt_asi_hwdom;
         if ( opt_xpti_domu < 0 )
-            opt_xpti_domu = 1;
+            opt_xpti_domu = !opt_asi_pv;
     }
 }
@@ -630,12 +669,13 @@ static void __init print_details(enum ind_thunk thunk)
      * mitigation support for guests.
      */
 #ifdef CONFIG_HVM
-    printk("  Support for HVM VMs:%s%s%s%s%s%s%s%s\n",
+    printk("  Support for HVM VMs:%s%s%s%s%s%s%s%s%s\n",
            (boot_cpu_has(X86_FEATURE_SC_MSR_HVM) ||
             boot_cpu_has(X86_FEATURE_SC_RSB_HVM) ||
             boot_cpu_has(X86_FEATURE_IBPB_ENTRY_HVM) ||
             opt_bhb_entry_hvm || amd_virt_spec_ctrl ||
-            opt_eager_fpu || opt_verw_hvm)           ? ""               : " None",
+            opt_eager_fpu || opt_verw_hvm ||
+            opt_asi_hvm)                             ? ""               : " None",
            boot_cpu_has(X86_FEATURE_SC_MSR_HVM)      ? " MSR_SPEC_CTRL" : "",
            (boot_cpu_has(X86_FEATURE_SC_MSR_HVM) ||
             amd_virt_spec_ctrl)                      ? " MSR_VIRT_SPEC_CTRL" : "",
@@ -643,22 +683,24 @@ static void __init print_details(enum ind_thunk thunk)
            opt_eager_fpu                             ? " EAGER_FPU"     : "",
            opt_verw_hvm                              ? " VERW"          : "",
            boot_cpu_has(X86_FEATURE_IBPB_ENTRY_HVM)  ? " IBPB-entry"    : "",
-           opt_bhb_entry_hvm                         ? " BHB-entry"     : "");
+           opt_bhb_entry_hvm                         ? " BHB-entry"     : "",
+           opt_asi_hvm                               ? " ASI"           : "");
 #endif
 #ifdef CONFIG_PV
-    printk("  Support for PV VMs:%s%s%s%s%s%s%s\n",
+    printk("  Support for PV VMs:%s%s%s%s%s%s%s%s\n",
            (boot_cpu_has(X86_FEATURE_SC_MSR_PV) ||
             boot_cpu_has(X86_FEATURE_SC_RSB_PV) ||
             boot_cpu_has(X86_FEATURE_IBPB_ENTRY_PV) ||
-            opt_bhb_entry_pv ||
+            opt_bhb_entry_pv || opt_asi_pv ||
             opt_eager_fpu || opt_verw_pv)            ? ""               : " None",
            boot_cpu_has(X86_FEATURE_SC_MSR_PV)       ? " MSR_SPEC_CTRL" : "",
            boot_cpu_has(X86_FEATURE_SC_RSB_PV)       ? " RSB"           : "",
            opt_eager_fpu                             ? " EAGER_FPU"     : "",
            opt_verw_pv                               ? " VERW"          : "",
            boot_cpu_has(X86_FEATURE_IBPB_ENTRY_PV)   ? " IBPB-entry"    : "",
-           opt_bhb_entry_pv                          ? " BHB-entry"     : "");
+           opt_bhb_entry_pv                          ? " BHB-entry"     : "",
+           opt_asi_pv                                ? " ASI"           : "");

     printk("  XPTI (64-bit PV only): Dom0 %s, DomU %s (with%s PCID)\n",
            opt_xpti_hwdom ? "enabled" : "disabled",
@@ -1773,6 +1815,9 @@ void spec_ctrl_init_domain(struct domain *d)
     if ( pv )
         d->arch.pv.xpti = is_hardware_domain(d) ? opt_xpti_hwdom
                                                 : opt_xpti_domu;
+
+    d->arch.asi = is_hardware_domain(d) ? opt_asi_hwdom
+                                        : pv ? opt_asi_pv : opt_asi_hvm;
 }

 void __init init_speculation_mitigations(void)
@@ -2069,6 +2114,19 @@ void __init init_speculation_mitigations(void)
          hw_smt_enabled && default_xen_spec_ctrl )
         setup_force_cpu_cap(X86_FEATURE_SC_MSR_IDLE);

+    /* Disable ASI by default until feature is finished. */
+    if ( opt_asi_pv == -1 )
+        opt_asi_pv = 0;
+    if ( opt_asi_hwdom == -1 )
+        opt_asi_hwdom = 0;
+    if ( opt_asi_hvm == -1 )
+        opt_asi_hvm = 0;
+
+    if ( opt_asi_pv || opt_asi_hvm )
+        warning_add(
+            "Address Space Isolation is not functional, this option is\n"
+            "intended to be used only for development purposes.\n");
+
     xpti_init_default();

     l1tf_calculations();
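Going by the parsing logic above, plausible invocations look as follows
(development use only, per the warning_add() notice):

    spec-ctrl=asi           # ASI for PV and HVM guests, dom0 included
    spec-ctrl=asi=pv        # ASI for PV domUs only
    spec-ctrl=asi=no-hvm    # keep ASI off for HVM guests
    spec-ctrl=no-asi        # force ASI off everywhere

Note that the `asi=pv`/`asi=hvm` forms leave opt_asi_hwdom at its
default, while the plain boolean forms set all three controls.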
From patchwork Fri Jul 26 15:21:57 2024
From: Roger Pau Monne
To: xen-devel@lists.xenproject.org
Cc: alejandro.vallejo@cloud.com, Roger Pau Monne, Jan Beulich, Andrew Cooper
Subject: [PATCH 13/22] x86/hvm: use a per-pCPU monitor table in HAP mode
Date: Fri, 26 Jul 2024 17:21:57 +0200
Message-ID: <20240726152206.28411-14-roger.pau@citrix.com>
In-Reply-To: <20240726152206.28411-1-roger.pau@citrix.com>
References: <20240726152206.28411-1-roger.pau@citrix.com>

Instead of allocating a monitor table for each vCPU when running in HVM
HAP mode, use a per-pCPU monitor table, which gets the per-domain slot
updated on guest context switch.

This limits the amount of memory used for HVM HAP monitor tables to the
number of active pCPUs, rather than the number of vCPUs.  It also
simplifies vCPU allocation and teardown, since the monitor table
handling is removed from there.

Note the switch to using a per-CPU monitor table is done regardless of
whether Address Space Isolation is enabled or not.  This is partly for
the memory usage reduction, and also because it allows simplifying the
VM teardown path by not having to clean up the per-vCPU monitor tables.

Signed-off-by: Roger Pau Monné
---
Note the monitor table is not made static because uses outside of the
file where it's defined will be added by further patches.
---
 xen/arch/x86/hvm/hvm.c             | 60 ++++++++++++++++++++++++
 xen/arch/x86/hvm/svm/svm.c         |  5 ++
 xen/arch/x86/hvm/vmx/vmcs.c        |  1 +
 xen/arch/x86/hvm/vmx/vmx.c         |  4 ++
 xen/arch/x86/include/asm/hap.h     |  1 -
 xen/arch/x86/include/asm/hvm/hvm.h |  8 ++++
 xen/arch/x86/mm.c                  |  8 ++++
 xen/arch/x86/mm/hap/hap.c          | 75 ------------------------------
 xen/arch/x86/mm/paging.c           |  4 +-
 9 files changed, 87 insertions(+), 79 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 7f4b627b1f5f..3f771bc65677 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -104,6 +104,54 @@ static const char __initconst warning_hvm_fep[] =
 static bool __initdata opt_altp2m_enabled;
 boolean_param("altp2m", opt_altp2m_enabled);

+DEFINE_PER_CPU(root_pgentry_t *, monitor_pgt);
+
+static int allocate_cpu_monitor_table(unsigned int cpu)
+{
+    root_pgentry_t *pgt = alloc_xenheap_page();
+
+    if ( !pgt )
+        return -ENOMEM;
+
+    clear_page(pgt);
+
+    init_xen_l4_slots(pgt, _mfn(virt_to_mfn(pgt)), INVALID_MFN, NULL,
+                      false, true, false);
+
+    ASSERT(!per_cpu(monitor_pgt, cpu));
+    per_cpu(monitor_pgt, cpu) = pgt;
+
+    return 0;
+}
+
+static void free_cpu_monitor_table(unsigned int cpu)
+{
+    root_pgentry_t *pgt = per_cpu(monitor_pgt, cpu);
+
+    if ( !pgt )
+        return;
+
+    per_cpu(monitor_pgt, cpu) = NULL;
+    free_xenheap_page(pgt);
+}
+
+void hvm_set_cpu_monitor_table(struct vcpu *v)
+{
+    root_pgentry_t *pgt = this_cpu(monitor_pgt);
+
+    ASSERT(pgt);
+
+    setup_perdomain_slot(v, pgt);
+
+    make_cr3(v, _mfn(virt_to_mfn(pgt)));
+}
+
+void hvm_clear_cpu_monitor_table(struct vcpu *v)
+{
+    /* Poison %cr3, it will be updated when the vCPU is scheduled. */
+    make_cr3(v, INVALID_MFN);
+}
+
 static int cf_check cpu_callback(
     struct notifier_block *nfb, unsigned long action, void *hcpu)
 {
@@ -113,6 +161,9 @@ static int cf_check cpu_callback(
     switch ( action )
     {
     case CPU_UP_PREPARE:
+        rc = allocate_cpu_monitor_table(cpu);
+        if ( rc )
+            break;
         rc = alternative_call(hvm_funcs.cpu_up_prepare, cpu);
         break;
     case CPU_DYING:
@@ -121,6 +172,7 @@ static int cf_check cpu_callback(
     case CPU_UP_CANCELED:
     case CPU_DEAD:
         alternative_vcall(hvm_funcs.cpu_dead, cpu);
+        free_cpu_monitor_table(cpu);
         break;
     default:
         break;
@@ -154,6 +206,7 @@ static bool __init hap_supported(struct hvm_function_table *fns)
 static int __init cf_check hvm_enable(void)
 {
     const struct hvm_function_table *fns = NULL;
+    int rc;

     if ( cpu_has_vmx )
         fns = start_vmx();
@@ -205,6 +258,13 @@ static int __init cf_check hvm_enable(void)

     register_cpu_notifier(&cpu_nfb);

+    rc = allocate_cpu_monitor_table(0);
+    if ( rc )
+    {
+        printk(XENLOG_ERR "Error %d setting up HVM monitor page tables\n", rc);
+        return rc;
+    }
+
     return 0;
 }
 presmp_initcall(hvm_enable);
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 988250dbc154..a3fc033c0100 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -902,6 +902,8 @@ static void cf_check svm_ctxt_switch_from(struct vcpu *v)
     if ( unlikely((read_efer() & EFER_SVME) == 0) )
         return;

+    hvm_clear_cpu_monitor_table(v);
+
     if ( !v->arch.fully_eager_fpu )
         svm_fpu_leave(v);

@@ -957,6 +959,8 @@ static void cf_check svm_ctxt_switch_to(struct vcpu *v)
         ASSERT(v->domain->arch.cpuid->extd.virt_ssbd);
         amd_set_legacy_ssbd(true);
     }
+
+    hvm_set_cpu_monitor_table(v);
 }

 static void noreturn cf_check svm_do_resume(void)
@@ -990,6 +994,7 @@ static void noreturn cf_check svm_do_resume(void)
         hvm_migrate_pirqs(v);
         /* Migrating to another ASID domain.  Request a new ASID. */
        hvm_asid_flush_vcpu(v);
+        hvm_update_host_cr3(v);
     }

     if ( !vcpu_guestmode && !vlapic_hw_disabled(vlapic) )
diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
index 9b6dc51f36ab..5d67c8157825 100644
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -1957,6 +1957,7 @@ void cf_check vmx_do_resume(void)
         v->arch.hvm.vmx.hostenv_migrated = 1;

         hvm_asid_flush_vcpu(v);
+        hvm_update_host_cr3(v);
     }

     debug_state = v->domain->debugger_attached
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index cbe91c679807..5863c57b2d4a 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -1153,6 +1153,8 @@ static void cf_check vmx_ctxt_switch_from(struct vcpu *v)
     if ( unlikely(!this_cpu(vmxon)) )
         return;

+    hvm_clear_cpu_monitor_table(v);
+
     if ( !v->is_running )
     {
         /*
@@ -1182,6 +1184,8 @@ static void cf_check vmx_ctxt_switch_to(struct vcpu *v)

     if ( v->domain->arch.hvm.pi_ops.flags & PI_CSW_TO )
         vmx_pi_switch_to(v);
+
+    hvm_set_cpu_monitor_table(v);
 }

diff --git a/xen/arch/x86/include/asm/hap.h b/xen/arch/x86/include/asm/hap.h
index f01ce73fb4f3..ae6760bc2bf5 100644
--- a/xen/arch/x86/include/asm/hap.h
+++ b/xen/arch/x86/include/asm/hap.h
@@ -24,7 +24,6 @@ int hap_domctl(struct domain *d, struct xen_domctl_shadow_op *sc,
                XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl);
 int hap_enable(struct domain *d, u32 mode);
 void hap_final_teardown(struct domain *d);
-void hap_vcpu_teardown(struct vcpu *v);
 void hap_teardown(struct domain *d, bool *preempted);
 void hap_vcpu_init(struct vcpu *v);
 int hap_track_dirty_vram(struct domain *d,
diff --git a/xen/arch/x86/include/asm/hvm/hvm.h b/xen/arch/x86/include/asm/hvm/hvm.h
index 1c01e22c8e62..6d9a1ae04feb 100644
--- a/xen/arch/x86/include/asm/hvm/hvm.h
+++ b/xen/arch/x86/include/asm/hvm/hvm.h
@@ -550,6 +550,14 @@ static inline void hvm_invlpg(struct vcpu *v, unsigned long linear)
                                        (1U << X86_EXC_AC) | \
                                        (1U << X86_EXC_MC))

+/*
+ * Setup the per-domain slots of the per-cpu monitor table and update the vCPU
+ * cr3 to use it.
+ */
+DECLARE_PER_CPU(root_pgentry_t *, monitor_pgt);
+void hvm_set_cpu_monitor_table(struct vcpu *v);
+void hvm_clear_cpu_monitor_table(struct vcpu *v);
+
 /* Called in boot/resume paths.  Must cope with no HVM support. */
 static inline int hvm_cpu_up(void)
 {
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 35e929057d21..7f2666adaef4 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -6367,6 +6367,14 @@ void setup_perdomain_slot(const struct vcpu *v, root_pgentry_t *root_pgt)
     l4e_write(&root_pgt[root_table_offset(PERDOMAIN_VIRT_START)],
               l4e_from_page(v->domain->arch.perdomain_l3_pg,
                             __PAGE_HYPERVISOR_RW));
+
+    if ( !is_pv_64bit_vcpu(v) )
+        /*
+         * HVM guests always have the compatibility L4 per-domain area
+         * because bitness is not known, and can change at runtime.
+         */
+        l4e_write(&root_pgt[root_table_offset(PERDOMAIN_ALT_VIRT_START)],
+                  root_pgt[root_table_offset(PERDOMAIN_VIRT_START)]);
 }

 static void __init __maybe_unused build_assertions(void)
 {
diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
index c8514ca0e917..3279aafcd7d8 100644
--- a/xen/arch/x86/mm/hap/hap.c
+++ b/xen/arch/x86/mm/hap/hap.c
@@ -387,46 +387,6 @@ int hap_set_allocation(struct domain *d, unsigned int pages, bool *preempted)
     return 0;
 }

-static mfn_t hap_make_monitor_table(struct vcpu *v)
-{
-    struct domain *d = v->domain;
-    struct page_info *pg;
-    l4_pgentry_t *l4e;
-    mfn_t m4mfn;
-
-    ASSERT(pagetable_get_pfn(v->arch.hvm.monitor_table) == 0);
-
-    if ( (pg = hap_alloc(d)) == NULL )
-        goto oom;
-
-    m4mfn = page_to_mfn(pg);
-    l4e = map_domain_page(m4mfn);
-
-    init_xen_l4_slots(l4e, m4mfn, INVALID_MFN, d->arch.perdomain_l3_pg,
-                      false, true, false);
-    unmap_domain_page(l4e);
-
-    return m4mfn;
-
- oom:
-    if ( !d->is_dying &&
-         (!d->is_shutting_down || d->shutdown_code != SHUTDOWN_crash) )
-    {
-        printk(XENLOG_G_ERR "%pd: out of memory building monitor pagetable\n",
-               d);
-        domain_crash(d);
-    }
-    return INVALID_MFN;
-}
-
-static void hap_destroy_monitor_table(struct vcpu* v, mfn_t mmfn)
-{
-    struct domain *d = v->domain;
-
-    /* Put the memory back in the pool */
-    hap_free(d, mmfn);
-}
-
 /************************************************/
 /*          HAP DOMAIN LEVEL FUNCTIONS          */
 /************************************************/
@@ -548,25 +508,6 @@ void hap_final_teardown(struct domain *d)
     }
 }

-void hap_vcpu_teardown(struct vcpu *v)
-{
-    struct domain *d = v->domain;
-    mfn_t mfn;
-
-    paging_lock(d);
-
-    if ( !paging_mode_hap(d) || !v->arch.paging.mode )
-        goto out;
-
-    mfn = pagetable_get_mfn(v->arch.hvm.monitor_table);
-    if ( mfn_x(mfn) )
-        hap_destroy_monitor_table(v, mfn);
-    v->arch.hvm.monitor_table = pagetable_null();
-
- out:
-    paging_unlock(d);
-}
-
 void hap_teardown(struct domain *d, bool *preempted)
 {
     struct vcpu *v;
@@ -575,10 +516,6 @@ void hap_teardown(struct domain *d, bool *preempted)
     ASSERT(d->is_dying);
     ASSERT(d != current->domain);

-    /* TODO - Remove when the teardown path is better structured. */
-    for_each_vcpu ( d, v )
-        hap_vcpu_teardown(v);
-
     /* Leave the root pt in case we get further attempts to modify the p2m. */
     if ( hvm_altp2m_supported() )
     {
@@ -782,21 +719,9 @@ static void cf_check hap_update_paging_modes(struct vcpu *v)

     v->arch.paging.mode = hap_paging_get_mode(v);

-    if ( pagetable_is_null(v->arch.hvm.monitor_table) )
-    {
-        mfn_t mmfn = hap_make_monitor_table(v);
-
-        if ( mfn_eq(mmfn, INVALID_MFN) )
-            goto unlock;
-        v->arch.hvm.monitor_table = pagetable_from_mfn(mmfn);
-        make_cr3(v, mmfn);
-        hvm_update_host_cr3(v);
-    }
-
     /* CR3 is effectively updated by a mode change.  Flush ASIDs, etc. */
     hap_update_cr3(v, false);

- unlock:
     paging_unlock(d);
     put_gfn(d, cr3_gfn);
 }
diff --git a/xen/arch/x86/mm/paging.c b/xen/arch/x86/mm/paging.c
index bca320fffabf..8ba105b5cb0c 100644
--- a/xen/arch/x86/mm/paging.c
+++ b/xen/arch/x86/mm/paging.c
@@ -794,9 +794,7 @@ long do_paging_domctl_cont(

 void paging_vcpu_teardown(struct vcpu *v)
 {
-    if ( hap_enabled(v->domain) )
-        hap_vcpu_teardown(v);
-    else
+    if ( !hap_enabled(v->domain) )
         shadow_vcpu_teardown(v);
 }
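Taken together, patches 11-13 give each pCPU a long-lived monitor L4
that is retargeted on context switch.  A condensed sketch of the
resulting lifecycle (names as in the hunks above; the flow itself is
what the patch establishes):

    /*
     * Per-pCPU monitor table lifecycle after this patch (HAP):
     *
     * boot / CPU_UP_PREPARE:
     *     allocate_cpu_monitor_table(cpu);  // one Xen-heap L4 per pCPU
     *
     * ctxt_switch_to(v):                    // VMX and SVM alike
     *     hvm_set_cpu_monitor_table(v);     // retarget per-domain slot,
     *                                       // make_cr3() to the pCPU L4
     *
     * ctxt_switch_from(v):
     *     hvm_clear_cpu_monitor_table(v);   // poison cr3 until rescheduled
     *
     * CPU_UP_CANCELED / CPU_DEAD:
     *     free_cpu_monitor_table(cpu);
     */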
From patchwork Fri Jul 26 15:21:58 2024
From: Roger Pau Monne
To: xen-devel@lists.xenproject.org
Cc: alejandro.vallejo@cloud.com, Roger Pau Monne, Jan Beulich, Andrew Cooper, Tim Deegan
Subject: [PATCH 14/22] x86/hvm: use a per-pCPU monitor table in shadow mode
Date: Fri, 26 Jul 2024 17:21:58 +0200
Message-ID: <20240726152206.28411-15-roger.pau@citrix.com>
In-Reply-To: <20240726152206.28411-1-roger.pau@citrix.com>
References: <20240726152206.28411-1-roger.pau@citrix.com>

Instead of allocating a monitor table for each vCPU when running in HVM
shadow mode, use a per-pCPU monitor table, which gets the per-domain
slot updated on guest context switch.

Signed-off-by: Roger Pau Monné
---
I've tested this manually, but XenServer builds disable shadow support,
so it possibly hasn't been given the same level of testing as the rest
of the changes.
---
 xen/arch/x86/hvm/hvm.c              |  7 +++
 xen/arch/x86/include/asm/hvm/vcpu.h |  6 ++-
 xen/arch/x86/include/asm/paging.h   | 18 ++++++++
 xen/arch/x86/mm.c                   |  6 +++
 xen/arch/x86/mm/shadow/common.c     | 42 +++++++-----------
 xen/arch/x86/mm/shadow/hvm.c        | 65 ++++++++++++----------------
 xen/arch/x86/mm/shadow/multi.c      | 66 ++++++++++++++++++-----------
 xen/arch/x86/mm/shadow/private.h    |  4 +-
 8 files changed, 120 insertions(+), 94 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 3f771bc65677..419d78a79c51 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -141,6 +141,7 @@ void hvm_set_cpu_monitor_table(struct vcpu *v)

     ASSERT(pgt);

+    paging_set_cpu_monitor_table(v);
     setup_perdomain_slot(v, pgt);

     make_cr3(v, _mfn(virt_to_mfn(pgt)));
@@ -150,6 +151,8 @@ void hvm_clear_cpu_monitor_table(struct vcpu *v)
 {
     /* Poison %cr3, it will be updated when the vCPU is scheduled. */
     make_cr3(v, INVALID_MFN);
+
+    paging_clear_cpu_monitor_table(v);
 }

 static int cf_check cpu_callback(
@@ -1645,6 +1648,10 @@ int hvm_vcpu_initialise(struct vcpu *v)
     int rc;
     struct domain *d = v->domain;

+#ifdef CONFIG_SHADOW_PAGING
+    v->arch.hvm.shadow_linear_l3 = INVALID_MFN;
+#endif
+
     hvm_asid_flush_vcpu(v);

     spin_lock_init(&v->arch.hvm.tm_lock);
diff --git a/xen/arch/x86/include/asm/hvm/vcpu.h b/xen/arch/x86/include/asm/hvm/vcpu.h
index 64c7a6fedea9..f7faaaa21521 100644
--- a/xen/arch/x86/include/asm/hvm/vcpu.h
+++ b/xen/arch/x86/include/asm/hvm/vcpu.h
@@ -149,8 +149,10 @@ struct hvm_vcpu {
         uint16_t p2midx;
     } fast_single_step;

-    /* (MFN) hypervisor page table */
-    pagetable_t monitor_table;
+#ifdef CONFIG_SHADOW_PAGING
+    /* Reference to the linear L3 page table. */
+    mfn_t shadow_linear_l3;
+#endif
diff --git a/xen/arch/x86/include/asm/paging.h b/xen/arch/x86/include/asm/paging.h
index 8a2a0af40874..c1e188bcd3c0 100644
--- a/xen/arch/x86/include/asm/paging.h
+++ b/xen/arch/x86/include/asm/paging.h
@@ -117,6 +117,8 @@ struct paging_mode {
                                             unsigned long cr3,
                                             paddr_t ga, uint32_t *pfec,
                                             unsigned int *page_order);
+    void          (*set_cpu_monitor_table )(struct vcpu *v);
+    void          (*clear_cpu_monitor_table)(struct vcpu *v);
 #endif
     pagetable_t   (*update_cr3            )(struct vcpu *v, bool noflush);
 
@@ -288,6 +290,22 @@ static inline bool paging_flush_tlb(const unsigned long *vcpu_bitmap)
     return current->domain->arch.paging.flush_tlb(vcpu_bitmap);
 }
 
+static inline void paging_set_cpu_monitor_table(struct vcpu *v)
+{
+    const struct paging_mode *mode = paging_get_hostmode(v);
+
+    if ( mode->set_cpu_monitor_table )
+        mode->set_cpu_monitor_table(v);
+}
+
+static inline void paging_clear_cpu_monitor_table(struct vcpu *v)
+{
+    const struct paging_mode *mode = paging_get_hostmode(v);
+
+    if ( mode->clear_cpu_monitor_table )
+        mode->clear_cpu_monitor_table(v);
+}
+
 #endif /* CONFIG_HVM */
 
 /* Update all the things that are derived from the guest's CR3. */
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 7f2666adaef4..13aa15f4db22 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -534,6 +534,12 @@ void write_ptbase(struct vcpu *v)
     }
     else
     {
+        ASSERT(!is_hvm_domain(d) || !d->arch.asi
+#ifdef CONFIG_HVM
+               || mfn_eq(maddr_to_mfn(v->arch.cr3),
+                         virt_to_mfn(this_cpu(monitor_pgt)))
+#endif
+               );
         /* Make sure to clear use_pv_cr3 and xen_cr3 before pv_cr3. */
         cpu_info->use_pv_cr3 = false;
         cpu_info->xen_cr3 = 0;
diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index 0176e33bc9c7..d31c1db8a1ab 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -2413,16 +2413,12 @@ static void sh_update_paging_modes(struct vcpu *v)
                 &SHADOW_INTERNAL_NAME(sh_paging_mode, 2);
         }
 
-        if ( pagetable_is_null(v->arch.hvm.monitor_table) )
+        if ( mfn_eq(v->arch.hvm.shadow_linear_l3, INVALID_MFN) )
         {
-            mfn_t mmfn = sh_make_monitor_table(
-                v, v->arch.paging.mode->shadow.shadow_levels);
-
-            if ( mfn_eq(mmfn, INVALID_MFN) )
+            if ( sh_update_monitor_table(
+                     v, v->arch.paging.mode->shadow.shadow_levels) )
                 return;
 
-            v->arch.hvm.monitor_table = pagetable_from_mfn(mmfn);
-            make_cr3(v, mmfn);
             hvm_update_host_cr3(v);
         }
 
@@ -2440,8 +2436,8 @@ static void sh_update_paging_modes(struct vcpu *v)
              (v->arch.paging.mode->shadow.shadow_levels !=
               old_mode->shadow.shadow_levels) )
         {
-            /* Need to make a new monitor table for the new mode */
-            mfn_t new_mfn, old_mfn;
+            /* Might need to make a new L3 linear table for the new mode */
+            mfn_t old_mfn;
 
             if ( v != current && vcpu_runnable(v) )
             {
@@ -2455,24 +2451,21 @@ static void sh_update_paging_modes(struct vcpu *v)
                 return;
             }
 
-            old_mfn = pagetable_get_mfn(v->arch.hvm.monitor_table);
-            v->arch.hvm.monitor_table = pagetable_null();
-            new_mfn = sh_make_monitor_table(
-                v, v->arch.paging.mode->shadow.shadow_levels);
-            if ( mfn_eq(new_mfn, INVALID_MFN) )
+            old_mfn = v->arch.hvm.shadow_linear_l3;
+            v->arch.hvm.shadow_linear_l3 = INVALID_MFN;
+            if ( sh_update_monitor_table(
+                     v, v->arch.paging.mode->shadow.shadow_levels) )
             {
                 sh_destroy_monitor_table(v, old_mfn,
                                          old_mode->shadow.shadow_levels);
                 return;
             }
-            v->arch.hvm.monitor_table = pagetable_from_mfn(new_mfn);
-            SHADOW_PRINTK("new monitor table %"PRI_mfn "\n",
-                          mfn_x(new_mfn));
+            SHADOW_PRINTK("new L3 linear table %"PRI_mfn "\n",
+                          mfn_x(v->arch.hvm.shadow_linear_l3));
 
             /* Don't be running on the old monitor table when we
              * pull it down!  Switch CR3, and warn the HVM code that
              * its host cr3 has changed. */
-            make_cr3(v, new_mfn);
             if ( v == current )
                 write_ptbase(v);
             hvm_update_host_cr3(v);
@@ -2781,16 +2774,13 @@ void shadow_vcpu_teardown(struct vcpu *v)
     sh_detach_old_tables(v);
 
 #ifdef CONFIG_HVM
-    if ( shadow_mode_external(d) )
+    if ( shadow_mode_external(d) &&
+         !mfn_eq(v->arch.hvm.shadow_linear_l3, INVALID_MFN) )
     {
-        mfn_t mfn = pagetable_get_mfn(v->arch.hvm.monitor_table);
-
-        if ( mfn_x(mfn) )
-            sh_destroy_monitor_table(
-                v, mfn,
+        sh_destroy_monitor_table(
+            v, v->arch.hvm.shadow_linear_l3,
             v->arch.paging.mode->shadow.shadow_levels);
-
-        v->arch.hvm.monitor_table = pagetable_null();
+        v->arch.hvm.shadow_linear_l3 = INVALID_MFN;
     }
 #endif
 
diff --git a/xen/arch/x86/mm/shadow/hvm.c b/xen/arch/x86/mm/shadow/hvm.c
index 93922a71e511..15c75cf766bb 100644
--- a/xen/arch/x86/mm/shadow/hvm.c
+++ b/xen/arch/x86/mm/shadow/hvm.c
@@ -736,30 +736,15 @@ bool cf_check shadow_flush_tlb(const unsigned long *vcpu_bitmap)
     return true;
 }
 
-mfn_t sh_make_monitor_table(const struct vcpu *v, unsigned int shadow_levels)
+int sh_update_monitor_table(struct vcpu *v, unsigned int shadow_levels)
 {
     struct domain *d = v->domain;
-    mfn_t m4mfn;
-    l4_pgentry_t *l4e;
 
-    ASSERT(!pagetable_get_pfn(v->arch.hvm.monitor_table));
+    ASSERT(mfn_eq(v->arch.hvm.shadow_linear_l3, INVALID_MFN));
 
     /* Guarantee we can get the memory we need */
-    if ( !shadow_prealloc(d, SH_type_monitor_table, CONFIG_PAGING_LEVELS) )
-        return INVALID_MFN;
-
-    m4mfn = shadow_alloc(d, SH_type_monitor_table, 0);
-    mfn_to_page(m4mfn)->shadow_flags = 4;
-
-    l4e = map_domain_page(m4mfn);
-
-    /*
-     * Create a self-linear mapping, but no shadow-linear mapping.  A
-     * shadow-linear mapping will either be inserted below when creating
-     * lower level monitor tables, or later in sh_update_cr3().
-     */
-    init_xen_l4_slots(l4e, m4mfn, INVALID_MFN, d->arch.perdomain_l3_pg,
-                      false, true, false);
+    if ( !shadow_prealloc(d, SH_type_monitor_table, CONFIG_PAGING_LEVELS - 1) )
+        return -ENOMEM;
 
     if ( shadow_levels < 4 )
     {
@@ -773,52 +758,54 @@ mfn_t sh_make_monitor_table(const struct vcpu *v, unsigned int shadow_levels)
          */
         m3mfn = shadow_alloc(d, SH_type_monitor_table, 0);
         mfn_to_page(m3mfn)->shadow_flags = 3;
-        l4e[l4_table_offset(SH_LINEAR_PT_VIRT_START)]
-            = l4e_from_mfn(m3mfn, __PAGE_HYPERVISOR_RW);
 
         m2mfn = shadow_alloc(d, SH_type_monitor_table, 0);
         mfn_to_page(m2mfn)->shadow_flags = 2;
         l3e = map_domain_page(m3mfn);
         l3e[0] = l3e_from_mfn(m2mfn, __PAGE_HYPERVISOR_RW);
         unmap_domain_page(l3e);
-    }
 
-    unmap_domain_page(l4e);
+        v->arch.hvm.shadow_linear_l3 = m3mfn;
+
+        /*
+         * If the vCPU is not the current one the L4 entry will be updated on
+         * context switch.
+         */
+        if ( v == current )
+            this_cpu(monitor_pgt)[l4_table_offset(SH_LINEAR_PT_VIRT_START)]
+                = l4e_from_mfn(m3mfn, __PAGE_HYPERVISOR_RW);
+    }
+    else if ( v == current )
+        /* The shadow linear mapping will be inserted in sh_update_cr3(). */
+        this_cpu(monitor_pgt)[l4_table_offset(SH_LINEAR_PT_VIRT_START)]
+            = l4e_empty();
 
-    return m4mfn;
+    return 0;
 }
 
-void sh_destroy_monitor_table(const struct vcpu *v, mfn_t mmfn,
+void sh_destroy_monitor_table(const struct vcpu *v, mfn_t m3mfn,
                               unsigned int shadow_levels)
 {
     struct domain *d = v->domain;
 
-    ASSERT(mfn_to_page(mmfn)->u.sh.type == SH_type_monitor_table);
-
     if ( shadow_levels < 4 )
     {
-        mfn_t m3mfn;
-        l4_pgentry_t *l4e = map_domain_page(mmfn);
-        l3_pgentry_t *l3e;
-        unsigned int linear_slot = l4_table_offset(SH_LINEAR_PT_VIRT_START);
+        l3_pgentry_t *l3e = map_domain_page(m3mfn);
+
+        ASSERT(!mfn_eq(m3mfn, INVALID_MFN));
+        ASSERT(mfn_to_page(m3mfn)->u.sh.type == SH_type_monitor_table);
 
         /*
          * Need to destroy the l3 and l2 monitor pages used
         * for the linear map.
         */
-        ASSERT(l4e_get_flags(l4e[linear_slot]) & _PAGE_PRESENT);
-        m3mfn = l4e_get_mfn(l4e[linear_slot]);
-        l3e = map_domain_page(m3mfn);
         ASSERT(l3e_get_flags(l3e[0]) & _PAGE_PRESENT);
         shadow_free(d, l3e_get_mfn(l3e[0]));
         unmap_domain_page(l3e);
         shadow_free(d, m3mfn);
-
-        unmap_domain_page(l4e);
     }
-
-    /* Put the memory back in the pool */
-    shadow_free(d, mmfn);
+    else
+        ASSERT(mfn_eq(m3mfn, INVALID_MFN));
 }
 
 /**************************************************************************/
diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index 0def0c073ca8..68c59233794f 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -3007,6 +3007,32 @@ static unsigned long cf_check sh_gva_to_gfn(
     return gfn_x(gfn);
 }
 
+static void cf_check set_cpu_monitor_table(struct vcpu *v)
+{
+    root_pgentry_t *pgt = this_cpu(monitor_pgt);
+
+    virt_to_page(pgt)->shadow_flags = 4;
+
+    /* Setup linear L3 entry. */
+    if ( !mfn_eq(v->arch.hvm.shadow_linear_l3, INVALID_MFN) )
+        pgt[l4_table_offset(SH_LINEAR_PT_VIRT_START)] =
+            l4e_from_mfn(v->arch.hvm.shadow_linear_l3, __PAGE_HYPERVISOR_RW);
+    else
+        pgt[l4_table_offset(SH_LINEAR_PT_VIRT_START)] =
+            l4e_from_pfn(
+                pagetable_get_pfn(v->arch.paging.shadow.shadow_table[0]),
+                __PAGE_HYPERVISOR_RW);
+}
+
+static void cf_check clear_cpu_monitor_table(struct vcpu *v)
+{
+    root_pgentry_t *pgt = this_cpu(monitor_pgt);
+
+    virt_to_page(pgt)->shadow_flags = 0;
+
+    pgt[l4_table_offset(SH_LINEAR_PT_VIRT_START)] = l4e_empty();
+}
+
 #endif /* CONFIG_HVM */
 
 static inline void
@@ -3033,8 +3059,11 @@ sh_update_linear_entries(struct vcpu *v)
      */
 
     /* Don't try to update the monitor table if it doesn't exist */
-    if ( !shadow_mode_external(d) ||
-         pagetable_get_pfn(v->arch.hvm.monitor_table) == 0 )
+    if ( !shadow_mode_external(d)
+#if SHADOW_PAGING_LEVELS == 3
+         || mfn_eq(v->arch.hvm.shadow_linear_l3, INVALID_MFN)
+#endif
+         )
         return;
 
 #if !defined(CONFIG_HVM)
@@ -3051,17 +3080,6 @@ sh_update_linear_entries(struct vcpu *v)
                 pagetable_get_pfn(v->arch.paging.shadow.shadow_table[0]),
                 __PAGE_HYPERVISOR_RW);
     }
-    else
-    {
-        l4_pgentry_t *ml4e;
-
-        ml4e = map_domain_page(pagetable_get_mfn(v->arch.hvm.monitor_table));
-        ml4e[l4_table_offset(SH_LINEAR_PT_VIRT_START)] =
-            l4e_from_pfn(
-                pagetable_get_pfn(v->arch.paging.shadow.shadow_table[0]),
-                __PAGE_HYPERVISOR_RW);
-        unmap_domain_page(ml4e);
-    }
 
 #elif SHADOW_PAGING_LEVELS == 3
 
@@ -3087,16 +3105,8 @@ sh_update_linear_entries(struct vcpu *v)
             + l2_linear_offset(SH_LINEAR_PT_VIRT_START);
     else
     {
-        mfn_t l3mfn, l2mfn;
-        l4_pgentry_t *ml4e;
-        l3_pgentry_t *ml3e;
-        int linear_slot = shadow_l4_table_offset(SH_LINEAR_PT_VIRT_START);
-        ml4e = map_domain_page(pagetable_get_mfn(v->arch.hvm.monitor_table));
-
-        ASSERT(l4e_get_flags(ml4e[linear_slot]) & _PAGE_PRESENT);
-        l3mfn = l4e_get_mfn(ml4e[linear_slot]);
-        ml3e = map_domain_page(l3mfn);
-        unmap_domain_page(ml4e);
+        mfn_t l2mfn;
+        l3_pgentry_t *ml3e = map_domain_page(v->arch.hvm.shadow_linear_l3);
 
         ASSERT(l3e_get_flags(ml3e[0]) & _PAGE_PRESENT);
         l2mfn = l3e_get_mfn(ml3e[0]);
@@ -3341,9 +3351,13 @@ static pagetable_t cf_check sh_update_cr3(struct vcpu *v, bool noflush)
     ///
     /// v->arch.cr3
     ///
-    if ( shadow_mode_external(d) )
+    if ( shadow_mode_external(d) && v == current )
     {
-        make_cr3(v, pagetable_get_mfn(v->arch.hvm.monitor_table));
+#ifdef CONFIG_HVM
+        make_cr3(v, _mfn(virt_to_mfn(this_cpu(monitor_pgt))));
+#else
+        ASSERT_UNREACHABLE();
+#endif
     }
 #if SHADOW_PAGING_LEVELS == 4
     else // not shadow_mode_external...
@@ -4106,6 +4120,8 @@ const struct paging_mode sh_paging_mode = {
     .invlpg                        = sh_invlpg,
 #ifdef CONFIG_HVM
     .gva_to_gfn                    = sh_gva_to_gfn,
+    .set_cpu_monitor_table         = set_cpu_monitor_table,
+    .clear_cpu_monitor_table       = clear_cpu_monitor_table,
 #endif
     .update_cr3                    = sh_update_cr3,
     .guest_levels                  = GUEST_PAGING_LEVELS,
diff --git a/xen/arch/x86/mm/shadow/private.h b/xen/arch/x86/mm/shadow/private.h
index a5fc3a7676eb..6743aeefe12e 100644
--- a/xen/arch/x86/mm/shadow/private.h
+++ b/xen/arch/x86/mm/shadow/private.h
@@ -420,8 +420,8 @@ void shadow_unhook_mappings(struct domain *d, mfn_t smfn, int user_only);
 * sh_{make,destroy}_monitor_table() depend only on the number of shadow
 * levels.
 */
-mfn_t sh_make_monitor_table(const struct vcpu *v, unsigned int shadow_levels);
-void sh_destroy_monitor_table(const struct vcpu *v, mfn_t mmfn,
+int sh_update_monitor_table(struct vcpu *v, unsigned int shadow_levels);
+void sh_destroy_monitor_table(const struct vcpu *v, mfn_t m3mfn,
                               unsigned int shadow_levels);
 
 /* VRAM dirty tracking helpers. */
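The resulting invariant, condensed for illustration from set_cpu_monitor_table() above (not verbatim): the L4 is per-pCPU, and only the SH_LINEAR_PT_VIRT_START slot is per-vCPU, pointing either at the vCPU's linear L3 (3-level shadows) or at its top-level shadow (4-level shadows):

    pgt[l4_table_offset(SH_LINEAR_PT_VIRT_START)] =
        !mfn_eq(v->arch.hvm.shadow_linear_l3, INVALID_MFN)
        /* 3-level shadows: per-vCPU linear L3. */
        ? l4e_from_mfn(v->arch.hvm.shadow_linear_l3, __PAGE_HYPERVISOR_RW)
        /* 4-level shadows: the vCPU's top-level shadow table. */
        : l4e_from_pfn(pagetable_get_pfn(v->arch.paging.shadow.shadow_table[0]),
                       __PAGE_HYPERVISOR_RW);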

From patchwork Fri Jul 26 15:21:59 2024
From: Roger Pau Monne
To: xen-devel@lists.xenproject.org
Cc: alejandro.vallejo@cloud.com, Roger Pau Monne, Jan Beulich, Andrew Cooper
Subject: [PATCH 15/22] x86/idle: allow using a per-pCPU L4
Date: Fri, 26 Jul 2024 17:21:59 +0200
Message-ID: <20240726152206.28411-16-roger.pau@citrix.com>
In-Reply-To: <20240726152206.28411-1-roger.pau@citrix.com>
References: <20240726152206.28411-1-roger.pau@citrix.com>
Introduce support for possibly using a different L4 across the idle
vCPUs. This change only introduces support for loading a per-pCPU idle
L4, but even with the per-CPU idle page-table enabled it should still
be a clone of idle_pg_table, hence no functional change expected.

Note the idle L4 is not changed after Xen has reached the
SYS_STATE_smp_boot state, hence there is no need to synchronize the
contents of the L4 once the CPUs are started.

Using a per-CPU idle page-table is not strictly required for the
Address Space Isolation work, as idle page tables are never used when
running guests. However it simplifies memory management of the per-CPU
mappings, as creating per-CPU mappings only requires using the idle
page-table of the CPU where the mappings should be created.

Signed-off-by: Roger Pau Monné
---
 xen/arch/x86/boot/x86_64.S       | 11 +++++++++++
 xen/arch/x86/domain.c            | 20 +++++++++++++++++++-
 xen/arch/x86/domain_page.c       |  2 +-
 xen/arch/x86/include/asm/setup.h |  1 +
 xen/arch/x86/setup.c             |  3 +++
 xen/arch/x86/smpboot.c           |  7 +++++++
 6 files changed, 42 insertions(+), 2 deletions(-)
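Condensed from the domain.c hunk below, for illustration: only secondary idle vCPUs get a private clone of idle_pg_table, and only when ASI is enabled, so the BSP keeps running on idle_pg_table itself:

    if ( (opt_asi_pv || opt_asi_hvm) && v->vcpu_id )
    {
        root_pgentry_t *pgt = alloc_xenheap_page();

        if ( !pgt )
            return -ENOMEM;
        copy_page(pgt, idle_pg_table);      /* starts as an exact clone */
        v->arch.cr3 = __pa(pgt);
    }
    else
        v->arch.cr3 = __pa(idle_pg_table);  /* BSP idle vCPU, or ASI off */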
diff --git a/xen/arch/x86/boot/x86_64.S b/xen/arch/x86/boot/x86_64.S
index 04bb62ae8680..af7854820185 100644
--- a/xen/arch/x86/boot/x86_64.S
+++ b/xen/arch/x86/boot/x86_64.S
@@ -15,6 +15,17 @@ ENTRY(__high_start)
         mov     $XEN_MINIMAL_CR4,%rcx
         mov     %rcx,%cr4
 
+        /*
+         * Possibly switch to the per-CPU idle page-tables. Note we cannot
+         * switch earlier as the per-CPU page-tables might be above 4G, and
+         * hence need to load them from 64bit code.
+         */
+        mov     ap_cr3(%rip), %rax
+        test    %rax, %rax
+        jz      .L_skip_cr3
+        mov     %rax, %cr3
+.L_skip_cr3:
+
         mov     stack_start(%rip),%rsp
 
         /* Reset EFLAGS (subsumes CLI and CLD). */
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 9cfcf0dc63f3..b62c4311da6c 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -555,6 +555,7 @@ void arch_vcpu_regs_init(struct vcpu *v)
 int arch_vcpu_create(struct vcpu *v)
 {
     struct domain *d = v->domain;
+    root_pgentry_t *pgt = NULL;
     int rc;
 
     v->arch.flags = TF_kernel_mode;
@@ -589,7 +590,23 @@ int arch_vcpu_create(struct vcpu *v)
     else
     {
         /* Idle domain */
-        v->arch.cr3 = __pa(idle_pg_table);
+        if ( (opt_asi_pv || opt_asi_hvm) && v->vcpu_id )
+        {
+            pgt = alloc_xenheap_page();
+
+            /*
+             * For the idle vCPU 0 (the BSP idle vCPU) use idle_pg_table
+             * directly, there's no need to create yet another copy.
+             */
+            rc = -ENOMEM;
+            if ( !pgt )
+                goto fail;
+
+            copy_page(pgt, idle_pg_table);
+            v->arch.cr3 = __pa(pgt);
+        }
+        else
+            v->arch.cr3 = __pa(idle_pg_table);
         rc = 0;
         v->arch.msrs = ZERO_BLOCK_PTR; /* Catch stray misuses */
     }
@@ -611,6 +628,7 @@ int arch_vcpu_create(struct vcpu *v)
     vcpu_destroy_fpu(v);
     xfree(v->arch.msrs);
     v->arch.msrs = NULL;
+    free_xenheap_page(pgt);
 
     return rc;
 }
diff --git a/xen/arch/x86/domain_page.c b/xen/arch/x86/domain_page.c
index eac5e3304fb8..99b78af90fd3 100644
--- a/xen/arch/x86/domain_page.c
+++ b/xen/arch/x86/domain_page.c
@@ -51,7 +51,7 @@ static inline struct vcpu *mapcache_current_vcpu(void)
         if ( (v = idle_vcpu[smp_processor_id()]) == current )
             sync_local_execstate();
         /* We must now be running on the idle page table. */
-        ASSERT(cr3_pa(read_cr3()) == __pa(idle_pg_table));
+        ASSERT(cr3_pa(read_cr3()) == cr3_pa(v->arch.cr3));
     }
 
     return v;
diff --git a/xen/arch/x86/include/asm/setup.h b/xen/arch/x86/include/asm/setup.h
index d75589178b91..a8452fce8f05 100644
--- a/xen/arch/x86/include/asm/setup.h
+++ b/xen/arch/x86/include/asm/setup.h
@@ -14,6 +14,7 @@ extern unsigned long xenheap_initial_phys_start;
 extern uint64_t boot_tsc_stamp;
 
 extern void *stack_start;
+extern unsigned long ap_cr3;
 
 void early_cpu_init(bool verbose);
 void early_time_init(void);
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index bc387d96b519..c5a13b30daf4 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -158,6 +158,9 @@ char asmlinkage __section(".init.bss.stack_aligned") __aligned(STACK_SIZE)
 /* Used by the BSP/AP paths to find the higher half stack mapping to use. */
 void *stack_start = cpu0_stack + STACK_SIZE - sizeof(struct cpu_info);
 
+/* cr3 value for the AP to load on boot. */
+unsigned long ap_cr3;
+
 /* Used by the boot asm to stash the relocated multiboot info pointer. */
 unsigned int asmlinkage __initdata multiboot_ptr;
 
diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index 8aa621533f3d..e07add36b1b6 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -581,6 +581,13 @@ static int do_boot_cpu(int apicid, int cpu)
 
     stack_start = stack_base[cpu] + STACK_SIZE - sizeof(struct cpu_info);
 
+    /*
+     * If per-CPU idle root page table has been allocated, switch to it as
+     * part of the AP bringup trampoline.
+     */
+    ap_cr3 = idle_vcpu[cpu]->arch.cr3 != __pa(idle_pg_table) ?
+             idle_vcpu[cpu]->arch.cr3 : 0;
+
     /* This grunge runs the startup process for the targeted processor. */
     set_cpu_state(CPU_STATE_INIT);
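A note on the AP bringup path above: ap_cr3 is only consumed from __high_start once the AP executes 64-bit code, because the cloned L4 may live above 4GiB; publishing 0 tells the AP to keep the cr3 the trampoline already loaded. Condensed for illustration:

    /* What do_boot_cpu() publishes for the AP (see the smpboot.c hunk). */
    ap_cr3 = idle_vcpu[cpu]->arch.cr3 != __pa(idle_pg_table)
             ? idle_vcpu[cpu]->arch.cr3  /* per-CPU clone: switch in 64-bit code */
             : 0;                        /* 0 => stay on idle_pg_table */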

From patchwork Fri Jul 26 15:22:00 2024
From: Roger Pau Monne
To: xen-devel@lists.xenproject.org
Cc: alejandro.vallejo@cloud.com, Roger Pau Monne, Jan Beulich, Andrew Cooper
Subject: [PATCH 16/22] x86/mm: introduce a per-CPU L3 table for the per-domain slot
Date: Fri, 26 Jul 2024 17:22:00 +0200
Message-ID: <20240726152206.28411-17-roger.pau@citrix.com>
In-Reply-To: <20240726152206.28411-1-roger.pau@citrix.com>
References: <20240726152206.28411-1-roger.pau@citrix.com>
So far L4 slot 260 has always been per-domain, in other words: all
vCPUs of a domain share the same L3 entry. Currently only 3 slots are
used in that L3 table, which leaves plenty of room.

Introduce a per-CPU L3 that's used when the domain has Address Space
Isolation enabled. Such per-CPU L3 gets currently populated using the
same L3 entries present on the per-domain L3
(d->arch.perdomain_l3_pg).

No functional change expected, as the per-CPU L3 is always a copy of
the contents of d->arch.perdomain_l3_pg.

Note that all the per-domain L3 entries are populated at domain
creation, and hence there's no need to sync the state of the per-CPU
L3 as the domain won't yet be running when the L3 is modified.

Signed-off-by: Roger Pau Monné
---
 xen/arch/x86/include/asm/domain.h |  2 +
 xen/arch/x86/include/asm/mm.h     |  4 ++
 xen/arch/x86/mm.c                 | 80 +++++++++++++++++++++++++++++--
 xen/arch/x86/setup.c              |  8 ++++
 xen/arch/x86/smpboot.c            |  4 ++
 5 files changed, 95 insertions(+), 3 deletions(-)
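Condensed from populate_perdomain() in the mm.c hunk below, for illustration: the CPU-local L3 mirrors the domain's cached per-domain L2 pages, and the root's per-domain slot is then pointed at that local L3 instead of perdomain_l3_pg:

    for ( i = 0; i < ARRAY_SIZE(d->arch.perdomain_l2_pgs); i++ )
    {
        const struct page_info *pg = d->arch.perdomain_l2_pgs[i];

        l3e_write(&l3[i], pg ? l3e_from_page(pg, __PAGE_HYPERVISOR_RW)
                             : l3e_empty());
    }
    l4e_write(&l4[l4_table_offset(PERDOMAIN_VIRT_START)],
              l4e_from_mfn(virt_to_mfn(l3), __PAGE_HYPERVISOR_RW));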
diff --git a/xen/arch/x86/include/asm/domain.h b/xen/arch/x86/include/asm/domain.h
index 8c366be8c75f..7620a352b9e3 100644
--- a/xen/arch/x86/include/asm/domain.h
+++ b/xen/arch/x86/include/asm/domain.h
@@ -313,6 +313,8 @@ struct arch_domain
 
     struct page_info *perdomain_l3_pg;
 
+    struct page_info *perdomain_l2_pgs[PERDOMAIN_SLOTS];
+
 #ifdef CONFIG_PV32
     unsigned int hv_compat_vstart;
 #endif
diff --git a/xen/arch/x86/include/asm/mm.h b/xen/arch/x86/include/asm/mm.h
index 2c309f7b1444..34407fb0af06 100644
--- a/xen/arch/x86/include/asm/mm.h
+++ b/xen/arch/x86/include/asm/mm.h
@@ -633,4 +633,8 @@ static inline bool arch_mfns_in_directmap(unsigned long mfn, unsigned long nr)
 /* Setup the per-domain slot in the root page table pointer. */
 void setup_perdomain_slot(const struct vcpu *v, root_pgentry_t *root_pgt);
 
+/* Allocate a per-CPU local L3 table to use in the per-domain slot. */
+int allocate_perdomain_local_l3(unsigned int cpu);
+void free_perdomain_local_l3(unsigned int cpu);
+
 #endif /* __ASM_X86_MM_H__ */
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 13aa15f4db22..1367f3361ffe 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -6079,6 +6079,12 @@ int create_perdomain_mapping(struct domain *d, unsigned long va,
         l2tab = __map_domain_page(pg);
         clear_page(l2tab);
         l3tab[l3_table_offset(va)] = l3e_from_page(pg, __PAGE_HYPERVISOR_RW);
+        /*
+         * Keep a reference to the per-domain L3 entries in case a per-CPU L3
+         * is in use (as opposed to using perdomain_l3_pg).
+         */
+        ASSERT(!d->creation_finished);
+        d->arch.perdomain_l2_pgs[l3_table_offset(va)] = pg;
     }
     else
         l2tab = map_l2t_from_l3e(l3tab[l3_table_offset(va)]);
@@ -6368,11 +6374,79 @@ unsigned long get_upper_mfn_bound(void)
     return min(max_mfn, 1UL << (paddr_bits - PAGE_SHIFT)) - 1;
 }
 
+static DEFINE_PER_CPU(l3_pgentry_t *, local_l3);
+
+static void populate_perdomain(const struct domain *d, l4_pgentry_t *l4,
+                               l3_pgentry_t *l3)
+{
+    unsigned int i;
+
+    /* Populate the per-CPU L3 with the per-domain entries. */
+    for ( i = 0; i < ARRAY_SIZE(d->arch.perdomain_l2_pgs); i++ )
+    {
+        const struct page_info *pg = d->arch.perdomain_l2_pgs[i];
+
+        BUILD_BUG_ON(ARRAY_SIZE(d->arch.perdomain_l2_pgs) >
+                     L3_PAGETABLE_ENTRIES);
+        l3e_write(&l3[i], pg ? l3e_from_page(pg, __PAGE_HYPERVISOR_RW)
+                             : l3e_empty());
+    }
+
+    l4e_write(&l4[l4_table_offset(PERDOMAIN_VIRT_START)],
+              l4e_from_mfn(virt_to_mfn(l3), __PAGE_HYPERVISOR_RW));
+}
+
+int allocate_perdomain_local_l3(unsigned int cpu)
+{
+    const struct domain *d = idle_vcpu[cpu]->domain;
+    l3_pgentry_t *l3;
+    root_pgentry_t *root_pgt = maddr_to_virt(idle_vcpu[cpu]->arch.cr3);
+
+    ASSERT(!per_cpu(local_l3, cpu));
+
+    if ( !opt_asi_pv && !opt_asi_hvm )
+        return 0;
+
+    l3 = alloc_xenheap_page();
+    if ( !l3 )
+        return -ENOMEM;
+
+    clear_page(l3);
+
+    /* Setup the idle domain slots (current domain) in the L3. */
+    populate_perdomain(d, root_pgt, l3);
+
+    per_cpu(local_l3, cpu) = l3;
+
+    return 0;
+}
+
+void free_perdomain_local_l3(unsigned int cpu)
+{
+    l3_pgentry_t *l3 = per_cpu(local_l3, cpu);
+
+    if ( !l3 )
+        return;
+
+    per_cpu(local_l3, cpu) = NULL;
+    free_xenheap_page(l3);
+}
+
 void setup_perdomain_slot(const struct vcpu *v, root_pgentry_t *root_pgt)
 {
-    l4e_write(&root_pgt[root_table_offset(PERDOMAIN_VIRT_START)],
-              l4e_from_page(v->domain->arch.perdomain_l3_pg,
-                            __PAGE_HYPERVISOR_RW));
+    const struct domain *d = v->domain;
+
+    if ( d->arch.asi )
+    {
+        l3_pgentry_t *l3 = this_cpu(local_l3);
+
+        ASSERT(l3);
+        populate_perdomain(d, root_pgt, l3);
+    }
+    else if ( is_hvm_domain(d) || d->arch.pv.xpti )
+        l4e_write(&root_pgt[root_table_offset(PERDOMAIN_VIRT_START)],
+                  l4e_from_page(v->domain->arch.perdomain_l3_pg,
+                                __PAGE_HYPERVISOR_RW));
 
     if ( !is_pv_64bit_vcpu(v) )
         /*
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index c5a13b30daf4..5bf81b81b46f 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -1961,6 +1961,14 @@ void asmlinkage __init noreturn __start_xen(unsigned long mbi_p)
 
     alternative_branches();
 
+    /*
+     * Setup the local per-domain L3 for the BSP also, so it matches the state
+     * of the APs.
+     */
+    ret = allocate_perdomain_local_l3(0);
+    if ( ret )
+        panic("Error %d setting up local per-domain L3\n", ret);
+
     /*
      * NB: when running as a PV shim VCPUOP_up/down is wired to the shim
      * physical cpu_add/remove functions, so launch the guest with only
diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index e07add36b1b6..40cc14799252 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -986,6 +986,7 @@ static void cpu_smpboot_free(unsigned int cpu, bool remove)
     }
 
     cleanup_cpu_root_pgt(cpu);
+    free_perdomain_local_l3(cpu);
 
     if ( per_cpu(stubs.addr, cpu) )
     {
@@ -1100,6 +1101,9 @@ static int cpu_smpboot_alloc(unsigned int cpu)
         per_cpu(stubs.addr, cpu) = stub_page + STUB_BUF_CPU_OFFS(cpu);
 
     rc = setup_cpu_root_pgt(cpu);
+    if ( rc )
+        goto out;
+    rc = allocate_perdomain_local_l3(cpu);
     if ( rc )
         goto out;
 
     rc = -ENOMEM;

From patchwork Fri Jul 26 15:22:01 2024
From: Roger Pau Monne
To: xen-devel@lists.xenproject.org
Cc: alejandro.vallejo@cloud.com, Roger Pau Monne, Jan Beulich, Andrew Cooper
Subject: [PATCH 17/22] x86/mm: introduce support to populate a per-CPU page-table region
Date: Fri, 26 Jul 2024 17:22:01 +0200
Message-ID: <20240726152206.28411-18-roger.pau@citrix.com>
In-Reply-To: <20240726152206.28411-1-roger.pau@citrix.com>
References: <20240726152206.28411-1-roger.pau@citrix.com>
Add logic in map_pages_to_xen() and modify_xen_mappings() so that TLB
flushes are only performed locally when dealing with entries in the
per-CPU area of the page-tables.

No functional change intended, as there are no callers added that
create or modify per-CPU mappings, nor is the per-CPU area properly
set up in the page-tables yet.

Note that the removed flush_area() ended up calling flush_area_mask()
through the flush_area_all() alias.

Signed-off-by: Roger Pau Monné
---
 xen/arch/x86/include/asm/config.h   |  4 ++
 xen/arch/x86/include/asm/flushtlb.h |  1 -
 xen/arch/x86/mm.c                   | 64 +++++++++++++++++++----------
 3 files changed, 47 insertions(+), 22 deletions(-)
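The core decision added below, extracted for illustration: a range is "global" unless it lies inside the per-CPU window, and only global ranges need the pgdir lock and a system-wide TLB flush, since per-CPU entries exist only in the local CPU's tables:

    bool global = virt < PERCPU_VIRT_START ||
                  virt >= PERCPU_VIRT_SLOT(PERCPU_SLOTS);
    bool locking = system_state > SYS_STATE_boot && global;
    const cpumask_t *flush_mask = global ? &cpu_online_map
                                         : cpumask_of(smp_processor_id());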
diff --git a/xen/arch/x86/include/asm/config.h b/xen/arch/x86/include/asm/config.h
index 2a260a2581fd..c24d735a0cee 100644
--- a/xen/arch/x86/include/asm/config.h
+++ b/xen/arch/x86/include/asm/config.h
@@ -204,6 +204,10 @@ extern unsigned char boot_edid_info[128];
 #define PERDOMAIN_SLOTS         3
 #define PERDOMAIN_VIRT_SLOT(s)  (PERDOMAIN_VIRT_START + (s) * \
                                  (PERDOMAIN_SLOT_MBYTES << 20))
+#define PERCPU_VIRT_START       PERDOMAIN_VIRT_SLOT(PERDOMAIN_SLOTS)
+#define PERCPU_SLOTS            1
+#define PERCPU_VIRT_SLOT(s)     (PERCPU_VIRT_START + (s) * \
+                                 (PERDOMAIN_SLOT_MBYTES << 20))
 /* Slot 4: mirror of per-domain mappings (for compat xlat area accesses). */
 #define PERDOMAIN_ALT_VIRT_START  PML4_ADDR(4)
 /* Slot 261: machine-to-phys conversion table (256GB). */
diff --git a/xen/arch/x86/include/asm/flushtlb.h b/xen/arch/x86/include/asm/flushtlb.h
index 1b98d03decdc..affe944d1a5b 100644
--- a/xen/arch/x86/include/asm/flushtlb.h
+++ b/xen/arch/x86/include/asm/flushtlb.h
@@ -146,7 +146,6 @@ void flush_area_mask(const cpumask_t *mask, const void *va,
 #define flush_mask(mask, flags) flush_area_mask(mask, NULL, flags)
 
 /* Flush all CPUs' TLBs/caches */
-#define flush_area_all(va, flags) flush_area_mask(&cpu_online_map, va, flags)
 #define flush_all(flags) flush_mask(&cpu_online_map, flags)
 
 /* Flush local TLBs */
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 1367f3361ffe..c468b46a9d1b 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -5023,9 +5023,13 @@ static DEFINE_SPINLOCK(map_pgdir_lock);
  */
 static l3_pgentry_t *virt_to_xen_l3e(unsigned long v)
 {
+    unsigned int cpu = smp_processor_id();
+    /* Called before idle_vcpu is populated, fallback to idle_pg_table. */
+    root_pgentry_t *root_pgt = idle_vcpu[cpu] ?
+        maddr_to_virt(idle_vcpu[cpu]->arch.cr3) : idle_pg_table;
     l4_pgentry_t *pl4e;
 
-    pl4e = &idle_pg_table[l4_table_offset(v)];
+    pl4e = &root_pgt[l4_table_offset(v)];
     if ( !(l4e_get_flags(*pl4e) & _PAGE_PRESENT) )
     {
         bool locking = system_state > SYS_STATE_boot;
@@ -5138,8 +5142,8 @@ static l1_pgentry_t *virt_to_xen_l1e(unsigned long v)
 #define l1f_to_lNf(f) (((f) & _PAGE_PRESENT) ? ((f) |  _PAGE_PSE) : (f))
 #define lNf_to_l1f(f) (((f) & _PAGE_PRESENT) ? ((f) & ~_PAGE_PSE) : (f))
 
-/* flush_area_all() can be used prior to any other CPU being online.  */
-#define flush_area(v, f) flush_area_all((const void *)(v), f)
+/* flush_area_mask() can be used prior to any other CPU being online.  */
+#define flush_area_mask(m, v, f) flush_area_mask(m, (const void *)(v), f)
 
 #define L3T_INIT(page) (page) = ZERO_BLOCK_PTR
 
@@ -5222,7 +5226,11 @@ int map_pages_to_xen(
     unsigned long nr_mfns,
     unsigned int flags)
 {
-    bool locking = system_state > SYS_STATE_boot;
+    bool global = virt < PERCPU_VIRT_START ||
+                  virt >= PERCPU_VIRT_SLOT(PERCPU_SLOTS);
+    bool locking = system_state > SYS_STATE_boot && global;
+    const cpumask_t *flush_mask = global ? &cpu_online_map
+                                         : cpumask_of(smp_processor_id());
     l3_pgentry_t *pl3e = NULL, ol3e;
     l2_pgentry_t *pl2e = NULL, ol2e;
     l1_pgentry_t *pl1e, ol1e;
@@ -5244,6 +5252,11 @@ int map_pages_to_xen(
     }                                          \
 } while (0)
 
+    /* Ensure it's a global mapping or it's only modifying the per-CPU area. */
+    ASSERT(global ||
+           (virt + nr_mfns * PAGE_SIZE >= PERCPU_VIRT_START &&
+            virt + nr_mfns * PAGE_SIZE < PERCPU_VIRT_SLOT(PERCPU_SLOTS)));
+
     L3T_INIT(current_l3page);
 
     while ( nr_mfns != 0 )
@@ -5278,7 +5291,7 @@ int map_pages_to_xen(
             if ( l3e_get_flags(ol3e) & _PAGE_PSE )
             {
                 flush_flags(lNf_to_l1f(l3e_get_flags(ol3e)));
-                flush_area(virt, flush_flags);
+                flush_area_mask(flush_mask, virt, flush_flags);
             }
             else
             {
@@ -5301,7 +5314,7 @@ int map_pages_to_xen(
                         unmap_domain_page(l1t);
                     }
                 }
-                flush_area(virt, flush_flags);
+                flush_area_mask(flush_mask, virt, flush_flags);
                 for ( i = 0; i < L2_PAGETABLE_ENTRIES; i++ )
                 {
                     ol2e = l2t[i];
@@ -5373,7 +5386,7 @@ int map_pages_to_xen(
             }
             if ( locking )
                 spin_unlock(&map_pgdir_lock);
-            flush_area(virt, flush_flags);
+            flush_area_mask(flush_mask, virt, flush_flags);
 
             free_xen_pagetable(l2mfn);
         }
@@ -5399,7 +5412,7 @@ int map_pages_to_xen(
                 if ( l2e_get_flags(ol2e) & _PAGE_PSE )
                 {
                     flush_flags(lNf_to_l1f(l2e_get_flags(ol2e)));
-                    flush_area(virt, flush_flags);
+                    flush_area_mask(flush_mask, virt, flush_flags);
                 }
                 else
                 {
@@ -5407,7 +5420,7 @@ int map_pages_to_xen(
 
                     for ( i = 0; i < L1_PAGETABLE_ENTRIES; i++ )
                         flush_flags(l1e_get_flags(l1t[i]));
-                    flush_area(virt, flush_flags);
+                    flush_area_mask(flush_mask, virt, flush_flags);
                     unmap_domain_page(l1t);
                     free_xen_pagetable(l2e_get_mfn(ol2e));
                 }
@@ -5476,7 +5489,7 @@ int map_pages_to_xen(
                 }
                 if ( locking )
                     spin_unlock(&map_pgdir_lock);
-                flush_area(virt, flush_flags);
+                flush_area_mask(flush_mask, virt, flush_flags);
 
                 free_xen_pagetable(l1mfn);
             }
@@ -5491,7 +5504,7 @@ int map_pages_to_xen(
                 unsigned int flush_flags = FLUSH_TLB | FLUSH_ORDER(0);
 
                 flush_flags(l1e_get_flags(ol1e));
-                flush_area(virt, flush_flags);
+                flush_area_mask(flush_mask, virt, flush_flags);
             }
 
             virt += 1UL << L1_PAGETABLE_SHIFT;
@@ -5540,9 +5553,9 @@ int map_pages_to_xen(
                 l2e_write(pl2e, l2e_from_pfn(base_mfn, l1f_to_lNf(flags)));
                 if ( locking )
                     spin_unlock(&map_pgdir_lock);
-                flush_area(virt - PAGE_SIZE,
-                           FLUSH_TLB_GLOBAL |
-                           FLUSH_ORDER(PAGETABLE_ORDER));
+                flush_area_mask(flush_mask, virt - PAGE_SIZE,
+                                FLUSH_TLB_GLOBAL |
+                                FLUSH_ORDER(PAGETABLE_ORDER));
                 free_xen_pagetable(l2e_get_mfn(ol2e));
             }
             else if ( locking )
@@ -5589,9 +5602,9 @@ int map_pages_to_xen(
                 l3e_write(pl3e, l3e_from_pfn(base_mfn, l1f_to_lNf(flags)));
                 if ( locking )
                     spin_unlock(&map_pgdir_lock);
-                flush_area(virt - PAGE_SIZE,
-                           FLUSH_TLB_GLOBAL |
-                           FLUSH_ORDER(2*PAGETABLE_ORDER));
+                flush_area_mask(flush_mask, virt - PAGE_SIZE,
+                                FLUSH_TLB_GLOBAL |
+                                FLUSH_ORDER(2*PAGETABLE_ORDER));
                 free_xen_pagetable(l3e_get_mfn(ol3e));
             }
             else if ( locking )
@@ -5629,7 +5642,11 @@ int __init populate_pt_range(unsigned long virt, unsigned long nr_mfns)
  */
 int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
 {
-    bool locking = system_state > SYS_STATE_boot;
+    bool global = s < PERCPU_VIRT_START ||
+                  s >= PERCPU_VIRT_SLOT(PERCPU_SLOTS);
+    bool locking = system_state > SYS_STATE_boot && global;
+    const cpumask_t *flush_mask = global ? &cpu_online_map
+                                         : cpumask_of(smp_processor_id());
     l3_pgentry_t *pl3e = NULL;
     l2_pgentry_t *pl2e = NULL;
     l1_pgentry_t *pl1e;
@@ -5638,6 +5655,9 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
     int rc = -ENOMEM;
     struct page_info *current_l3page;
 
+    ASSERT(global ||
+           (e >= PERCPU_VIRT_START && e < PERCPU_VIRT_SLOT(PERCPU_SLOTS)));
+
     /* Set of valid PTE bits which may be altered. */
 #define FLAGS_MASK (_PAGE_NX|_PAGE_DIRTY|_PAGE_ACCESSED|_PAGE_RW|_PAGE_PRESENT)
     nf &= FLAGS_MASK;
 
@@ -5836,7 +5856,8 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
                 l2e_write(pl2e, l2e_empty());
                 if ( locking )
                     spin_unlock(&map_pgdir_lock);
-                flush_area(NULL, FLUSH_TLB_GLOBAL); /* flush before free */
+                /* flush before free */
+                flush_area_mask(flush_mask, NULL, FLUSH_TLB_GLOBAL);
                 free_xen_pagetable(l1mfn);
             }
             else if ( locking )
@@ -5880,7 +5901,8 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
                 l3e_write(pl3e, l3e_empty());
                 if ( locking )
                     spin_unlock(&map_pgdir_lock);
-                flush_area(NULL, FLUSH_TLB_GLOBAL); /* flush before free */
+                /* flush before free */
+                flush_area_mask(flush_mask, NULL, FLUSH_TLB_GLOBAL);
                 free_xen_pagetable(l2mfn);
             }
             else if ( locking )
@@ -5888,7 +5910,7 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
         }
     }
 
-    flush_area(NULL, FLUSH_TLB_GLOBAL);
+    flush_area_mask(flush_mask, NULL, FLUSH_TLB_GLOBAL);
 
 #undef FLAGS_MASK
     rc = 0;

From patchwork Fri Jul 26 15:22:02 2024
From: Roger Pau Monne
To: xen-devel@lists.xenproject.org
Cc: alejandro.vallejo@cloud.com, Roger Pau Monne, Jan Beulich, Andrew Cooper
Subject: [PATCH 18/22] x86/mm: allow modifying per-CPU entries of remote page-tables
Date: Fri, 26 Jul 2024 17:22:02 +0200
Message-ID: <20240726152206.28411-19-roger.pau@citrix.com>
In-Reply-To: <20240726152206.28411-1-roger.pau@citrix.com>
References: <20240726152206.28411-1-roger.pau@citrix.com>
Add support for modifying the per-CPU page-table entries of remote
CPUs; this will be required in order to set up the page-tables of CPUs
before bringing them up.

A restriction is added so that remote page-tables can only be modified
as long as the remote CPU is not yet online.

No functional change, as there's no user introduced that modifies
remote page-tables.

Signed-off-by: Roger Pau Monné
---
Can be merged with previous patch?
---
 xen/arch/x86/include/asm/mm.h | 15 ++++++++++
 xen/arch/x86/mm.c             | 55 ++++++++++++++++++++++++++---------
 2 files changed, 56 insertions(+), 14 deletions(-)
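The safety argument, condensed for illustration: both new *_cpu() variants assert that a remote CPU's tables are only modified while that CPU is offline, so no IPIs are needed and the local flush mask remains correct:

    /* Only allow modifying remote page-tables if the CPU is not online. */
    ASSERT(cpu == smp_processor_id() || !cpu_online(cpu));

The existing map_pages_to_xen() and modify_xen_mappings() then become thin wrappers passing smp_processor_id(), as the diff below shows.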
diff --git a/xen/arch/x86/include/asm/mm.h b/xen/arch/x86/include/asm/mm.h
index 34407fb0af06..f883468b1a7c 100644
--- a/xen/arch/x86/include/asm/mm.h
+++ b/xen/arch/x86/include/asm/mm.h
@@ -637,4 +637,19 @@ void setup_perdomain_slot(const struct vcpu *v, root_pgentry_t *root_pgt);
 int allocate_perdomain_local_l3(unsigned int cpu);
 void free_perdomain_local_l3(unsigned int cpu);
 
+/* Specify the CPU idle root page-table to use for modifications. */
+int map_pages_to_xen_cpu(
+    unsigned long virt,
+    mfn_t mfn,
+    unsigned long nr_mfns,
+    unsigned int flags,
+    unsigned int cpu);
+int modify_xen_mappings_cpu(unsigned long s, unsigned long e, unsigned int nf,
+                            unsigned int cpu);
+static inline int destroy_xen_mappings_cpu(unsigned long s, unsigned long e,
+                                           unsigned int cpu)
+{
+    return modify_xen_mappings_cpu(s, e, _PAGE_NONE, cpu);
+}
+
 #endif /* __ASM_X86_MM_H__ */
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index c468b46a9d1b..faf2d42745d1 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -5021,9 +5021,8 @@ static DEFINE_SPINLOCK(map_pgdir_lock);
  * For virt_to_xen_lXe() functions, they take a linear address and return a
 * pointer to Xen's LX entry. Caller needs to unmap the pointer.
 */
-static l3_pgentry_t *virt_to_xen_l3e(unsigned long v)
+static l3_pgentry_t *virt_to_xen_l3e_cpu(unsigned long v, unsigned int cpu)
 {
-    unsigned int cpu = smp_processor_id();
     /* Called before idle_vcpu is populated, fallback to idle_pg_table. */
     root_pgentry_t *root_pgt = idle_vcpu[cpu] ?
         maddr_to_virt(idle_vcpu[cpu]->arch.cr3) : idle_pg_table;
@@ -5062,11 +5061,16 @@ static l3_pgentry_t *virt_to_xen_l3e(unsigned long v)
     return map_l3t_from_l4e(*pl4e) + l3_table_offset(v);
 }
 
-static l2_pgentry_t *virt_to_xen_l2e(unsigned long v)
+static l3_pgentry_t *virt_to_xen_l3e(unsigned long v)
+{
+    return virt_to_xen_l3e_cpu(v, smp_processor_id());
+}
+
+static l2_pgentry_t *virt_to_xen_l2e_cpu(unsigned long v, unsigned int cpu)
 {
     l3_pgentry_t *pl3e, l3e;
 
-    pl3e = virt_to_xen_l3e(v);
+    pl3e = virt_to_xen_l3e_cpu(v, cpu);
     if ( !pl3e )
         return NULL;
 
@@ -5100,11 +5104,11 @@ static l2_pgentry_t *virt_to_xen_l2e(unsigned long v)
     return map_l2t_from_l3e(l3e) + l2_table_offset(v);
 }
 
-static l1_pgentry_t *virt_to_xen_l1e(unsigned long v)
+static l1_pgentry_t *virt_to_xen_l1e_cpu(unsigned long v, unsigned int cpu)
 {
     l2_pgentry_t *pl2e, l2e;
 
-    pl2e = virt_to_xen_l2e(v);
+    pl2e = virt_to_xen_l2e_cpu(v, cpu);
     if ( !pl2e )
         return NULL;
 
@@ -5220,17 +5224,18 @@ mfn_t xen_map_to_mfn(unsigned long va)
     return ret;
 }
 
-int map_pages_to_xen(
+int map_pages_to_xen_cpu(
     unsigned long virt,
     mfn_t mfn,
     unsigned long nr_mfns,
-    unsigned int flags)
+    unsigned int flags,
+    unsigned int cpu)
 {
     bool global = virt < PERCPU_VIRT_START ||
                   virt >= PERCPU_VIRT_SLOT(PERCPU_SLOTS);
     bool locking = system_state > SYS_STATE_boot && global;
     const cpumask_t *flush_mask = global ? &cpu_online_map
-                                         : cpumask_of(smp_processor_id());
+                                         : cpumask_of(cpu);
     l3_pgentry_t *pl3e = NULL, ol3e;
     l2_pgentry_t *pl2e = NULL, ol2e;
     l1_pgentry_t *pl1e, ol1e;
@@ -5257,6 +5262,9 @@ int map_pages_to_xen(
            (virt + nr_mfns * PAGE_SIZE >= PERCPU_VIRT_START &&
             virt + nr_mfns * PAGE_SIZE < PERCPU_VIRT_SLOT(PERCPU_SLOTS)));
 
+    /* Only allow modifying remote page-tables if the CPU is not online. */
+    ASSERT(cpu == smp_processor_id() || !cpu_online(cpu));
+
     L3T_INIT(current_l3page);
 
     while ( nr_mfns != 0 )
@@ -5266,7 +5274,7 @@ int map_pages_to_xen(
         UNMAP_DOMAIN_PAGE(pl3e);
         UNMAP_DOMAIN_PAGE(pl2e);
 
-        pl3e = virt_to_xen_l3e(virt);
+        pl3e = virt_to_xen_l3e_cpu(virt, cpu);
         if ( !pl3e )
             goto out;
 
@@ -5391,7 +5399,7 @@ int map_pages_to_xen(
             free_xen_pagetable(l2mfn);
         }
 
-        pl2e = virt_to_xen_l2e(virt);
+        pl2e = virt_to_xen_l2e_cpu(virt, cpu);
         if ( !pl2e )
             goto out;
 
@@ -5437,7 +5445,7 @@ int map_pages_to_xen(
             /* Normal page mapping. */
             if ( !(l2e_get_flags(*pl2e) & _PAGE_PRESENT) )
             {
-                pl1e = virt_to_xen_l1e(virt);
+                pl1e = virt_to_xen_l1e_cpu(virt, cpu);
                 if ( pl1e == NULL )
                     goto out;
             }
@@ -5623,6 +5631,16 @@ int map_pages_to_xen(
     return rc;
 }
 
+int map_pages_to_xen(
+    unsigned long virt,
+    mfn_t mfn,
+    unsigned long nr_mfns,
+    unsigned int flags)
+{
+    return map_pages_to_xen_cpu(virt, mfn, nr_mfns, flags, smp_processor_id());
+}
+
+
 int __init populate_pt_range(unsigned long virt, unsigned long nr_mfns)
 {
     return map_pages_to_xen(virt, INVALID_MFN, nr_mfns, MAP_SMALL_PAGES);
@@ -5640,7 +5658,8 @@ int __init populate_pt_range(unsigned long virt, unsigned long nr_mfns)
 *
 * It is an error to call with present flags over an unpopulated range.
 */
-int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
+int modify_xen_mappings_cpu(unsigned long s, unsigned long e, unsigned int nf,
+                            unsigned int cpu)
 {
     bool global = s < PERCPU_VIRT_START ||
                   s >= PERCPU_VIRT_SLOT(PERCPU_SLOTS);
@@ -5658,6 +5677,9 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
     ASSERT(global ||
            (e >= PERCPU_VIRT_START && e < PERCPU_VIRT_SLOT(PERCPU_SLOTS)));
 
+    /* Only allow modifying remote page-tables if the CPU is not online. */
+    ASSERT(cpu == smp_processor_id() || !cpu_online(cpu));
+
     /* Set of valid PTE bits which may be altered. */
 #define FLAGS_MASK (_PAGE_NX|_PAGE_DIRTY|_PAGE_ACCESSED|_PAGE_RW|_PAGE_PRESENT)
     nf &= FLAGS_MASK;
@@ -5674,7 +5696,7 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
         UNMAP_DOMAIN_PAGE(pl2e);
         UNMAP_DOMAIN_PAGE(pl3e);
 
-        pl3e = virt_to_xen_l3e(v);
+        pl3e = virt_to_xen_l3e_cpu(v, cpu);
         if ( !pl3e )
             goto out;
 
@@ -5927,6 +5949,11 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
 
 #undef flush_area
 
+int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
+{
+    return modify_xen_mappings_cpu(s, e, nf, smp_processor_id());
+}
+
 int destroy_xen_mappings(unsigned long s, unsigned long e)
 {
     return modify_xen_mappings(s, e, _PAGE_NONE);

From patchwork Fri Jul 26 15:22:03 2024
From: Roger Pau Monne
To: xen-devel@lists.xenproject.org
Cc: alejandro.vallejo@cloud.com, Roger Pau Monne, Jan Beulich, Andrew Cooper
Subject: [PATCH 19/22] x86/mm: introduce a per-CPU fixmap area
Date: Fri, 26 Jul 2024 17:22:03 +0200
Message-ID: <20240726152206.28411-20-roger.pau@citrix.com>
In-Reply-To: <20240726152206.28411-1-roger.pau@citrix.com>
References: <20240726152206.28411-1-roger.pau@citrix.com>
Introduce the logic to manage a per-CPU fixmap area. This includes
adding a new set of headers that are capable of creating mappings in
the per-CPU page-table regions by making use of map_pages_to_xen_cpu().

This per-CPU fixmap area is currently set to use one L3 slot: 1GiB of
linear address space.

Signed-off-by: Roger Pau Monné
---
 xen/arch/x86/include/asm/fixmap.h | 44 +++++++++++++++++++++++++++++++
 xen/arch/x86/mm.c                 | 16 ++++++++++++-
 2 files changed, 59 insertions(+), 1 deletion(-)
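Hypothetical usage of the new interface (PERCPU_FIX_SCRATCH is an invented slot name for this sketch; the percpu_fixed_addresses enum added below is still empty in this patch):

    /* Sketch only: map an MFN into this CPU's private fixmap, use it, unmap. */
    percpu_set_fixmap(PERCPU_FIX_SCRATCH, mfn, __PAGE_HYPERVISOR_RW);
    memcpy(percpu_fix_to_virt(PERCPU_FIX_SCRATCH), src, PAGE_SIZE);
    percpu_clear_fixmap(PERCPU_FIX_SCRATCH);

The *_remote() variants take an explicit CPU and rely on the restriction from the previous patch that the target CPU is not yet online.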
Signed-off-by: Roger Pau Monné --- xen/arch/x86/include/asm/fixmap.h | 44 +++++++++++++++++++++++++++++++ xen/arch/x86/mm.c | 16 ++++++++++- 2 files changed, 59 insertions(+), 1 deletion(-) diff --git a/xen/arch/x86/include/asm/fixmap.h b/xen/arch/x86/include/asm/fixmap.h index 516ec3fa6c95..a456c65072d8 100644 --- a/xen/arch/x86/include/asm/fixmap.h +++ b/xen/arch/x86/include/asm/fixmap.h @@ -118,6 +118,50 @@ extern void __set_fixmap_x( #define __fix_x_to_virt(x) (FIXADDR_X_TOP - ((x) << PAGE_SHIFT)) #define fix_x_to_virt(x) ((void *)__fix_x_to_virt(x)) +/* per-CPU fixmap area. */ +enum percpu_fixed_addresses { + __end_of_percpu_fixed_addresses +}; + +#define PERCPU_FIXADDR_SIZE (__end_of_percpu_fixed_addresses << PAGE_SHIFT) +#define PERCPU_FIXADDR PERCPU_VIRT_SLOT(0) + +static inline void *percpu_fix_to_virt(enum percpu_fixed_addresses idx) +{ + BUG_ON(idx >=__end_of_percpu_fixed_addresses); + return (void *)PERCPU_FIXADDR + (idx << PAGE_SHIFT); +} + +static inline void percpu_set_fixmap_remote( + unsigned int cpu, enum percpu_fixed_addresses idx, mfn_t mfn, + unsigned long flags) +{ + map_pages_to_xen_cpu((unsigned long)percpu_fix_to_virt(idx), mfn, 1, flags, + cpu); +} + +static inline void percpu_clear_fixmap_remote( + unsigned int cpu, enum percpu_fixed_addresses idx) +{ + /* + * Use map_pages_to_xen_cpu() instead of destroy_xen_mappings_cpu() to + * avoid tearing down the intermediate page-tables if empty. + */ + map_pages_to_xen_cpu((unsigned long)percpu_fix_to_virt(idx), INVALID_MFN, 1, + 0, cpu); +} + +static inline void percpu_set_fixmap(enum percpu_fixed_addresses idx, mfn_t mfn, + unsigned long flags) +{ + percpu_set_fixmap_remote(smp_processor_id(), idx, mfn, flags); +} + +static inline void percpu_clear_fixmap(enum percpu_fixed_addresses idx) +{ + percpu_clear_fixmap_remote(smp_processor_id(), idx); +} + #endif /* __ASSEMBLY__ */ #endif diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c index faf2d42745d1..937089d203cc 100644 --- a/xen/arch/x86/mm.c +++ b/xen/arch/x86/mm.c @@ -6467,7 +6467,17 @@ int allocate_perdomain_local_l3(unsigned int cpu) per_cpu(local_l3, cpu) = l3; - return 0; + /* + * Pre-allocate the page-table structures for the per-cpu fixmap. Some of + * the per-cpu fixmap calls might happen in contexts where memory + * allocation is not possible. + * + * Only one L3 slot is currently reserved for the per-CPU fixmap. 
+ */ + BUILD_BUG_ON(PERCPU_FIXADDR_SIZE > (1 << L3_PAGETABLE_SHIFT)); + return map_pages_to_xen_cpu(PERCPU_VIRT_START, INVALID_MFN, + PFN_DOWN(PERCPU_FIXADDR_SIZE), MAP_SMALL_PAGES, + cpu); } void free_perdomain_local_l3(unsigned int cpu) @@ -6478,6 +6488,10 @@ void free_perdomain_local_l3(unsigned int cpu) return; per_cpu(local_l3, cpu) = NULL; + + destroy_xen_mappings_cpu(PERCPU_VIRT_START, + PERCPU_VIRT_START + PERCPU_FIXADDR_SIZE, cpu); + free_xenheap_page(l3); } From patchwork Fri Jul 26 15:22:04 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Roger_Pau_Monn=C3=A9?= X-Patchwork-Id: 13742916 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 28F84C3DA4A for ; Fri, 26 Jul 2024 15:39:21 +0000 (UTC) Received: from list by lists.xenproject.org with outflank-mailman.765605.1176286 (Exim 4.92) (envelope-from ) id 1sXN2I-0001dg-Od; Fri, 26 Jul 2024 15:39:14 +0000 X-Outflank-Mailman: Message body and most headers restored to incoming version Received: by outflank-mailman (output) from mailman id 765605.1176286; Fri, 26 Jul 2024 15:39:14 +0000 Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1sXN2I-0001dZ-M7; Fri, 26 Jul 2024 15:39:14 +0000 Received: by outflank-mailman (input) for mailman id 765605; Fri, 26 Jul 2024 15:39:13 +0000 Received: from se1-gles-flk1-in.inumbo.com ([94.247.172.50] helo=se1-gles-flk1.inumbo.com) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1sXMvW-00084T-VO for xen-devel@lists.xenproject.org; Fri, 26 Jul 2024 15:32:15 +0000 Received: from mail-qt1-x82e.google.com (mail-qt1-x82e.google.com [2607:f8b0:4864:20::82e]) by se1-gles-flk1.inumbo.com (Halon) with ESMTPS id 3a956b84-4b64-11ef-8776-851b0ebba9a2; Fri, 26 Jul 2024 17:32:13 +0200 (CEST) Received: by mail-qt1-x82e.google.com with SMTP id d75a77b69052e-44fe58fcf29so3847531cf.2 for ; Fri, 26 Jul 2024 08:32:12 -0700 (PDT) Received: from localhost ([213.195.124.163]) by smtp.gmail.com with ESMTPSA id d75a77b69052e-44fe81259cesm14414941cf.6.2024.07.26.08.32.09 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 26 Jul 2024 08:32:09 -0700 (PDT) X-BeenThere: xen-devel@lists.xenproject.org List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xenproject.org Precedence: list Sender: "Xen-devel" X-Inumbo-ID: 3a956b84-4b64-11ef-8776-851b0ebba9a2 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=citrix.com; s=google; t=1722007931; x=1722612731; darn=lists.xenproject.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=D/Si1GObgr7VHk+37CZa7YHHAXjnb60+qTIn0xcWOxk=; b=kvGLK3cnHBOWVPyLL2a5LWNAGH+IdS0PqP3iwoNcbT6QccCG9Vk/FaOHrdLhd3/MKc dSw+1gjN6EFBEeD6JMmsEX8TUtLoAagtbO4akI9YTzxFqawG9GhG2/dHhddSJKW6uKWF Y8qEVNx1T3TP6rx0HwqaPQz7SOIQu4FID+fQE= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1722007931; x=1722612731; h=content-transfer-encoding:mime-version:references:in-reply-to 
:message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=D/Si1GObgr7VHk+37CZa7YHHAXjnb60+qTIn0xcWOxk=; b=uwMQpRUpNf+DscGt3s4mol6LnWsFis3Zd2ueXE5m50WvUsKJbwCe3TZna6oTWboplb vrW0RkzerbuSCMf8B7R2X+f4a8rtEN74rI3JwJo4aPjkSOcWjU06IdpIAXtcwYQTCDGv wVYkHSsT87rKnnpNCgEGrehpU9LcPJmfHy5EnE84zE01F8cT5+Rn3aklccaoriK6P55Z xffCEP8XccVrJAl+uk5HGk8rNXuYCHRmGSqvIPUL096ejROgvpz3OPJKZxm54Qb+7S4H WA4tj/yGNYT01prVNM90/U/TlkF7C3di92Ji4gNu6OcoMK+4ZhvBw7cv1OnRAO5hY819 AHIw== X-Gm-Message-State: AOJu0YzGQfVAPg4ZZdzZa01cf24MVSeJAV4M+POvrPhD9mOdw7kvpnuA 9cwBuDMJ13EfMr29nwJAx0CpgxuoosQq+5f9DDOtx/X+aykkIKPf7StCVVJZQaUvsm3qTGDGyzk 9 X-Google-Smtp-Source: AGHT+IFlmoKIVbvsxd/sjDsiPGUXG2bxnsKNLp/iKOSkxDcooIG1tLdj9v2mDDAsKk1LuEAAGdEnhA== X-Received: by 2002:a05:622a:13d1:b0:447:e83a:1051 with SMTP id d75a77b69052e-45004f3dd66mr1394031cf.47.1722007930041; Fri, 26 Jul 2024 08:32:10 -0700 (PDT) From: Roger Pau Monne To: xen-devel@lists.xenproject.org Cc: alejandro.vallejo@cloud.com, Roger Pau Monne , Jan Beulich , Andrew Cooper Subject: [PATCH 20/22] x86/pv: allow using a unique per-pCPU root page table (L4) Date: Fri, 26 Jul 2024 17:22:04 +0200 Message-ID: <20240726152206.28411-21-roger.pau@citrix.com> X-Mailer: git-send-email 2.45.2 In-Reply-To: <20240726152206.28411-1-roger.pau@citrix.com> References: <20240726152206.28411-1-roger.pau@citrix.com> MIME-Version: 1.0 When running PV guests it's possible for the guest to use the same root page table (L4) for all vCPUs, which in turn will result in Xen also using the same root page table on all pCPUs that are running any domain vCPU. When using XPTI Xen switches to a per-CPU shadow L4 when running in guest context, switching to the fully populated L4 when in Xen context. Take advantage of this existing shadowing and force the usage of a per-CPU L4 that shadows the guest selected L4 when Address Space Isolation is requested for PV guests. The mapping of the guest L4 is done with a per-CPU fixmap entry, that however requires that the currently loaded L4 has the per-CPU slot setup. In order to ensure this switch to the shadow per-CPU L4 with just the Xen slots populated, and then map the guest L4 and copy the contents of the guest controlled slots. Signed-off-by: Roger Pau Monné --- xen/arch/x86/domain.c | 37 +++++++++++++++++++++ xen/arch/x86/flushtlb.c | 9 ++++++ xen/arch/x86/include/asm/current.h | 15 ++++++--- xen/arch/x86/include/asm/fixmap.h | 1 + xen/arch/x86/include/asm/pv/mm.h | 8 +++++ xen/arch/x86/mm.c | 47 +++++++++++++++++++++++++++ xen/arch/x86/pv/domain.c | 25 ++++++++++++-- xen/arch/x86/pv/mm.c | 52 ++++++++++++++++++++++++++++++ xen/arch/x86/smpboot.c | 20 +++++++++++- 9 files changed, 207 insertions(+), 7 deletions(-) diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c index b62c4311da6c..94a42ef29cd1 100644 --- a/xen/arch/x86/domain.c +++ b/xen/arch/x86/domain.c @@ -45,6 +45,7 @@ #include #include #include +#include #include #include #include @@ -2110,11 +2111,47 @@ void context_switch(struct vcpu *prev, struct vcpu *next) local_irq_disable(); + if ( is_pv_domain(prevd) && prevd->arch.asi ) + { + /* + * Don't leak the L4 shadow mapping in the per-CPU area. Can't be done + * in paravirt_ctxt_switch_from() because the lazy idle vCPU context + * switch would otherwise enter an infinite loop in + * mapcache_current_vcpu() with sync_local_execstate(). 
+ * + * Note clearing the fixmap must strictly be done ahead of changing the + * current vCPU and with interrupts disabled, so there's no window + * where current->domain->arch.asi == true and PCPU_FIX_PV_L4SHADOW is + * not mapped. + */ + percpu_clear_fixmap(PCPU_FIX_PV_L4SHADOW); + get_cpu_info()->root_pgt_changed = false; + } + set_current(next); if ( (per_cpu(curr_vcpu, cpu) == next) || (is_idle_domain(nextd) && cpu_online(cpu)) ) { + if ( is_pv_domain(nextd) && nextd->arch.asi ) + { + /* Signal the fixmap entry must be mapped. */ + get_cpu_info()->new_cr3 = true; + if ( get_cpu_info()->root_pgt_changed ) + { + /* + * Map and update the shadow L4 in case we received any + * FLUSH_ROOT_PGTBL request while running on the idle vCPU. + * + * Do it before enabling interrupts so that no flush IPI can be + * delivered without having PCPU_FIX_PV_L4SHADOW correctly + * mapped. + */ + pv_update_shadow_l4(next, true); + get_cpu_info()->root_pgt_changed = false; + } + } + local_irq_enable(); } else diff --git a/xen/arch/x86/flushtlb.c b/xen/arch/x86/flushtlb.c index fd5ed16ffb57..b85ce232abbb 100644 --- a/xen/arch/x86/flushtlb.c +++ b/xen/arch/x86/flushtlb.c @@ -17,6 +17,7 @@ #include #include #include +#include #include /* Debug builds: Wrap frequently to stress-test the wrap logic. */ @@ -192,7 +193,15 @@ unsigned int flush_area_local(const void *va, unsigned int flags) unsigned int order = (flags - 1) & FLUSH_ORDER_MASK; if ( flags & FLUSH_ROOT_PGTBL ) + { + const struct vcpu *curr = current; + const struct domain *curr_d = curr->domain; + get_cpu_info()->root_pgt_changed = true; + if ( is_pv_domain(curr_d) && curr_d->arch.asi ) + /* Update the shadow root page-table ahead of doing TLB flush. */ + pv_update_shadow_l4(curr, false); + } if ( flags & (FLUSH_TLB|FLUSH_TLB_GLOBAL) ) { diff --git a/xen/arch/x86/include/asm/current.h b/xen/arch/x86/include/asm/current.h index bcec328c9875..6a021607a1a9 100644 --- a/xen/arch/x86/include/asm/current.h +++ b/xen/arch/x86/include/asm/current.h @@ -60,10 +60,14 @@ struct cpu_info { uint8_t scf; /* SCF_* */ /* - * The following field controls copying of the L4 page table of 64-bit - * PV guests to the per-cpu root page table on entering the guest context. - * If set the L4 page table is being copied to the root page table and - * the field will be reset. + * For XPTI the following field controls copying of the L4 page table of + * 64-bit PV guests to the per-cpu root page table on entering the guest + * context. If set the L4 page table is being copied to the root page + * table and the field will be reset. + * + * For ASI the field is used to acknowledge whether a FLUSH_ROOT_PGTBL + * request has been received when running the idle vCPU on PV guest + * page-tables (a lazy context switch to the idle vCPU). */ bool root_pgt_changed; @@ -74,6 +78,9 @@ struct cpu_info { */ bool use_pv_cr3; + /* For ASI: per-CPU fixmap of guest L4 is possibly out of sync. */ + bool new_cr3; + /* get_stack_bottom() must be 16-byte aligned */ }; diff --git a/xen/arch/x86/include/asm/fixmap.h b/xen/arch/x86/include/asm/fixmap.h index a456c65072d8..bc68a98568ae 100644 --- a/xen/arch/x86/include/asm/fixmap.h +++ b/xen/arch/x86/include/asm/fixmap.h @@ -120,6 +120,7 @@ extern void __set_fixmap_x( /* per-CPU fixmap area.
*/ enum percpu_fixed_addresses { + PCPU_FIX_PV_L4SHADOW, __end_of_percpu_fixed_addresses }; diff --git a/xen/arch/x86/include/asm/pv/mm.h b/xen/arch/x86/include/asm/pv/mm.h index 182764542c1f..a7c74898fce0 100644 --- a/xen/arch/x86/include/asm/pv/mm.h +++ b/xen/arch/x86/include/asm/pv/mm.h @@ -23,6 +23,9 @@ bool pv_destroy_ldt(struct vcpu *v); int validate_segdesc_page(struct page_info *page); +void pv_clear_l4_guest_entries(root_pgentry_t *root_pgt); +void pv_update_shadow_l4(const struct vcpu *v, bool flush); + #else #include @@ -44,6 +47,11 @@ static inline bool pv_map_ldt_shadow_page(unsigned int off) { return false; } static inline bool pv_destroy_ldt(struct vcpu *v) { ASSERT_UNREACHABLE(); return false; } +static inline void pv_clear_l4_guest_entries(root_pgentry_t *root_pgt) +{ ASSERT_UNREACHABLE(); } +static inline void pv_update_shadow_l4(const struct vcpu *v, bool flush) +{ ASSERT_UNREACHABLE(); } + #endif #endif /* __X86_PV_MM_H__ */ diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c index 937089d203cc..8fea7465a9df 100644 --- a/xen/arch/x86/mm.c +++ b/xen/arch/x86/mm.c @@ -513,6 +513,8 @@ void make_cr3(struct vcpu *v, mfn_t mfn) v->arch.cr3 = mfn_x(mfn) << PAGE_SHIFT; if ( is_pv_domain(d) && d->arch.pv.pcid ) v->arch.cr3 |= get_pcid_bits(v, false); + if ( is_pv_domain(d) && d->arch.asi ) + get_cpu_info()->new_cr3 = true; } void write_ptbase(struct vcpu *v) @@ -532,6 +534,40 @@ void write_ptbase(struct vcpu *v) cpu_info->pv_cr3 |= get_pcid_bits(v, true); switch_cr3_cr4(v->arch.cr3, new_cr4); } + else if ( is_pv_domain(d) && d->arch.asi ) + { + root_pgentry_t *root_pgt = this_cpu(root_pgt); + unsigned long cr3 = __pa(root_pgt); + + /* + * XPTI and ASI cannot be simultaneously used even by different + * domains at runtime. + */ + ASSERT(!cpu_info->use_pv_cr3 && !cpu_info->xen_cr3 && + !cpu_info->pv_cr3); + + if ( new_cr4 & X86_CR4_PCIDE ) + cr3 |= get_pcid_bits(v, false); + + /* + * Zap guest L4 entries ahead of flushing the TLB, so that the CPU + * cannot speculatively populate the TLB with stale mappings. + */ + pv_clear_l4_guest_entries(root_pgt); + + /* + * Switch to the shadow L4 with just the Xen slots populated, the guest + * slots will be populated by pv_update_shadow_l4() once running on the + * shadow L4. + * + * The reason for switching to the per-CPU shadow L4 before updating + * the guest slots is that pv_update_shadow_l4() uses per-CPU mappings, + * and the in-use page-table previous to the switch_cr3_cr4() call + * might not support per-CPU mappings. + */ + switch_cr3_cr4(cr3, new_cr4); + pv_update_shadow_l4(v, false); + } else { ASSERT(!is_hvm_domain(d) || !d->arch.asi @@ -6505,6 +6541,17 @@ void setup_perdomain_slot(const struct vcpu *v, root_pgentry_t *root_pgt) ASSERT(l3); populate_perdomain(d, root_pgt, l3); + + if ( is_pv_domain(d) ) + { + /* + * Abuse the fact that this function is called on vCPU context + * switch and clean previous guest controlled slots from the shadow + * L4. 
+ */ + pv_clear_l4_guest_entries(root_pgt); + get_cpu_info()->new_cr3 = true; + } } else if ( is_hvm_domain(d) || d->arch.pv.xpti ) l4e_write(&root_pgt[root_table_offset(PERDOMAIN_VIRT_START)], diff --git a/xen/arch/x86/pv/domain.c b/xen/arch/x86/pv/domain.c index 46ee10a8a4c2..80bf2bf934dd 100644 --- a/xen/arch/x86/pv/domain.c +++ b/xen/arch/x86/pv/domain.c @@ -15,6 +15,7 @@ #include #include #include +#include #include #ifdef CONFIG_PV32 @@ -384,7 +385,7 @@ int pv_domain_initialise(struct domain *d) d->arch.ctxt_switch = &pv_csw; - d->arch.pv.flush_root_pt = d->arch.pv.xpti; + d->arch.pv.flush_root_pt = d->arch.pv.xpti || d->arch.asi; if ( !is_pv_32bit_domain(d) && use_invpcid && cpu_has_pcid ) switch ( ACCESS_ONCE(opt_pcid) ) @@ -446,7 +447,27 @@ static void _toggle_guest_pt(struct vcpu *v) * to release). Switch to the idle page tables in such an event; the * guest will have been crashed already. */ - cr3 = v->arch.cr3; + if ( v->domain->arch.asi ) + { + /* + * _toggle_guest_pt() might switch between user and kernel page tables, + * but doesn't use write_ptbase(), and hence needs an explicit call to + * sync the shadow L4. + */ + cr3 = __pa(this_cpu(root_pgt)); + if ( v->domain->arch.pv.pcid ) + cr3 |= get_pcid_bits(v, false); + /* + * Ensure the current root page table is already the shadow L4, as + * guest user/kernel switches can only happen once the guest is + * running. + */ + ASSERT(read_cr3() == cr3); + pv_update_shadow_l4(v, false); + } + else + cr3 = v->arch.cr3; + if ( shadow_mode_enabled(v->domain) ) { cr3 &= ~X86_CR3_NOFLUSH; diff --git a/xen/arch/x86/pv/mm.c b/xen/arch/x86/pv/mm.c index 24f0d2e4ff7d..c20ce099ae27 100644 --- a/xen/arch/x86/pv/mm.c +++ b/xen/arch/x86/pv/mm.c @@ -11,6 +11,7 @@ #include #include +#include #include #include "mm.h" @@ -103,6 +104,57 @@ void init_xen_pae_l2_slots(l2_pgentry_t *l2t, const struct domain *d) } #endif +void pv_clear_l4_guest_entries(root_pgentry_t *root_pgt) +{ + unsigned int i; + + for ( i = 0; i < ROOT_PAGETABLE_FIRST_XEN_SLOT; i++ ) + l4e_write(&root_pgt[i], l4e_empty()); + for ( i = ROOT_PAGETABLE_LAST_XEN_SLOT + 1; i < L4_PAGETABLE_ENTRIES; i++ ) + l4e_write(&root_pgt[i], l4e_empty()); +} + +void pv_update_shadow_l4(const struct vcpu *v, bool flush) +{ + const root_pgentry_t *guest_pgt = percpu_fix_to_virt(PCPU_FIX_PV_L4SHADOW); + root_pgentry_t *shadow_pgt = this_cpu(root_pgt); + + ASSERT(!v->domain->arch.pv.xpti); + ASSERT(is_pv_vcpu(v)); + ASSERT(!is_idle_vcpu(v)); + + if ( get_cpu_info()->new_cr3 ) + { + percpu_set_fixmap(PCPU_FIX_PV_L4SHADOW, maddr_to_mfn(v->arch.cr3), + __PAGE_HYPERVISOR_RO); + get_cpu_info()->new_cr3 = false; + } + + if ( is_pv_32bit_vcpu(v) ) + { + l4e_write(&shadow_pgt[0], guest_pgt[0]); + l4e_write(&shadow_pgt[root_table_offset(PERDOMAIN_ALT_VIRT_START)], + shadow_pgt[root_table_offset(PERDOMAIN_VIRT_START)]); + } + else + { + unsigned int i; + + for ( i = 0; i < ROOT_PAGETABLE_FIRST_XEN_SLOT; i++ ) + l4e_write(&shadow_pgt[i], guest_pgt[i]); + for ( i = ROOT_PAGETABLE_LAST_XEN_SLOT + 1; + i < L4_PAGETABLE_ENTRIES; i++ ) + l4e_write(&shadow_pgt[i], guest_pgt[i]); + + /* The presence of this Xen slot is selected by the guest. 
*/ + l4e_write(&shadow_pgt[l4_table_offset(RO_MPT_VIRT_START)], + guest_pgt[l4_table_offset(RO_MPT_VIRT_START)]); + } + + if ( flush ) + flush_local(FLUSH_TLB_GLOBAL); +} + /* * Local variables: * mode: C diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c index 40cc14799252..d9841ed3b663 100644 --- a/xen/arch/x86/smpboot.c +++ b/xen/arch/x86/smpboot.c @@ -829,7 +829,7 @@ int setup_cpu_root_pgt(unsigned int cpu) unsigned int off; int rc; - if ( !opt_xpti_hwdom && !opt_xpti_domu ) + if ( !opt_xpti_hwdom && !opt_xpti_domu && !opt_asi_pv ) return 0; rpt = alloc_xenheap_page(); @@ -839,6 +839,18 @@ int setup_cpu_root_pgt(unsigned int cpu) clear_page(rpt); per_cpu(root_pgt, cpu) = rpt; + if ( opt_asi_pv ) + { + /* + * Populate the Xen slots, the guest ones will be copied from the guest + * root page-table. + */ + init_xen_l4_slots(rpt, _mfn(virt_to_mfn(rpt)), INVALID_MFN, NULL, + false, false, true); + + return 0; + } + rpt[root_table_offset(RO_MPT_VIRT_START)] = idle_pg_table[root_table_offset(RO_MPT_VIRT_START)]; /* SH_LINEAR_PT inserted together with guest mappings. */ @@ -892,6 +904,12 @@ static void cleanup_cpu_root_pgt(unsigned int cpu) per_cpu(root_pgt, cpu) = NULL; + if ( opt_asi_pv ) + { + free_xenheap_page(rpt); + return; + } + for ( r = root_table_offset(DIRECTMAP_VIRT_START); r < root_table_offset(HYPERVISOR_VIRT_END); ++r ) { From patchwork Fri Jul 26 15:22:05 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Roger_Pau_Monn=C3=A9?= X-Patchwork-Id: 13742919 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 29720C3DA4A for ; Fri, 26 Jul 2024 15:41:45 +0000 (UTC) Received: from list by lists.xenproject.org with outflank-mailman.765662.1176322 (Exim 4.92) (envelope-from ) id 1sXN4b-0004sb-W7; Fri, 26 Jul 2024 15:41:37 +0000 X-Outflank-Mailman: Message body and most headers restored to incoming version Received: by outflank-mailman (output) from mailman id 765662.1176322; Fri, 26 Jul 2024 15:41:37 +0000 Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1sXN4b-0004rZ-RI; Fri, 26 Jul 2024 15:41:37 +0000 Received: by outflank-mailman (input) for mailman id 765662; Fri, 26 Jul 2024 15:41:35 +0000 Received: from se1-gles-sth1-in.inumbo.com ([159.253.27.254] helo=se1-gles-sth1.inumbo.com) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1sXMvZ-00084Z-Lz for xen-devel@lists.xenproject.org; Fri, 26 Jul 2024 15:32:17 +0000 Received: from mail-qk1-x733.google.com (mail-qk1-x733.google.com [2607:f8b0:4864:20::733]) by se1-gles-sth1.inumbo.com (Halon) with ESMTPS id 3baeadbc-4b64-11ef-bbff-fd08da9f4363; Fri, 26 Jul 2024 17:32:14 +0200 (CEST) Received: by mail-qk1-x733.google.com with SMTP id af79cd13be357-7a1df0a93eeso54357785a.1 for ; Fri, 26 Jul 2024 08:32:14 -0700 (PDT) Received: from localhost ([213.195.124.163]) by smtp.gmail.com with ESMTPSA id af79cd13be357-7a1d73ed33bsm186293085a.58.2024.07.26.08.32.12 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 26 Jul 2024 08:32:12 -0700 (PDT) X-BeenThere: xen-devel@lists.xenproject.org List-Id: Xen developer discussion List-Unsubscribe: , List-Post: 
List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xenproject.org Precedence: list Sender: "Xen-devel" X-Inumbo-ID: 3baeadbc-4b64-11ef-bbff-fd08da9f4363 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=citrix.com; s=google; t=1722007933; x=1722612733; darn=lists.xenproject.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=+bQ+ii7M1KJhsihYtAn+LMm9W2OsmDXTDtNb1K2Kk+8=; b=ULjq/Fq8AEzNonKiqnzDW2K4UctU1Jls2eXHeGiyKUtMbUZ1fup+OMvxdw69kxqRDG bCMe/pFZwI4HvSBRwtfaOF7JekIf5n1R2unNXmCNwGqinsTniSiOC2TA4Xco+ugIfLAM u5or8g1UmGciX16GTqlgQzD4Fo/YzT4aGXeh4= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1722007933; x=1722612733; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=+bQ+ii7M1KJhsihYtAn+LMm9W2OsmDXTDtNb1K2Kk+8=; b=XTjbhbKK7gMAVEajiqbuSMrkNWsYY+/q+NRzGjuu6L1kZfruMFba8YEKKSQRpmLG6s FLV0dGsAA9UDttxrigvUXktyBjcKoYbUsLBhhx1H7bgXUMy1q7i97SXDEyQMM7I+wGQg LRup8Nr2feEDEhm6MnvwPgVeWCllxLB3hZh8mX8faIj08syDoM3Lv6hrIpm9HBuK0CMG JkVonKdyFAAaRQYotYNRcQ01crPOX/xwEYriGh8tZNtOsXiK8RypHLfR/jlF1Z+wsLsH xeyFum5eVkyYmTb3nKjufn8jeKA4S2ISL+tHP8j/H9Lf4lbDBsMQT17UxIdpEBf1Oib8 5sEQ== X-Gm-Message-State: AOJu0Yy7/FtcEumP2xkLgioa0Va7WzY0j4PckaIZFWekavioZO5SCYij ZjVUQy2FSKbZ7SnQtcv63lCISF2z0JHgNobxN84iJdcPkactxJzsendIQEIHBK69grEVz963CHk + X-Google-Smtp-Source: AGHT+IFEl+l6gfhV/oaxuOmcOHpAbe9haRO9XmK8X2xunfJzUIgKMhf39uziKga6rh7b4MNPD63eCg== X-Received: by 2002:a05:620a:4152:b0:79f:fe8:5fce with SMTP id af79cd13be357-7a1e522fb05mr11382285a.3.1722007932948; Fri, 26 Jul 2024 08:32:12 -0700 (PDT) From: Roger Pau Monne To: xen-devel@lists.xenproject.org Cc: alejandro.vallejo@cloud.com, Roger Pau Monne , Jan Beulich , Andrew Cooper , Julien Grall , Stefano Stabellini , "Daniel P. Smith" , =?utf-8?q?Marek_Marczykow?= =?utf-8?q?ski-G=C3=B3recki?= Subject: [PATCH 21/22] x86/mm: switch to a per-CPU mapped stack when using ASI Date: Fri, 26 Jul 2024 17:22:05 +0200 Message-ID: <20240726152206.28411-22-roger.pau@citrix.com> X-Mailer: git-send-email 2.45.2 In-Reply-To: <20240726152206.28411-1-roger.pau@citrix.com> References: <20240726152206.28411-1-roger.pau@citrix.com> MIME-Version: 1.0 When using ASI the CPU stack is mapped using a range of fixmap entries in the per-CPU region. This ensures the stack is only accessible by the current CPU. Note however there's further work required in order to allocate the stack from domheap instead of xenheap, and ensure the stack is not part of the direct map. For domains not running with ASI enabled all the CPU stacks are mapped in the per-domain L3, so that the stack is always at the same linear address, regardless of whether ASI is enabled or not for the domain. When calling UEFI runtime methods the current per-domain slot needs to be added to the EFI L4, so that the stack is available in UEFI. Finally, some users of callfunc IPIs pass parameters from the stack, so when handling a callfunc IPI the stack of the caller CPU is mapped into the address space of the CPU handling the IPI. This needs further work to use a bounce buffer in order to avoid having to map remote CPU stacks. Signed-off-by: Roger Pau Monné --- There's also further work required in order to avoid mapping remote stack when handling callfunc IPIs. 
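One possible shape for such a bounce buffer, as a sketch only (the helper below is hypothetical and not part of this series; it assumes callers serialize on the same lock that already protects call_data):

static uint8_t callfunc_bounce[128];

static void *bounce_callfunc_args(const void *info, size_t size)
{
    /*
     * Stage the argument block in globally reachable memory before sending
     * the IPI, so that remote handlers never dereference the caller's
     * per-CPU mapped stack and the remote stack mapping done in
     * arch_smp_pre_callfunc() can be dropped.
     */
    ASSERT(size <= sizeof(callfunc_bounce));
    memcpy(callfunc_bounce, info, size);

    return callfunc_bounce;
}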
--- xen/arch/x86/domain.c | 12 +++ xen/arch/x86/include/asm/current.h | 5 ++ xen/arch/x86/include/asm/fixmap.h | 5 ++ xen/arch/x86/include/asm/mm.h | 6 +- xen/arch/x86/include/asm/smp.h | 12 +++ xen/arch/x86/mm.c | 125 +++++++++++++++++++++++++++-- xen/arch/x86/setup.c | 27 +++++-- xen/arch/x86/smp.c | 29 +++++++ xen/arch/x86/smpboot.c | 47 ++++++++++- xen/arch/x86/traps.c | 6 +- xen/common/efi/runtime.c | 12 +++ xen/common/smp.c | 10 +++ xen/include/xen/smp.h | 5 ++ 13 files changed, 281 insertions(+), 20 deletions(-) diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c index 94a42ef29cd1..d00ba415877f 100644 --- a/xen/arch/x86/domain.c +++ b/xen/arch/x86/domain.c @@ -929,6 +929,18 @@ int arch_domain_create(struct domain *d, d->arch.msr_relaxed = config->arch.misc_flags & XEN_X86_MSR_RELAXED; + if ( !d->arch.asi && (opt_asi_hvm || opt_asi_pv ) ) + { + /* + * This domain is not using ASI, but other domains on the system + * possibly are, hence the CPU stacks are on the per-CPU page-table + * region. Add an L3 entry that has all the stacks mapped. + */ + rc = map_all_stacks(d); + if ( rc ) + goto fail; + } + return 0; fail: diff --git a/xen/arch/x86/include/asm/current.h b/xen/arch/x86/include/asm/current.h index 6a021607a1a9..75b9a341f814 100644 --- a/xen/arch/x86/include/asm/current.h +++ b/xen/arch/x86/include/asm/current.h @@ -24,6 +24,11 @@ * 0 - IST Shadow Stacks (4x 1k, read-only) */ +static inline bool is_shstk_slot(unsigned int i) +{ + return (i == 0 || i == PRIMARY_SHSTK_SLOT); +} + /* * Identify which stack page the stack pointer is on. Returns an index * as per the comment above. diff --git a/xen/arch/x86/include/asm/fixmap.h b/xen/arch/x86/include/asm/fixmap.h index bc68a98568ae..d52c1886fcdd 100644 --- a/xen/arch/x86/include/asm/fixmap.h +++ b/xen/arch/x86/include/asm/fixmap.h @@ -120,6 +120,11 @@ extern void __set_fixmap_x( /* per-CPU fixmap area. */ enum percpu_fixed_addresses { + /* For alignment reasons the per-CPU stacks must come first. */ + PCPU_STACK_START, + PCPU_STACK_END = PCPU_STACK_START + NR_CPUS * (1U << STACK_ORDER) - 1, +#define PERCPU_STACK_IDX(c) (PCPU_STACK_START + (c) * (1U << STACK_ORDER)) +#define PERCPU_STACK_ADDR(c) percpu_fix_to_virt(PERCPU_STACK_IDX(c)) PCPU_FIX_PV_L4SHADOW, __end_of_percpu_fixed_addresses }; diff --git a/xen/arch/x86/include/asm/mm.h b/xen/arch/x86/include/asm/mm.h index f883468b1a7c..b4f1e0399275 100644 --- a/xen/arch/x86/include/asm/mm.h +++ b/xen/arch/x86/include/asm/mm.h @@ -521,7 +521,7 @@ extern struct rangeset *mmio_ro_ranges; #define compat_pfn_to_cr3(pfn) (((unsigned)(pfn) << 12) | ((unsigned)(pfn) >> 20)) #define compat_cr3_to_pfn(cr3) (((unsigned)(cr3) >> 12) | ((unsigned)(cr3) << 20)) -void memguard_guard_stack(void *p); +void memguard_guard_stack(void *p, unsigned int cpu); void memguard_unguard_stack(void *p); struct mmio_ro_emulate_ctxt { @@ -652,4 +652,8 @@ static inline int destroy_xen_mappings_cpu(unsigned long s, unsigned long e, return modify_xen_mappings_cpu(s, e, _PAGE_NONE, cpu); } +/* Setup a per-domain slot that maps all pCPU stacks. */ +int map_all_stacks(struct domain *d); +int add_stack(const void *stack, unsigned int cpu); + #endif /* __ASM_X86_MM_H__ */ diff --git a/xen/arch/x86/include/asm/smp.h b/xen/arch/x86/include/asm/smp.h index c8c79601343d..a17c609da4b6 100644 --- a/xen/arch/x86/include/asm/smp.h +++ b/xen/arch/x86/include/asm/smp.h @@ -79,6 +79,18 @@ extern bool unaccounted_cpus; void *cpu_alloc_stack(unsigned int cpu); +/* + * Setup the per-CPU area stack mappings. 
+ * + * @dest_cpu: CPU where the mappings are to appear. + * @stack_cpu: CPU whose stacks should be mapped. + */ +void cpu_set_stack_mappings(unsigned int dest_cpu, unsigned int stack_cpu); + +#define HAS_ARCH_SMP_CALLFUNC +void arch_smp_pre_callfunc(unsigned int cpu); +void arch_smp_post_callfunc(unsigned int cpu); + #endif /* !__ASSEMBLY__ */ #endif diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c index 8fea7465a9df..67ffdebb595e 100644 --- a/xen/arch/x86/mm.c +++ b/xen/arch/x86/mm.c @@ -87,6 +87,7 @@ * doing the final put_page(), and remove it from the iommu if so. */ +#include #include #include #include @@ -6352,31 +6353,40 @@ void free_perdomain_mappings(struct domain *d) d->arch.perdomain_l3_pg = NULL; } -static void write_sss_token(unsigned long *ptr) +static void write_sss_token(unsigned long *ptr, unsigned long va) { /* * A supervisor shadow stack token is its own linear address, with the * busy bit (0) clear. */ - *ptr = (unsigned long)ptr; + *ptr = va; } -void memguard_guard_stack(void *p) +void memguard_guard_stack(void *p, unsigned int cpu) { + unsigned long va = + (opt_asi_hvm || opt_asi_pv) ? (unsigned long)PERCPU_STACK_ADDR(cpu) + : (unsigned long)p; + /* IST Shadow stacks. 4x 1k in stack page 0. */ if ( IS_ENABLED(CONFIG_XEN_SHSTK) ) { - write_sss_token(p + (IST_MCE * IST_SHSTK_SIZE) - 8); - write_sss_token(p + (IST_NMI * IST_SHSTK_SIZE) - 8); - write_sss_token(p + (IST_DB * IST_SHSTK_SIZE) - 8); - write_sss_token(p + (IST_DF * IST_SHSTK_SIZE) - 8); + write_sss_token(p + (IST_MCE * IST_SHSTK_SIZE) - 8, + va + (IST_MCE * IST_SHSTK_SIZE) - 8); + write_sss_token(p + (IST_NMI * IST_SHSTK_SIZE) - 8, + va + (IST_NMI * IST_SHSTK_SIZE) - 8); + write_sss_token(p + (IST_DB * IST_SHSTK_SIZE) - 8, + va + (IST_DB * IST_SHSTK_SIZE) - 8); + write_sss_token(p + (IST_DF * IST_SHSTK_SIZE) - 8, + va + (IST_DF * IST_SHSTK_SIZE) - 8); } map_pages_to_xen((unsigned long)p, virt_to_mfn(p), 1, PAGE_HYPERVISOR_SHSTK); /* Primary Shadow Stack. 1x 4k in stack page 5. */ p += PRIMARY_SHSTK_SLOT * PAGE_SIZE; + va += PRIMARY_SHSTK_SLOT * PAGE_SIZE; if ( IS_ENABLED(CONFIG_XEN_SHSTK) ) - write_sss_token(p + PAGE_SIZE - 8); + write_sss_token(p + PAGE_SIZE - 8, va + PAGE_SIZE - 8); map_pages_to_xen((unsigned long)p, virt_to_mfn(p), 1, PAGE_HYPERVISOR_SHSTK); } @@ -6567,6 +6577,105 @@ void setup_perdomain_slot(const struct vcpu *v, root_pgentry_t *root_pgt) root_pgt[root_table_offset(PERDOMAIN_VIRT_START)]); } +static struct page_info *l2_all_stacks; + +int add_stack(const void *stack, unsigned int cpu) +{ + unsigned long va = (unsigned long)PERCPU_STACK_ADDR(cpu); + struct page_info *pg; + l2_pgentry_t *l2tab = NULL; + l1_pgentry_t *l1tab = NULL; + unsigned int nr; + int rc = 0; + + /* + * Assume CPU stack allocation is always serialized, either because it's + * done on the BSP during boot, or in case of hotplug, in stop machine + * context. + */ + ASSERT(system_state < SYS_STATE_active || cpu_in_hotplug_context()); + + if ( !opt_asi_hvm && !opt_asi_pv ) + return 0; + + if ( !l2_all_stacks ) + { + l2_all_stacks = alloc_domheap_page(NULL, MEMF_no_owner); + if ( !l2_all_stacks ) + return -ENOMEM; + l2tab = __map_domain_page(l2_all_stacks); + clear_page(l2tab); + } + else + l2tab = __map_domain_page(l2_all_stacks); + + /* code assumes all the stacks can be mapped with a single l2. 
*/ + ASSERT(l3_table_offset((unsigned long)percpu_fix_to_virt(PCPU_STACK_END)) == + l3_table_offset((unsigned long)percpu_fix_to_virt(PCPU_STACK_START))); + for ( nr = 0 ; nr < (1U << STACK_ORDER) ; nr++) + { + l2_pgentry_t *pl2e = l2tab + l2_table_offset(va); + + if ( !(l2e_get_flags(*pl2e) & _PAGE_PRESENT) ) + { + pg = alloc_domheap_page(NULL, MEMF_no_owner); + if ( !pg ) + { + rc = -ENOMEM; + break; + } + l1tab = __map_domain_page(pg); + clear_page(l1tab); + l2e_write(pl2e, l2e_from_page(pg, __PAGE_HYPERVISOR_RW)); + } + else if ( !l1tab ) + l1tab = map_l1t_from_l2e(*pl2e); + + l1e_write(&l1tab[l1_table_offset(va)], + l1e_from_mfn(virt_to_mfn(stack), + is_shstk_slot(nr) ? __PAGE_HYPERVISOR_SHSTK + : __PAGE_HYPERVISOR_RW)); + + va += PAGE_SIZE; + stack += PAGE_SIZE; + + if ( !l1_table_offset(va) ) + { + unmap_domain_page(l1tab); + l1tab = NULL; + } + } + + unmap_domain_page(l1tab); + unmap_domain_page(l2tab); + /* + * Don't care to free the intermediate page-tables on failure, can be used + * to map other stacks. + */ + + return rc; +} + +int map_all_stacks(struct domain *d) +{ + /* + * Create the per-domain L3. Pass a dummy PERDOMAIN_VIRT_START, but note + * only the per-domain L3 is allocated when nr == 0. + */ + int rc = create_perdomain_mapping(d, PERDOMAIN_VIRT_START, 0, NULL, NULL); + l3_pgentry_t *l3tab; + + if ( rc ) + return rc; + + l3tab = __map_domain_page(d->arch.perdomain_l3_pg); + l3tab[l3_table_offset((unsigned long)percpu_fix_to_virt(PCPU_STACK_START))] + = l3e_from_page(l2_all_stacks, __PAGE_HYPERVISOR_RW); + unmap_domain_page(l3tab); + + return 0; +} + static void __init __maybe_unused build_assertions(void) { /* diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c index 5bf81b81b46f..76f7d71b8c1c 100644 --- a/xen/arch/x86/setup.c +++ b/xen/arch/x86/setup.c @@ -808,8 +808,6 @@ static void __init noreturn reinit_bsp_stack(void) /* Update SYSCALL trampolines */ percpu_traps_init(); - stack_base[0] = stack; - rc = setup_cpu_root_pgt(0); if ( rc ) panic("Error %d setting up PV root page table\n", rc); @@ -1771,10 +1769,6 @@ void asmlinkage __init noreturn __start_xen(unsigned long mbi_p) system_state = SYS_STATE_boot; - bsp_stack = cpu_alloc_stack(0); - if ( !bsp_stack ) - panic("No memory for BSP stack\n"); - console_init_ring(); vesa_init(); @@ -1961,6 +1955,16 @@ void asmlinkage __init noreturn __start_xen(unsigned long mbi_p) alternative_branches(); + /* + * Alloc the BSP stack closer to the point where the AP ones also get + * allocated - and after the speculation mitigations have been initialized. + * In order to set up the shadow stack token correctly Xen needs to know + * whether per-CPU mapped stacks are being used. + */ + bsp_stack = cpu_alloc_stack(0); + if ( !bsp_stack ) + panic("No memory for BSP stack\n"); + /* * Setup the local per-domain L3 for the BSP also, so it matches the state * of the APs. @@ -2065,8 +2069,17 @@ void asmlinkage __init noreturn __start_xen(unsigned long mbi_p) info->last_spec_ctrl = default_xen_spec_ctrl; } + stack_base[0] = bsp_stack; + /* Copy the cpu info block, and move onto the BSP stack. 
*/ - bsp_info = get_cpu_info_from_stack((unsigned long)bsp_stack); + if ( opt_asi_hvm || opt_asi_pv ) + { + cpu_set_stack_mappings(0, 0); + bsp_info = get_cpu_info_from_stack((unsigned long)PERCPU_STACK_ADDR(0)); + } + else + bsp_info = get_cpu_info_from_stack((unsigned long)bsp_stack); + *bsp_info = *info; asm volatile ("mov %[stk], %%rsp; jmp %c[fn]" :: diff --git a/xen/arch/x86/smp.c b/xen/arch/x86/smp.c index 04c6a0572319..18a7196195cf 100644 --- a/xen/arch/x86/smp.c +++ b/xen/arch/x86/smp.c @@ -22,6 +22,7 @@ #include #include #include +#include #include #include @@ -433,3 +434,31 @@ long cf_check cpu_down_helper(void *data) ret = cpu_down(cpu); return ret; } + +void arch_smp_pre_callfunc(unsigned int cpu) +{ + if ( (!opt_asi_pv && !opt_asi_hvm) || cpu == smp_processor_id() || + (!current->domain->arch.asi && !is_idle_vcpu(current)) || + /* + * CPU#0 still runs on the .init stack when the APs are started, don't + * attempt to map such stack. + */ + (!cpu && system_state < SYS_STATE_active) ) + return; + + cpu_set_stack_mappings(smp_processor_id(), cpu); +} + +void arch_smp_post_callfunc(unsigned int cpu) +{ + unsigned int i; + + if ( (!opt_asi_pv && !opt_asi_hvm) || cpu == smp_processor_id() || + (!current->domain->arch.asi && !is_idle_vcpu(current)) ) + return; + + for ( i = 0; i < (1U << STACK_ORDER); i++ ) + percpu_clear_fixmap(PERCPU_STACK_IDX(cpu) + i); + + flush_area_local(PERCPU_STACK_ADDR(cpu), FLUSH_ORDER(STACK_ORDER)); +} diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c index d9841ed3b663..548e3102101c 100644 --- a/xen/arch/x86/smpboot.c +++ b/xen/arch/x86/smpboot.c @@ -579,7 +579,20 @@ static int do_boot_cpu(int apicid, int cpu) printk("Booting processor %d/%d eip %lx\n", cpu, apicid, start_eip); - stack_start = stack_base[cpu] + STACK_SIZE - sizeof(struct cpu_info); + if ( opt_asi_hvm || opt_asi_pv ) + { + /* + * Uniformly run with the stack mapping of the per-CPU area (including + * the idle vCPU) if ASI is enabled for any domain type. + */ + cpu_set_stack_mappings(cpu, cpu); + + ASSERT(IS_ALIGNED((unsigned long)PERCPU_STACK_ADDR(cpu), STACK_SIZE)); + + stack_start = PERCPU_STACK_ADDR(cpu) + STACK_SIZE - sizeof(struct cpu_info); + } + else + stack_start = stack_base[cpu] + STACK_SIZE - sizeof(struct cpu_info); /* * If per-CPU idle root page table has been allocated, switch to it as @@ -1053,11 +1066,41 @@ void *cpu_alloc_stack(unsigned int cpu) stack = alloc_xenheap_pages(STACK_ORDER, memflags); if ( stack ) - memguard_guard_stack(stack); + { + int rc = add_stack(stack, cpu); + + if ( rc ) + { + printk(XENLOG_ERR "unable to map stack for CPU %u: %d\n", cpu, rc); + free_xenheap_pages(stack, STACK_ORDER); + return NULL; + } + memguard_guard_stack(stack, cpu); + } return stack; } +void cpu_set_stack_mappings(unsigned int dest_cpu, unsigned int stack_cpu) +{ + unsigned int i; + + for ( i = 0; i < (1U << STACK_ORDER); i++ ) + { + unsigned int flags = (is_shstk_slot(i) ? __PAGE_HYPERVISOR_SHSTK + : __PAGE_HYPERVISOR_RW) | + (dest_cpu == stack_cpu ? 
_PAGE_GLOBAL : 0); + + if ( is_shstk_slot(i) && dest_cpu != stack_cpu ) + continue; + + percpu_set_fixmap_remote(dest_cpu, PERCPU_STACK_IDX(stack_cpu) + i, + _mfn(virt_to_mfn(stack_base[stack_cpu] + + i * PAGE_SIZE)), + flags); + } +} + static int cpu_smpboot_alloc(unsigned int cpu) { struct cpu_info *info; diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c index b4fb95917023..28513c0e3d6a 100644 --- a/xen/arch/x86/traps.c +++ b/xen/arch/x86/traps.c @@ -609,10 +609,12 @@ void show_stack_overflow(unsigned int cpu, const struct cpu_user_regs *regs) unsigned long esp = regs->rsp; unsigned long curr_stack_base = esp & ~(STACK_SIZE - 1); unsigned long esp_top, esp_bottom; + const void *stack = current->domain->arch.asi ? PERCPU_STACK_ADDR(cpu) + : stack_base[cpu]; - if ( _p(curr_stack_base) != stack_base[cpu] ) + if ( _p(curr_stack_base) != stack ) printk("Current stack base %p differs from expected %p\n", - _p(curr_stack_base), stack_base[cpu]); + _p(curr_stack_base), stack); esp_bottom = (esp | (STACK_SIZE - 1)) + 1; esp_top = esp_bottom - PRIMARY_STACK_SIZE; diff --git a/xen/common/efi/runtime.c b/xen/common/efi/runtime.c index d952c3ba785e..3a8233ed62ac 100644 --- a/xen/common/efi/runtime.c +++ b/xen/common/efi/runtime.c @@ -32,6 +32,7 @@ void efi_rs_leave(struct efi_rs_state *state); #ifndef CONFIG_ARM # include +# include # include # include #endif @@ -85,6 +86,7 @@ struct efi_rs_state efi_rs_enter(void) static const u16 fcw = FCW_DEFAULT; static const u32 mxcsr = MXCSR_DEFAULT; struct efi_rs_state state = { .cr3 = 0 }; + root_pgentry_t *efi_pgt, *idle_pgt; if ( mfn_eq(efi_l4_mfn, INVALID_MFN) ) return state; @@ -98,6 +100,16 @@ struct efi_rs_state efi_rs_enter(void) efi_rs_on_cpu = smp_processor_id(); + if ( opt_asi_pv || opt_asi_hvm ) + { + /* Insert the idle per-domain slot for the stack mapping. */ + efi_pgt = map_domain_page(efi_l4_mfn); + idle_pgt = maddr_to_virt(idle_vcpu[efi_rs_on_cpu]->arch.cr3); + efi_pgt[root_table_offset(PERDOMAIN_VIRT_START)].l4 = + idle_pgt[root_table_offset(PERDOMAIN_VIRT_START)].l4; + unmap_domain_page(efi_pgt); + } + /* prevent fixup_page_fault() from doing anything */ irq_enter(); diff --git a/xen/common/smp.c b/xen/common/smp.c index a011f541f1ea..04f5aede0d3d 100644 --- a/xen/common/smp.c +++ b/xen/common/smp.c @@ -29,6 +29,7 @@ static struct call_data_struct { void (*func) (void *info); void *info; int wait; + unsigned int caller; cpumask_t selected; } call_data; @@ -63,6 +64,7 @@ void on_selected_cpus( call_data.func = func; call_data.info = info; call_data.wait = wait; + call_data.caller = smp_processor_id(); smp_send_call_function_mask(&call_data.selected); @@ -82,6 +84,12 @@ void smp_call_function_interrupt(void) if ( !cpumask_test_cpu(cpu, &call_data.selected) ) return; + /* + * TODO: use bounce buffers to pass callfunc data, so that when using ASI + * there's no need to map remote CPU stacks. 
+ */ + arch_smp_pre_callfunc(call_data.caller); + irq_enter(); if ( unlikely(!func) ) @@ -102,6 +110,8 @@ void smp_call_function_interrupt(void) } irq_exit(); + + arch_smp_post_callfunc(call_data.caller); } /* diff --git a/xen/include/xen/smp.h b/xen/include/xen/smp.h index 2ca9ff1bfcc1..610c279ca24c 100644 --- a/xen/include/xen/smp.h +++ b/xen/include/xen/smp.h @@ -76,4 +76,9 @@ extern void *stack_base[NR_CPUS]; void initialize_cpu_data(unsigned int cpu); int setup_cpu_root_pgt(unsigned int cpu); +#ifndef HAS_ARCH_SMP_CALLFUNC +static inline void arch_smp_pre_callfunc(unsigned int cpu) {} +static inline void arch_smp_post_callfunc(unsigned int cpu) {} +#endif + #endif /* __XEN_SMP_H__ */ From patchwork Fri Jul 26 15:22:06 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Roger_Pau_Monn=C3=A9?= X-Patchwork-Id: 13742914 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 4E2D3C3DA4A for ; Fri, 26 Jul 2024 15:38:47 +0000 (UTC) Received: from list by lists.xenproject.org with outflank-mailman.765593.1176266 (Exim 4.92) (envelope-from ) id 1sXN1f-0000Vt-Aq; Fri, 26 Jul 2024 15:38:35 +0000 X-Outflank-Mailman: Message body and most headers restored to incoming version Received: by outflank-mailman (output) from mailman id 765593.1176266; Fri, 26 Jul 2024 15:38:35 +0000 Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1sXN1f-0000Vm-86; Fri, 26 Jul 2024 15:38:35 +0000 Received: by outflank-mailman (input) for mailman id 765593; Fri, 26 Jul 2024 15:38:34 +0000 Received: from se1-gles-flk1-in.inumbo.com ([94.247.172.50] helo=se1-gles-flk1.inumbo.com) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1sXMva-00084T-CI for xen-devel@lists.xenproject.org; Fri, 26 Jul 2024 15:32:18 +0000 Received: from mail-qt1-x82e.google.com (mail-qt1-x82e.google.com [2607:f8b0:4864:20::82e]) by se1-gles-flk1.inumbo.com (Halon) with ESMTPS id 3ce06ce3-4b64-11ef-8776-851b0ebba9a2; Fri, 26 Jul 2024 17:32:16 +0200 (CEST) Received: by mail-qt1-x82e.google.com with SMTP id d75a77b69052e-44fe11dedb3so3920871cf.1 for ; Fri, 26 Jul 2024 08:32:16 -0700 (PDT) Received: from localhost ([213.195.124.163]) by smtp.gmail.com with ESMTPSA id d75a77b69052e-44fe817b704sm14134541cf.49.2024.07.26.08.32.14 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 26 Jul 2024 08:32:14 -0700 (PDT) X-BeenThere: xen-devel@lists.xenproject.org List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xenproject.org Precedence: list Sender: "Xen-devel" X-Inumbo-ID: 3ce06ce3-4b64-11ef-8776-851b0ebba9a2 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=citrix.com; s=google; t=1722007935; x=1722612735; darn=lists.xenproject.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=ujxzWIZCxxgOdDMzOO3f1hDHb8R0sblL5/6JViBiwPw=; b=uFw+QpwwhbidcC77uubDdoH10eC9LJ8xXKwzJft22lAaUcosgJsLS7RCHZUjkpVGAS iHxdSEKLPSAVfDq255QRkfR+413yKhucswkHIcSknLgWYBGE4MmbR+de0Bt8hlzFGdiX ZzeB5uikKdL/4lSZILCpvqB1FENyTjSOctGUY= 
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1722007935; x=1722612735; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=ujxzWIZCxxgOdDMzOO3f1hDHb8R0sblL5/6JViBiwPw=; b=h2KjDLnxAGZW+OlhgD9z0thvgimw98plLnZsK8n6WhBNykZpgFPE0yIgClpwHRXICn l628cNbawypjDA8dH4Pyx/Q0gWifQ2Pg+5R7tnpPn4MexWDR258kndkiajLvqiL8qnqY ciEqWudbTUJU3ciM2Jdd0CM/ni8HxsP29/tdrXUPqfV+wVWGfERqer4g6C+O3FLGkPt5 Grrwy7dZxWBDVdogq6mTW2zuXwU7tjqJTopHpR3TUnicOCGAR3UXF/O/BCv24Guxn5Uv OEmXsWMmomjZspvFX0kS6jgTc094fdkWYzEQQ/BRvJjtUbEAEnxqmCzDks1OcUTOWaZi HBiA== X-Gm-Message-State: AOJu0YyJ85eMMOJGVyTymBUXyuCB0NWg8/g4RTL+8PlHl0TuF6AIKWmI 7KVAGV9N8L+mGNfiSJFAQbiwZapPXzo0JX10Y0GsAPj7PqvmpcQrMdXE/UzPkXNjieVbstlAyyJ T X-Google-Smtp-Source: AGHT+IEN0Eq8CDQ5Cb603qPnp/JwUedIQCyO4aYdsNBLQFTLNZuf7mCOphfab1I5CFzTJfvm3Gvhew== X-Received: by 2002:a05:622a:1309:b0:447:df6b:b8c5 with SMTP id d75a77b69052e-45004db298amr1640751cf.33.1722007935174; Fri, 26 Jul 2024 08:32:15 -0700 (PDT) From: Roger Pau Monne To: xen-devel@lists.xenproject.org Cc: alejandro.vallejo@cloud.com, Roger Pau Monne , Jan Beulich , Andrew Cooper Subject: [PATCH 22/22] x86/mm: zero stack on stack switch or reset Date: Fri, 26 Jul 2024 17:22:06 +0200 Message-ID: <20240726152206.28411-23-roger.pau@citrix.com> X-Mailer: git-send-email 2.45.2 In-Reply-To: <20240726152206.28411-1-roger.pau@citrix.com> References: <20240726152206.28411-1-roger.pau@citrix.com> MIME-Version: 1.0 With the stack mapped on a per-CPU basis there's no risk of other CPUs being able to read the stack contents, but vCPUs running on the current pCPU could read stack rubble from operations of previous vCPUs. The #DF stack is not zeroed because handling of #DF results in a panic. 
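For clarity, the backwards "rep stosb" added below ("std; rep stosb; cld") is equivalent to the following C, shown as an illustrative sketch only: the real clearing has to be done from assembly after %rsp has been switched, since a C helper would execute on the very stack it is wiping.

static inline void zero_primary_stack(void)
{
    void *top = (void *)guest_cpu_user_regs();
    size_t len = PRIMARY_STACK_SIZE - sizeof(struct cpu_info);

    /*
     * The asm sets %rdi = top - 1 and %rcx = len with the direction flag
     * set, so the stores run downwards: the primary stack is cleared while
     * the struct cpu_info block (holding the guest registers) at the top
     * of the stack is preserved.
     */
    memset(top - len, 0, len);
}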
Signed-off-by: Roger Pau Monné --- xen/arch/x86/include/asm/current.h | 30 +++++++++++++++++++++++++++++- 1 file changed, 29 insertions(+), 1 deletion(-) diff --git a/xen/arch/x86/include/asm/current.h b/xen/arch/x86/include/asm/current.h index 75b9a341f814..02b4118b03ef 100644 --- a/xen/arch/x86/include/asm/current.h +++ b/xen/arch/x86/include/asm/current.h @@ -177,6 +177,14 @@ unsigned long get_stack_dump_bottom (unsigned long sp); # define SHADOW_STACK_WORK "" #endif +#define ZERO_STACK \ + "test %[stk_size], %[stk_size];" \ + "jz .L_skip_zeroing.%=;" \ + "std;" \ + "rep stosb;" \ + "cld;" \ + ".L_skip_zeroing.%=:" + #if __GNUC__ >= 9 # define ssaj_has_attr_noreturn(fn) __builtin_has_attribute(fn, __noreturn__) #else @@ -187,10 +195,24 @@ unsigned long get_stack_dump_bottom (unsigned long sp); #define switch_stack_and_jump(fn, instr, constr) \ ({ \ unsigned int tmp; \ + bool zero_stack = current->domain->arch.asi; \ BUILD_BUG_ON(!ssaj_has_attr_noreturn(fn)); \ + ASSERT(IS_ALIGNED((unsigned long)guest_cpu_user_regs() - \ + PRIMARY_STACK_SIZE + \ + sizeof(struct cpu_info), PAGE_SIZE)); \ + if ( zero_stack ) \ + { \ + unsigned long stack_top = get_stack_bottom() & \ + ~(STACK_SIZE - 1); \ + \ + clear_page((void *)stack_top + IST_MCE * PAGE_SIZE); \ + clear_page((void *)stack_top + IST_NMI * PAGE_SIZE); \ + clear_page((void *)stack_top + IST_DB * PAGE_SIZE); \ + } \ __asm__ __volatile__ ( \ SHADOW_STACK_WORK \ "mov %[stk], %%rsp;" \ + ZERO_STACK \ CHECK_FOR_LIVEPATCH_WORK \ instr "[fun]" \ : [val] "=&r" (tmp), \ @@ -201,7 +223,13 @@ unsigned long get_stack_dump_bottom (unsigned long sp); ((PRIMARY_SHSTK_SLOT + 1) * PAGE_SIZE - 8), \ [stack_mask] "i" (STACK_SIZE - 1), \ _ASM_BUGFRAME_INFO(BUGFRAME_bug, __LINE__, \ - __FILE__, NULL) \ + __FILE__, NULL), \ + /* For stack zeroing. */ \ + "D" ((void *)guest_cpu_user_regs() - 1), \ + [stk_size] "c" \ + (zero_stack ? PRIMARY_STACK_SIZE - sizeof(struct cpu_info)\ + : 0), \ + "a" (0) \ : "memory" ); \ unreachable(); \ })