From patchwork Mon Apr 25 08:42:50 2022
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 12825406
Date: Mon, 25 Apr 2022 10:42:50 +0200
Subject: [PATCH v4 16/21] VT-d: free all-empty page tables
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
Cc: Andrew Cooper, Paul Durrant, Roger Pau Monné, Kevin Tian

When a page table ends up with no present entries left, it can be
replaced by a non-present entry at the next higher level. The page table
itself can then be scheduled for freeing.

Note that while its output isn't used there yet,
pt_update_contig_markers() right away needs to be called in all places
where entries get updated, not just the one where entries get cleared.

Note further that while pt_update_contig_markers() updates perhaps
several PTEs within the table, since these are changes to "avail" bits
only I do not think that cache flushing would be needed afterwards. Such
cache flushing (of entire pages, unless adding yet more logic to be more
selective) would be quite noticeable performance-wise (very prominent
during Dom0 boot).

Signed-off-by: Jan Beulich
Reviewed-by: Kevin Tian
Reviewed-by: Roger Pau Monné
---
v4: Re-base over changes earlier in the series.
v3: Properly bound loop. Re-base over changes earlier in the series.
v2: New.
---
The hang during boot on my Latitude E6410 (see the respective code
comment) was pretty close after iommu_enable_translation(). No errors,
no watchdog would kick in, just sometimes the first few pixel lines of
the next log message's (XEN) prefix would have made it out to the screen
(and there's no serial there). It's been a lot of experimenting until I
figured the workaround (which I consider ugly, but halfway acceptable).
I've been trying hard to make sure the workaround wouldn't be masking a
real issue, yet I'm still wary of it possibly doing so ... My best guess
at this point is that on these old IOMMUs the ignored bits 52...61
aren't really ignored for present entries, but also aren't "reserved"
enough to trigger faults.
This guess is from having tried to set other bits in this range
(unconditionally, and with the workaround here in place), which yielded
the same behavior.

--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -43,6 +43,9 @@
 #include "vtd.h"
 #include "../ats.h"
 
+#define CONTIG_MASK DMA_PTE_CONTIG_MASK
+#include
+
 /* dom_io is used as a sentinel for quarantined devices */
 #define QUARANTINE_SKIP(d, pgd_maddr) ((d) == dom_io && !(pgd_maddr))
 #define DEVICE_DOMID(d, pdev) ((d) != dom_io ? (d)->domain_id \
@@ -405,6 +408,9 @@ static uint64_t addr_to_dma_page_maddr(s
 
             write_atomic(&pte->val, new_pte.val);
             iommu_sync_cache(pte, sizeof(struct dma_pte));
+            pt_update_contig_markers(&parent->val,
+                                     address_level_offset(addr, level),
+                                     level, PTE_kind_table);
         }
 
         if ( --level == target )
@@ -837,9 +843,31 @@ static int dma_pte_clear_one(struct doma
 
     old = *pte;
     dma_clear_pte(*pte);
+    iommu_sync_cache(pte, sizeof(*pte));
+
+    while ( pt_update_contig_markers(&page->val,
+                                     address_level_offset(addr, level),
+                                     level, PTE_kind_null) &&
+            ++level < min_pt_levels )
+    {
+        struct page_info *pg = maddr_to_page(pg_maddr);
+
+        unmap_vtd_domain_page(page);
+
+        pg_maddr = addr_to_dma_page_maddr(domain, addr, level, flush_flags,
+                                          false);
+        BUG_ON(pg_maddr < PAGE_SIZE);
+
+        page = map_vtd_domain_page(pg_maddr);
+        pte = &page[address_level_offset(addr, level)];
+        dma_clear_pte(*pte);
+        iommu_sync_cache(pte, sizeof(*pte));
+
+        *flush_flags |= IOMMU_FLUSHF_all;
+        iommu_queue_free_pgtable(hd, pg);
+    }
 
     spin_unlock(&hd->arch.mapping_lock);
-    iommu_sync_cache(pte, sizeof(struct dma_pte));
 
     unmap_vtd_domain_page(page);
 
@@ -2182,8 +2210,21 @@ static int __must_check cf_check intel_i
     }
 
     *pte = new;
-
     iommu_sync_cache(pte, sizeof(struct dma_pte));
+
+    /*
+     * While the (ab)use of PTE_kind_table here allows to save some work in
+     * the function, the main motivation for it is that it avoids a so far
+     * unexplained hang during boot (while preparing Dom0) on a Westmere
+     * based laptop.
+     */
+    pt_update_contig_markers(&page->val,
+                             address_level_offset(dfn_to_daddr(dfn), level),
+                             level,
+                             (hd->platform_ops->page_sizes &
+                              (1UL << level_to_offset_bits(level + 1))
+                              ? PTE_kind_leaf : PTE_kind_table));
+
     spin_unlock(&hd->arch.mapping_lock);
     unmap_vtd_domain_page(page);
 
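
For readers less familiar with the idea in the description (a table left with no present entries gets replaced by a non-present entry one level up, and is then freed), here is a rough stand-alone sketch. It is a toy model only and not Xen code: every toy_* name, the 4-entry tables and the present-bit layout are made up for illustration. It also differs from the patch in a few ways I'm aware of: emptiness is found by scanning rather than through pt_update_contig_markers(), the table is freed immediately rather than queued via iommu_queue_free_pgtable(), and the loop is bounded by the table depth passed in rather than by min_pt_levels.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_ENTRIES    4   /* hypothetical; real VT-d tables hold 512 entries */
#define TOY_MAX_LEVELS 6   /* hypothetical upper bound on table depth */
#define TOY_PRESENT    1u  /* hypothetical "present" bit */

struct toy_table {
    uint64_t pte[TOY_ENTRIES];            /* bit 0 set: entry is present */
    struct toy_table *child[TOY_ENTRIES]; /* next-level table, levels > 1 only */
};

/* Index into the table at the given level; level 1 is the leaf level. */
static unsigned int toy_index(unsigned int addr, unsigned int level)
{
    for ( unsigned int l = 1; l < level; l++ )
        addr /= TOY_ENTRIES;
    return addr % TOY_ENTRIES;
}

/* Plain scan; the patch instead relies on markers kept in ignored PTE bits. */
static bool toy_all_empty(const struct toy_table *t)
{
    for ( size_t i = 0; i < TOY_ENTRIES; i++ )
        if ( t->pte[i] & TOY_PRESENT )
            return false;
    return true;
}

/* Map one address, allocating intermediate tables as needed. */
static bool toy_map(struct toy_table *root, unsigned int levels,
                    unsigned int addr)
{
    struct toy_table *t = root;

    for ( unsigned int level = levels; level > 1; level-- )
    {
        unsigned int i = toy_index(addr, level);

        if ( !(t->pte[i] & TOY_PRESENT) )
        {
            t->child[i] = calloc(1, sizeof(*t));
            if ( !t->child[i] )
                return false;
            t->pte[i] = TOY_PRESENT;
        }
        t = t->child[i];
    }
    t->pte[toy_index(addr, 1)] = TOY_PRESENT;
    return true;
}

/* Unmap one address; returns how many now-empty tables were freed. */
static unsigned int toy_unmap(struct toy_table *root, unsigned int levels,
                              unsigned int addr)
{
    struct toy_table *path[TOY_MAX_LEVELS + 1];
    struct toy_table *t = root;
    unsigned int level, freed = 0;

    assert(levels >= 1 && levels <= TOY_MAX_LEVELS);

    /* Walk down, remembering the table visited at each level. */
    for ( level = levels; level > 1; level-- )
    {
        unsigned int i = toy_index(addr, level);

        path[level] = t;
        if ( !(t->pte[i] & TOY_PRESENT) )
            return 0;                     /* nothing mapped here */
        t = t->child[i];
    }
    path[1] = t;
    t->pte[toy_index(addr, 1)] = 0;

    /* Walk back up: drop each all-empty table, never the root itself. */
    for ( level = 1; level < levels && toy_all_empty(path[level]); level++ )
    {
        unsigned int i = toy_index(addr, level + 1);

        path[level + 1]->pte[i] = 0;      /* non-present entry one level up */
        path[level + 1]->child[i] = NULL;
        free(path[level]);
        freed++;
    }
    return freed;
}
```

With a zero-initialised root and three levels, mapping addresses 5 and 6 places both in the same leaf table. Unmapping 5 then frees nothing, while unmapping 6 empties the leaf and, in turn, its parent, so both get freed and the root ends up with no present entries.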