From patchwork Tue Jul 11 00:46:04 2023
X-Patchwork-Submitter: Volodymyr Babchuk
X-Patchwork-Id: 13307888
From: Volodymyr Babchuk
To: "xen-devel@lists.xenproject.org"
CC: Volodymyr Babchuk, Jan Beulich, Andrew Cooper, Roger Pau Monné,
 Wei Liu, George Dunlap, Julien Grall, Stefano Stabellini, Jun Nakajima,
 Kevin Tian, Paul Durrant, Oleksandr Tyshchenko, Oleksandr Andrushchenko
Subject: [RFC PATCH] pci: introduce per-domain PCI rwlock
Date: Tue, 11 Jul 2023 00:46:04 +0000
Message-ID: <20230711004537.888185-1-volodymyr_babchuk@epam.com>
X-Mailer: git-send-email 2.41.0

Add a per-domain d->pci_lock that protects access to d->pdev_list. The
purpose of this lock is to guarantee to the vPCI code that the
underlying pdev will not disappear out from under it. Later it will
also protect the pdev->vpci structure and pdev accesses in
modify_bars().

Suggested-by: Roger Pau Monné
Suggested-by: Jan Beulich
Signed-off-by: Volodymyr Babchuk

---

This patch should be part of the vPCI series, but I am posting it as a
single-patch RFC to discuss the changes to the x86 MM and IOMMU code.

I opted to factor out part of the changes from the "vpci: introduce
per-domain lock to protect vpci structure" commit to ease the review
process.
---
 xen/arch/x86/hvm/hvm.c                      |  2 +
 xen/arch/x86/hvm/vmx/vmcs.c                 |  2 +
 xen/arch/x86/mm.c                           |  6 ++
 xen/arch/x86/mm/p2m-pod.c                   |  7 +++
 xen/arch/x86/mm/paging.c                    |  6 ++
 xen/common/domain.c                         |  1 +
 xen/drivers/passthrough/amd/iommu_cmd.c     |  4 +-
 xen/drivers/passthrough/amd/pci_amd_iommu.c | 15 ++++-
 xen/drivers/passthrough/pci.c               | 70 +++++++++++++++++----
 xen/drivers/passthrough/vtd/iommu.c         | 19 +++++-
 xen/include/xen/sched.h                     |  1 +
 11 files changed, 118 insertions(+), 15 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index a67ef79dc0..089fbe38a7 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2381,12 +2381,14 @@ int hvm_set_cr0(unsigned long value, bool may_defer)
         }
     }
 
+    read_lock(&d->pci_lock);
     if ( ((value ^ old_value) & X86_CR0_CD) &&
         is_iommu_enabled(d) && hvm_funcs.handle_cd &&
         (!rangeset_is_empty(d->iomem_caps) ||
          !rangeset_is_empty(d->arch.ioport_caps) ||
          has_arch_pdevs(d)) )
         alternative_vcall(hvm_funcs.handle_cd, v, value);
+    read_unlock(&d->pci_lock);
 
     hvm_update_cr(v, 0, value);
 
diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
index b209563625..88bbcbbd99 100644
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -1889,6 +1889,7 @@ void cf_check vmx_do_resume(void)
          * 2: execute wbinvd on all dirty pCPUs when guest wbinvd exits.
          * If VT-d engine can force snooping, we don't need to do these.
          */
+        read_lock(&v->domain->pci_lock);
         if ( has_arch_pdevs(v->domain) && !iommu_snoop
              && !cpu_has_wbinvd_exiting )
         {
@@ -1896,6 +1897,7 @@ void cf_check vmx_do_resume(void)
             if ( cpu != -1 )
                 flush_mask(cpumask_of(cpu), FLUSH_CACHE);
         }
+        read_unlock(&v->domain->pci_lock);
 
         vmx_clear_vmcs(v);
         vmx_load_vmcs(v);
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index be2b10a391..f1e882a980 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -858,12 +858,15 @@ get_page_from_l1e(
         return 0;
     }
 
+    read_lock(&l1e_owner->pci_lock);
     if ( unlikely(l1f & l1_disallow_mask(l1e_owner)) )
     {
         gdprintk(XENLOG_WARNING, "Bad L1 flags %x\n",
                  l1f & l1_disallow_mask(l1e_owner));
+        read_unlock(&l1e_owner->pci_lock);
         return -EINVAL;
     }
+    read_unlock(&l1e_owner->pci_lock);
 
     valid = mfn_valid(_mfn(mfn));
 
@@ -2142,12 +2145,15 @@ static int mod_l1_entry(l1_pgentry_t *pl1e, l1_pgentry_t nl1e,
     {
         struct page_info *page = NULL;
 
+        read_lock(&pt_dom->pci_lock);
         if ( unlikely(l1e_get_flags(nl1e) & l1_disallow_mask(pt_dom)) )
         {
             gdprintk(XENLOG_WARNING, "Bad L1 flags %x\n",
                      l1e_get_flags(nl1e) & l1_disallow_mask(pt_dom));
+            read_unlock(&pt_dom->pci_lock);
             return -EINVAL;
         }
+        read_unlock(&pt_dom->pci_lock);
 
         /* Translate foreign guest address. */
         if ( cmd != MMU_PT_UPDATE_NO_TRANSLATE &&
diff --git a/xen/arch/x86/mm/p2m-pod.c b/xen/arch/x86/mm/p2m-pod.c
index 9969eb45fa..07e0bedad7 100644
--- a/xen/arch/x86/mm/p2m-pod.c
+++ b/xen/arch/x86/mm/p2m-pod.c
@@ -349,10 +349,12 @@ p2m_pod_set_mem_target(struct domain *d, unsigned long target)
 
     ASSERT( pod_target >= p2m->pod.count );
 
+    read_lock(&d->pci_lock);
     if ( has_arch_pdevs(d) || cache_flush_permitted(d) )
         ret = -ENOTEMPTY;
     else
         ret = p2m_pod_set_cache_target(p2m, pod_target, 1/*preemptible*/);
+    read_unlock(&d->pci_lock);
 
 out:
     pod_unlock(p2m);
@@ -1401,8 +1403,13 @@ guest_physmap_mark_populate_on_demand(struct domain *d, unsigned long gfn,
     if ( !paging_mode_translate(d) )
         return -EINVAL;
 
+    read_lock(&d->pci_lock);
     if ( has_arch_pdevs(d) || cache_flush_permitted(d) )
+    {
+        read_unlock(&d->pci_lock);
         return -ENOTEMPTY;
+    }
+    read_unlock(&d->pci_lock);
 
     do {
         rc = mark_populate_on_demand(d, gfn, chunk_order);
diff --git a/xen/arch/x86/mm/paging.c b/xen/arch/x86/mm/paging.c
index 34d833251b..fb8f7ff7cf 100644
--- a/xen/arch/x86/mm/paging.c
+++ b/xen/arch/x86/mm/paging.c
@@ -205,21 +205,27 @@ static int paging_log_dirty_enable(struct domain *d)
 {
     int ret;
 
+    read_lock(&d->pci_lock);
     if ( has_arch_pdevs(d) )
     {
        /*
         * Refuse to turn on global log-dirty mode
         * if the domain is sharing the P2M with the IOMMU.
         */
+        read_unlock(&d->pci_lock);
        return -EINVAL;
     }
 
     if ( paging_mode_log_dirty(d) )
+    {
+        read_unlock(&d->pci_lock);
        return -EINVAL;
+    }
 
     domain_pause(d);
     ret = d->arch.paging.log_dirty.ops->enable(d);
     domain_unpause(d);
+    read_unlock(&d->pci_lock);
 
     return ret;
 }
diff --git a/xen/common/domain.c b/xen/common/domain.c
index caaa402637..5d8a8836da 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -645,6 +645,7 @@ struct domain *domain_create(domid_t domid,
 
 #ifdef CONFIG_HAS_PCI
     INIT_LIST_HEAD(&d->pdev_list);
+    rwlock_init(&d->pci_lock);
 #endif
 
     /* All error paths can depend on the above setup. */
diff --git a/xen/drivers/passthrough/amd/iommu_cmd.c b/xen/drivers/passthrough/amd/iommu_cmd.c
index 40ddf366bb..b67aee31f6 100644
--- a/xen/drivers/passthrough/amd/iommu_cmd.c
+++ b/xen/drivers/passthrough/amd/iommu_cmd.c
@@ -308,11 +308,12 @@ void amd_iommu_flush_iotlb(u8 devfn, const struct pci_dev *pdev,
     flush_command_buffer(iommu, iommu_dev_iotlb_timeout);
 }
 
-static void amd_iommu_flush_all_iotlbs(const struct domain *d, daddr_t daddr,
+static void amd_iommu_flush_all_iotlbs(struct domain *d, daddr_t daddr,
                                        unsigned int order)
 {
     struct pci_dev *pdev;
 
+    read_lock(&d->pci_lock);
     for_each_pdev( d, pdev )
     {
         u8 devfn = pdev->devfn;
@@ -323,6 +324,7 @@ static void amd_iommu_flush_all_iotlbs(const struct domain *d, daddr_t daddr,
         } while ( devfn != pdev->devfn &&
                   PCI_SLOT(devfn) == PCI_SLOT(pdev->devfn) );
     }
+    read_unlock(&d->pci_lock);
 }
 
 /* Flush iommu cache after p2m changes. */
diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c b/xen/drivers/passthrough/amd/pci_amd_iommu.c
index 94e3775506..8541b66a93 100644
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -102,6 +102,8 @@ static bool any_pdev_behind_iommu(const struct domain *d,
 {
     const struct pci_dev *pdev;
 
+    ASSERT(rw_is_locked(&d->pci_lock));
+
     for_each_pdev ( d, pdev )
     {
         if ( pdev == exclude )
@@ -467,17 +469,26 @@ static int cf_check reassign_device(
 
     if ( !QUARANTINE_SKIP(target, pdev) )
     {
+        read_lock(&target->pci_lock);
         rc = amd_iommu_setup_domain_device(target, iommu, devfn, pdev);
+        read_unlock(&target->pci_lock);
         if ( rc )
             return rc;
     }
     else
         amd_iommu_disable_domain_device(source, iommu, devfn, pdev);
 
     if ( devfn == pdev->devfn && pdev->domain != target )
     {
-        list_move(&pdev->domain_list, &target->pdev_list);
+        write_lock(&pdev->domain->pci_lock);
+        list_del(&pdev->domain_list);
+        write_unlock(&pdev->domain->pci_lock);
+
+        write_lock(&target->pci_lock);
+        list_add(&pdev->domain_list, &target->pdev_list);
+        write_unlock(&target->pci_lock);
+
         pdev->domain = target;
     }
 
     /*
@@ -628,12 +639,14 @@ static int cf_check amd_iommu_add_device(u8 devfn, struct pci_dev *pdev)
         fresh_domid = true;
     }
 
+    read_lock(&pdev->domain->pci_lock);
     ret = amd_iommu_setup_domain_device(pdev->domain, iommu, devfn, pdev);
     if ( ret && fresh_domid )
     {
         iommu_free_domid(pdev->arch.pseudo_domid, iommu->domid_map);
         pdev->arch.pseudo_domid = DOMID_INVALID;
     }
+    read_unlock(&pdev->domain->pci_lock);
 
     return ret;
 }
diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
index 07d1986d33..1831e1b0c0 100644
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -454,7 +454,9 @@ static void __init _pci_hide_device(struct pci_dev *pdev)
     if ( pdev->domain )
         return;
     pdev->domain = dom_xen;
+    write_lock(&dom_xen->pci_lock);
     list_add(&pdev->domain_list, &dom_xen->pdev_list);
+    write_unlock(&dom_xen->pci_lock);
 }
 
 int __init pci_hide_device(unsigned int seg, unsigned int bus,
@@ -530,7 +532,7 @@ struct pci_dev *pci_get_pdev(const struct domain *d, pci_sbdf_t sbdf)
 {
     struct pci_dev *pdev;
 
-    ASSERT(d || pcidevs_locked());
+    ASSERT((d && rw_is_locked(&d->pci_lock)) || pcidevs_locked());
 
     /*
      * The hardware domain owns the majority of the devices in the system.
@@ -748,7 +750,9 @@ int pci_add_device(u16 seg, u8 bus, u8 devfn,
         if ( !pdev->domain )
         {
             pdev->domain = hardware_domain;
+            write_lock(&hardware_domain->pci_lock);
             list_add(&pdev->domain_list, &hardware_domain->pdev_list);
+            write_unlock(&hardware_domain->pci_lock);
 
             /*
              * For devices not discovered by Xen during boot, add vPCI handlers
@@ -887,26 +891,62 @@ static int deassign_device(struct domain *d, uint16_t seg, uint8_t bus,
 
 int pci_release_devices(struct domain *d)
 {
-    struct pci_dev *pdev, *tmp;
-    u8 bus, devfn;
-    int ret;
+    int combined_ret;
+    LIST_HEAD(failed_pdevs);
 
     pcidevs_lock();
-    ret = arch_pci_clean_pirqs(d);
-    if ( ret )
+    write_lock(&d->pci_lock);
+    combined_ret = arch_pci_clean_pirqs(d);
+    if ( combined_ret )
     {
         pcidevs_unlock();
-        return ret;
+        write_unlock(&d->pci_lock);
+        return combined_ret;
     }
-    list_for_each_entry_safe ( pdev, tmp, &d->pdev_list, domain_list )
+
+    while ( !list_empty(&d->pdev_list) )
     {
-        bus = pdev->bus;
-        devfn = pdev->devfn;
-        ret = deassign_device(d, pdev->seg, bus, devfn) ?: ret;
+        struct pci_dev *pdev = list_first_entry(&d->pdev_list,
+                                                struct pci_dev,
+                                                domain_list);
+        uint16_t seg = pdev->seg;
+        uint8_t bus = pdev->bus;
+        uint8_t devfn = pdev->devfn;
+        int ret;
+
+        write_unlock(&d->pci_lock);
+        ret = deassign_device(d, seg, bus, devfn);
+        write_lock(&d->pci_lock);
+
+        if ( ret )
+        {
+            bool still_present = false;
+            const struct pci_dev *tmp;
+
+            /*
+             * We need to check whether deassign_device() left our pdev in
+             * the domain's list. As we dropped the lock, we can't be sure
+             * that the list wasn't reshuffled in some arbitrary way, so we
+             * have to traverse the whole list.
+             */
+            for_each_pdev ( d, tmp )
+            {
+                if ( tmp == pdev )
+                {
+                    still_present = true;
+                    break;
+                }
+            }
+            if ( still_present )
+                list_move(&pdev->domain_list, &failed_pdevs);
+            combined_ret = ret;
+        }
     }
+
+    list_splice(&failed_pdevs, &d->pdev_list);
+    write_unlock(&d->pci_lock);
     pcidevs_unlock();
 
-    return ret;
+    return combined_ret;
 }
 
 #define PCI_CLASS_BRIDGE_HOST 0x0600
@@ -1125,7 +1165,9 @@ static int __hwdom_init cf_check _setup_hwdom_pci_devices(
         if ( !pdev->domain )
         {
             pdev->domain = ctxt->d;
+            write_lock(&ctxt->d->pci_lock);
             list_add(&pdev->domain_list, &ctxt->d->pdev_list);
+            write_unlock(&ctxt->d->pci_lock);
             setup_one_hwdom_device(ctxt, pdev);
         }
         else if ( pdev->domain == dom_xen )
@@ -1487,6 +1529,7 @@ static int iommu_get_device_group(
         return group_id;
 
     pcidevs_lock();
+    read_lock(&d->pci_lock);
     for_each_pdev( d, pdev )
     {
         unsigned int b = pdev->bus;
@@ -1501,6 +1544,7 @@ static int iommu_get_device_group(
             sdev_id = iommu_call(ops, get_device_group_id, seg, b, df);
             if ( sdev_id < 0 )
             {
+                read_unlock(&d->pci_lock);
                 pcidevs_unlock();
                 return sdev_id;
             }
@@ -1511,6 +1555,7 @@ static int iommu_get_device_group(
 
             if ( unlikely(copy_to_guest_offset(buf, i, &bdf, 1)) )
             {
+                read_unlock(&d->pci_lock);
                 pcidevs_unlock();
                 return -EFAULT;
             }
@@ -1518,6 +1563,7 @@ static int iommu_get_device_group(
         }
     }
 
+    read_unlock(&d->pci_lock);
     pcidevs_unlock();
 
     return i;
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 0e3062c820..6a36cc18fe 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -186,6 +186,8 @@ static bool any_pdev_behind_iommu(const struct domain *d,
 {
     const struct pci_dev *pdev;
 
+    ASSERT(rw_is_locked(&d->pci_lock));
+
     for_each_pdev ( d, pdev )
     {
         const struct acpi_drhd_unit *drhd;
@@ -2765,6 +2767,7 @@ static int cf_check reassign_device_ownership(
 
     if ( !QUARANTINE_SKIP(target, pdev->arch.vtd.pgd_maddr) )
     {
+        read_lock(&target->pci_lock);
         if ( !has_arch_pdevs(target) )
             vmx_pi_hooks_assign(target);
 
@@ -2780,21 +2783,26 @@ static int cf_check reassign_device_ownership(
 #endif
 
         ret = domain_context_mapping(target, devfn, pdev);
+        read_unlock(&target->pci_lock);
 
         if ( !ret && pdev->devfn == devfn &&
              !QUARANTINE_SKIP(source, pdev->arch.vtd.pgd_maddr) )
         {
             const struct acpi_drhd_unit *drhd = acpi_find_matched_drhd_unit(pdev);
 
+            read_lock(&source->pci_lock);
             if ( drhd )
                 check_cleanup_domid_map(source, pdev, drhd->iommu);
+            read_unlock(&source->pci_lock);
         }
     }
     else
     {
         const struct acpi_drhd_unit *drhd;
 
+        read_lock(&source->pci_lock);
         drhd = domain_context_unmap(source, devfn, pdev);
+        read_unlock(&source->pci_lock);
 
         ret = IS_ERR(drhd) ? PTR_ERR(drhd) : 0;
     }
@@ -2806,12 +2814,21 @@ static int cf_check reassign_device_ownership(
 
     if ( devfn == pdev->devfn && pdev->domain != target )
     {
-        list_move(&pdev->domain_list, &target->pdev_list);
+        write_lock(&pdev->domain->pci_lock);
+        list_del(&pdev->domain_list);
+        write_unlock(&pdev->domain->pci_lock);
+
+        write_lock(&target->pci_lock);
+        list_add(&pdev->domain_list, &target->pdev_list);
+        write_unlock(&target->pci_lock);
+
         pdev->domain = target;
     }
 
+    read_lock(&source->pci_lock);
     if ( !has_arch_pdevs(source) )
         vmx_pi_hooks_deassign(source);
+    read_unlock(&source->pci_lock);
 
     /*
      * If the device belongs to the hardware domain, and it has RMRR, don't
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 85242a73d3..80dd150bbf 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -460,6 +460,7 @@ struct domain
 
 #ifdef CONFIG_HAS_PCI
     struct list_head pdev_list;
+    rwlock_t pci_lock;
 #endif
 
 #ifdef CONFIG_HAS_PASSTHROUGH
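
As a note for reviewers, the sketch below (not part of the patch)
summarizes the locking discipline this change establishes around
d->pdev_list: traversals take d->pci_lock in read mode, while adding a
device to or removing it from a domain's list takes the lock in write
mode. The helper names pdev_count() and pci_move_pdev() are made up
purely for illustration; only for_each_pdev, the list helpers, and the
rwlock primitives below are real.

    /* Read side: d->pci_lock in read mode keeps d->pdev_list stable. */
    static unsigned int pdev_count(struct domain *d)
    {
        const struct pci_dev *pdev;
        unsigned int n = 0;

        read_lock(&d->pci_lock);
        for_each_pdev ( d, pdev )
            n++;
        read_unlock(&d->pci_lock);

        return n;
    }

    /* Write side: moving a pdev between two domains' lists, as done in
     * the reassign paths above. Each list is mutated under its owner's
     * lock taken in write mode. */
    static void pci_move_pdev(struct pci_dev *pdev, struct domain *target)
    {
        write_lock(&pdev->domain->pci_lock);
        list_del(&pdev->domain_list);
        write_unlock(&pdev->domain->pci_lock);

        write_lock(&target->pci_lock);
        list_add(&pdev->domain_list, &target->pdev_list);
        write_unlock(&target->pci_lock);

        pdev->domain = target;
    }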