From patchwork Thu Oct 21 15:23:18 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anthony Krowiak X-Patchwork-Id: 12575557 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C2428C433F5 for ; Thu, 21 Oct 2021 15:23:56 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A9C9960C49 for ; Thu, 21 Oct 2021 15:23:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231911AbhJUP0L (ORCPT ); Thu, 21 Oct 2021 11:26:11 -0400 Received: from mx0a-001b2d01.pphosted.com ([148.163.156.1]:8522 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231624AbhJUP0K (ORCPT ); Thu, 21 Oct 2021 11:26:10 -0400 Received: from pps.filterd (m0098393.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 19LEBJ9o022350; Thu, 21 Oct 2021 11:23:52 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pp1; bh=QpuFna1xX5sy6xRT97SECUImIxZyeGI772b9K7FKfDc=; b=qX3ct10BTVqBYFax7V3cgxF+zs2x/I6kYfhlqcY74cT5hJB28Lo8LTuEZ8i93566pORi g8WIN6aLJ23yJ8DM3Bqdgiq75FksNyo8k/WcIykfiw1mwDZzMJ2QW95f+t0jGRorXnc6 dMsUTwC7cu61jRtt7HV8irKE/ddROlak0ArMG7RokE2yCXPJWLUM8tmsKe8m2qwiNcrg 0xBfKtMAVpQ/Z8PYPeReb0uJk48ZY9TXxw5fINjLOQYSqe+4hv+tm/i+KH2xYYy9QGD+ tPazHrA4of29XpMa1fhOhYzTMyDPXDyaqh5rCUSFJNpy+i4QESY9vgxQp6Z4zxNBzeDg ZA== Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com with ESMTP id 3bty9eqrxc-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 21 Oct 2021 11:23:52 -0400 Received: from m0098393.ppops.net (m0098393.ppops.net [127.0.0.1]) by pps.reinject (8.16.0.43/8.16.0.43) with SMTP id 19LFChIn009134; Thu, 21 Oct 2021 11:23:52 -0400 Received: from ppma05wdc.us.ibm.com (1b.90.2fa9.ip4.static.sl-reverse.com [169.47.144.27]) by mx0a-001b2d01.pphosted.com with ESMTP id 3bty9eqrwn-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 21 Oct 2021 11:23:52 -0400 Received: from pps.filterd (ppma05wdc.us.ibm.com [127.0.0.1]) by ppma05wdc.us.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 19LF49bY028003; Thu, 21 Oct 2021 15:23:50 GMT Received: from b03cxnp07029.gho.boulder.ibm.com (b03cxnp07029.gho.boulder.ibm.com [9.17.130.16]) by ppma05wdc.us.ibm.com with ESMTP id 3bqpccyq6u-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 21 Oct 2021 15:23:50 +0000 Received: from b03ledav005.gho.boulder.ibm.com (b03ledav005.gho.boulder.ibm.com [9.17.130.236]) by b03cxnp07029.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 19LFNnvP27787606 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 21 Oct 2021 15:23:49 GMT Received: from b03ledav005.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 5A291BE051; Thu, 21 Oct 2021 15:23:49 +0000 (GMT) Received: from b03ledav005.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id E4E42BE05B; Thu, 21 Oct 2021 15:23:44 +0000 (GMT) Received: from cpe-172-100-181-211.stny.res.rr.com.com (unknown [9.160.98.118]) by b03ledav005.gho.boulder.ibm.com (Postfix) with ESMTP; Thu, 21 Oct 2021 15:23:44 +0000 (GMT) From: Tony Krowiak To: linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: jjherne@linux.ibm.com, freude@linux.ibm.com, borntraeger@de.ibm.com, cohuck@redhat.com, mjrosato@linux.ibm.com, pasic@linux.ibm.com, alex.williamson@redhat.com, kwankhede@nvidia.com, fiuczy@linux.ibm.com, Tony Krowiak Subject: [PATCH v17 01/15] s390/vfio-ap: Set pqap hook when vfio_ap module is loaded Date: Thu, 21 Oct 2021 11:23:18 -0400 Message-Id: <20211021152332.70455-2-akrowiak@linux.ibm.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20211021152332.70455-1-akrowiak@linux.ibm.com> References: <20211021152332.70455-1-akrowiak@linux.ibm.com> MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: basTLTTw5JCvrp3mfTGEeXXaLqKx3vxs X-Proofpoint-GUID: PZB5RJa2rrH50o1_ierJJzataucwyNwp X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.182.1,Aquarius:18.0.790,Hydra:6.0.425,FMLib:17.0.607.475 definitions=2021-10-21_04,2021-10-21_02,2020-04-07_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 mlxscore=0 priorityscore=1501 malwarescore=0 spamscore=0 phishscore=0 lowpriorityscore=0 mlxlogscore=999 bulkscore=0 impostorscore=0 clxscore=1015 adultscore=0 suspectscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2109230001 definitions=main-2110210079 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Rather than storing the function pointer to the PQAP(AQIC) instruction interception handler with the mediated device (struct ap_matrix_mdev) when the vfio_ap device driver is notified that the KVM point is being set, let's store it once in a global variable when the vfio_ap module is loaded. There are three reasons for doing this: 1. The lifetime of the interception handler function coincides with the lifetime of the vfio_ap module, so it makes little sense to tie it to the mediated device and complete sense to tie it to the module in which the function resides. 2. The setting/clearing of the function pointer is protected by a mutex lock. This increases the number of locks taken during VFIO_GROUP_NOTIFY_SET_KVM notification and increases the complexity of ensuring locking integrity and avoiding circular lock dependencies. 3. The lock will only be taken for writing twice: When the vfio_ap module is loaded; and, when the vfio_ap module is removed. As it stands now, the lock is taken for writing whenever a guest is started or terminated. Signed-off-by: Tony Krowiak --- arch/s390/include/asm/kvm_host.h | 10 ++++-- arch/s390/kvm/kvm-s390.c | 1 - arch/s390/kvm/priv.c | 45 ++++++++++++++++++++++----- drivers/s390/crypto/vfio_ap_ops.c | 27 ++++++++-------- drivers/s390/crypto/vfio_ap_private.h | 1 - 5 files changed, 58 insertions(+), 26 deletions(-) diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h index a604d51acfc8..05569d077d7f 100644 --- a/arch/s390/include/asm/kvm_host.h +++ b/arch/s390/include/asm/kvm_host.h @@ -799,16 +799,17 @@ struct kvm_s390_cpu_model { unsigned short ibc; }; -typedef int (*crypto_hook)(struct kvm_vcpu *vcpu); +struct kvm_s390_crypto_hook { + int (*fcn)(struct kvm_vcpu *vcpu); +}; struct kvm_s390_crypto { struct kvm_s390_crypto_cb *crycb; - struct rw_semaphore pqap_hook_rwsem; - crypto_hook *pqap_hook; __u32 crycbd; __u8 aes_kw; __u8 dea_kw; __u8 apie; + void *data; }; #define APCB0_MASK_SIZE 1 @@ -998,6 +999,9 @@ extern char sie_exit; extern int kvm_s390_gisc_register(struct kvm *kvm, u32 gisc); extern int kvm_s390_gisc_unregister(struct kvm *kvm, u32 gisc); +extern int kvm_s390_pqap_hook_register(struct kvm_s390_crypto_hook *hook); +extern int kvm_s390_pqap_hook_unregister(struct kvm_s390_crypto_hook *hook); + static inline void kvm_arch_hardware_disable(void) {} static inline void kvm_arch_sync_events(struct kvm *kvm) {} static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {} diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c index 6a6dd5e1daf6..c91981599328 100644 --- a/arch/s390/kvm/kvm-s390.c +++ b/arch/s390/kvm/kvm-s390.c @@ -2649,7 +2649,6 @@ static void kvm_s390_crypto_init(struct kvm *kvm) { kvm->arch.crypto.crycb = &kvm->arch.sie_page2->crycb; kvm_s390_set_crycb_format(kvm); - init_rwsem(&kvm->arch.crypto.pqap_hook_rwsem); if (!test_kvm_facility(kvm, 76)) return; diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c index 53da4ceb16a3..3d91ff934c0c 100644 --- a/arch/s390/kvm/priv.c +++ b/arch/s390/kvm/priv.c @@ -31,6 +31,39 @@ #include "kvm-s390.h" #include "trace.h" +DEFINE_MUTEX(pqap_hook_lock); +static struct kvm_s390_crypto_hook *pqap_hook; + +int kvm_s390_pqap_hook_register(struct kvm_s390_crypto_hook *hook) +{ + int ret = 0; + + mutex_lock(&pqap_hook_lock); + if (pqap_hook) + ret = -EACCES; + else + pqap_hook = hook; + mutex_unlock(&pqap_hook_lock); + + return ret; +} +EXPORT_SYMBOL_GPL(kvm_s390_pqap_hook_register); + +int kvm_s390_pqap_hook_unregister(struct kvm_s390_crypto_hook *hook) +{ + int ret = 0; + + mutex_lock(&pqap_hook_lock); + if (hook != pqap_hook) + ret = -EACCES; + else + pqap_hook = NULL; + mutex_unlock(&pqap_hook_lock); + + return ret; +} +EXPORT_SYMBOL_GPL(kvm_s390_pqap_hook_unregister); + static int handle_ri(struct kvm_vcpu *vcpu) { vcpu->stat.instruction_ri++; @@ -610,7 +643,6 @@ static int handle_io_inst(struct kvm_vcpu *vcpu) static int handle_pqap(struct kvm_vcpu *vcpu) { struct ap_queue_status status = {}; - crypto_hook pqap_hook; unsigned long reg0; int ret; uint8_t fc; @@ -659,16 +691,15 @@ static int handle_pqap(struct kvm_vcpu *vcpu) * hook function pointer in the kvm_s390_crypto structure. Lock the * owner, retrieve the hook function pointer and call the hook. */ - down_read(&vcpu->kvm->arch.crypto.pqap_hook_rwsem); - if (vcpu->kvm->arch.crypto.pqap_hook) { - pqap_hook = *vcpu->kvm->arch.crypto.pqap_hook; - ret = pqap_hook(vcpu); + mutex_lock(&pqap_hook_lock); + if (pqap_hook) { + ret = pqap_hook->fcn(vcpu); if (!ret && vcpu->run->s.regs.gprs[1] & 0x00ff0000) kvm_s390_set_psw_cc(vcpu, 3); - up_read(&vcpu->kvm->arch.crypto.pqap_hook_rwsem); + mutex_unlock(&pqap_hook_lock); return ret; } - up_read(&vcpu->kvm->arch.crypto.pqap_hook_rwsem); + mutex_unlock(&pqap_hook_lock); /* * A vfio_driver must register a hook. * No hook means no driver to enable the SIE CRYCB and no queues. diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c index 94c1c9bd58ad..02275d246b39 100644 --- a/drivers/s390/crypto/vfio_ap_ops.c +++ b/drivers/s390/crypto/vfio_ap_ops.c @@ -293,13 +293,10 @@ static int handle_pqap(struct kvm_vcpu *vcpu) apqn = vcpu->run->s.regs.gprs[0] & 0xffff; mutex_lock(&matrix_dev->lock); - if (!vcpu->kvm->arch.crypto.pqap_hook) - goto out_unlock; - matrix_mdev = container_of(vcpu->kvm->arch.crypto.pqap_hook, - struct ap_matrix_mdev, pqap_hook); + matrix_mdev = vcpu->kvm->arch.crypto.data; /* If the there is no guest using the mdev, there is nothing to do */ - if (!matrix_mdev->kvm) + if (!matrix_mdev || !matrix_mdev->kvm) goto out_unlock; q = vfio_ap_get_queue(matrix_mdev, apqn); @@ -348,7 +345,6 @@ static int vfio_ap_mdev_probe(struct mdev_device *mdev) matrix_mdev->mdev = mdev; vfio_ap_matrix_init(&matrix_dev->info, &matrix_mdev->matrix); - matrix_mdev->pqap_hook = handle_pqap; mutex_lock(&matrix_dev->lock); list_add(&matrix_mdev->node, &matrix_dev->mdev_list); mutex_unlock(&matrix_dev->lock); @@ -1078,10 +1074,6 @@ static int vfio_ap_mdev_set_kvm(struct ap_matrix_mdev *matrix_mdev, struct ap_matrix_mdev *m; if (kvm->arch.crypto.crycbd) { - down_write(&kvm->arch.crypto.pqap_hook_rwsem); - kvm->arch.crypto.pqap_hook = &matrix_mdev->pqap_hook; - up_write(&kvm->arch.crypto.pqap_hook_rwsem); - mutex_lock(&kvm->lock); mutex_lock(&matrix_dev->lock); @@ -1095,6 +1087,7 @@ static int vfio_ap_mdev_set_kvm(struct ap_matrix_mdev *matrix_mdev, kvm_get_kvm(kvm); matrix_mdev->kvm = kvm; + kvm->arch.crypto.data = matrix_mdev; kvm_arch_crypto_set_masks(kvm, matrix_mdev->matrix.apm, matrix_mdev->matrix.aqm, @@ -1155,16 +1148,13 @@ static void vfio_ap_mdev_unset_kvm(struct ap_matrix_mdev *matrix_mdev, struct kvm *kvm) { if (kvm && kvm->arch.crypto.crycbd) { - down_write(&kvm->arch.crypto.pqap_hook_rwsem); - kvm->arch.crypto.pqap_hook = NULL; - up_write(&kvm->arch.crypto.pqap_hook_rwsem); - mutex_lock(&kvm->lock); mutex_lock(&matrix_dev->lock); kvm_arch_crypto_clear_masks(kvm); vfio_ap_mdev_reset_queues(matrix_mdev); kvm_put_kvm(kvm); + kvm->arch.crypto.data = NULL; matrix_mdev->kvm = NULL; mutex_unlock(&kvm->lock); @@ -1391,12 +1381,20 @@ static const struct mdev_parent_ops vfio_ap_matrix_ops = { .supported_type_groups = vfio_ap_mdev_type_groups, }; +static struct kvm_s390_crypto_hook pqap_hook = { + .fcn = handle_pqap, +}; + int vfio_ap_mdev_register(void) { int ret; atomic_set(&matrix_dev->available_instances, MAX_ZDEV_ENTRIES_EXT); + ret = kvm_s390_pqap_hook_register(&pqap_hook); + if (ret) + return ret; + ret = mdev_register_driver(&vfio_ap_matrix_driver); if (ret) return ret; @@ -1413,6 +1411,7 @@ int vfio_ap_mdev_register(void) void vfio_ap_mdev_unregister(void) { + WARN_ON(kvm_s390_pqap_hook_unregister(&pqap_hook)); mdev_unregister_device(&matrix_dev->device); mdev_unregister_driver(&vfio_ap_matrix_driver); } diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h index 648fcaf8104a..907f41160de7 100644 --- a/drivers/s390/crypto/vfio_ap_private.h +++ b/drivers/s390/crypto/vfio_ap_private.h @@ -97,7 +97,6 @@ struct ap_matrix_mdev { struct notifier_block group_notifier; struct notifier_block iommu_notifier; struct kvm *kvm; - crypto_hook pqap_hook; struct mdev_device *mdev; }; From patchwork Thu Oct 21 15:23:19 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anthony Krowiak X-Patchwork-Id: 12575561 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B9474C433EF for ; Thu, 21 Oct 2021 15:24:09 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9F2B260C49 for ; Thu, 21 Oct 2021 15:24:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231945AbhJUP0Y (ORCPT ); Thu, 21 Oct 2021 11:26:24 -0400 Received: from mx0b-001b2d01.pphosted.com ([148.163.158.5]:57722 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S231949AbhJUP0R (ORCPT ); Thu, 21 Oct 2021 11:26:17 -0400 Received: from pps.filterd (m0098420.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 19LEBMdj003711; Thu, 21 Oct 2021 11:23:55 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pp1; bh=UjzL5IT9oscSWd5cGZj5JMb5Z95F8fJs2wvedapJfQc=; b=S9OaIy84jUU+46+sCF010mOLugYqKO+BCUPBRKv14tlP/ODO85sBw6Ar0mLSD0HrvaJ7 /H4cLvEJMkxfSDDRt1077JUhrk43W+WX3B16Ch0+O1PNEdbbfhtYWh4eaVJdgYusegm1 d1VijbSOQTBmht8a3BpFgvMHVeGwFUDoHs4NPssg6Uc8XhoMaZCLYwt+v8gnwjwR5ibC W84qC5XKanBbjM8MmMf6l7xmatrIMuMLrKHkZklSA4f7I68e8U0KbzhS2nveSuZigHAc P3D10L6hPPg35WDDxbisSVH/P0RNUjSAcZ9ba0h2+kmq07Zx97LapIwIpjsH2XZ7M9eI wQ== Received: from pps.reinject (localhost [127.0.0.1]) by mx0b-001b2d01.pphosted.com with ESMTP id 3bu8kkksa5-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 21 Oct 2021 11:23:55 -0400 Received: from m0098420.ppops.net (m0098420.ppops.net [127.0.0.1]) by pps.reinject (8.16.0.43/8.16.0.43) with SMTP id 19LFJG0E010601; Thu, 21 Oct 2021 11:23:54 -0400 Received: from ppma03wdc.us.ibm.com (ba.79.3fa9.ip4.static.sl-reverse.com [169.63.121.186]) by mx0b-001b2d01.pphosted.com with ESMTP id 3bu8kkks9x-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 21 Oct 2021 11:23:54 -0400 Received: from pps.filterd (ppma03wdc.us.ibm.com [127.0.0.1]) by ppma03wdc.us.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 19LF47VX011122; Thu, 21 Oct 2021 15:23:53 GMT Received: from b03cxnp07027.gho.boulder.ibm.com (b03cxnp07027.gho.boulder.ibm.com [9.17.130.14]) by ppma03wdc.us.ibm.com with ESMTP id 3bt4st18dc-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 21 Oct 2021 15:23:53 +0000 Received: from b03ledav005.gho.boulder.ibm.com (b03ledav005.gho.boulder.ibm.com [9.17.130.236]) by b03cxnp07027.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 19LFNqvj36962664 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 21 Oct 2021 15:23:52 GMT Received: from b03ledav005.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 7963DBE05D; Thu, 21 Oct 2021 15:23:52 +0000 (GMT) Received: from b03ledav005.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id DE429BE061; Thu, 21 Oct 2021 15:23:49 +0000 (GMT) Received: from cpe-172-100-181-211.stny.res.rr.com.com (unknown [9.160.98.118]) by b03ledav005.gho.boulder.ibm.com (Postfix) with ESMTP; Thu, 21 Oct 2021 15:23:49 +0000 (GMT) From: Tony Krowiak To: linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: jjherne@linux.ibm.com, freude@linux.ibm.com, borntraeger@de.ibm.com, cohuck@redhat.com, mjrosato@linux.ibm.com, pasic@linux.ibm.com, alex.williamson@redhat.com, kwankhede@nvidia.com, fiuczy@linux.ibm.com, Tony Krowiak Subject: [PATCH v17 02/15] s390/vfio-ap: use new AP bus interface to search for queue devices Date: Thu, 21 Oct 2021 11:23:19 -0400 Message-Id: <20211021152332.70455-3-akrowiak@linux.ibm.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20211021152332.70455-1-akrowiak@linux.ibm.com> References: <20211021152332.70455-1-akrowiak@linux.ibm.com> MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: pX1oti3Tt5i8CnabncTxV88lUVGY5YMT X-Proofpoint-GUID: pNwyOrjhPlYjK56ym30Pw7V8UIWJZJpm X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.182.1,Aquarius:18.0.790,Hydra:6.0.425,FMLib:17.0.607.475 definitions=2021-10-21_04,2021-10-21_02,2020-04-07_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 priorityscore=1501 bulkscore=0 malwarescore=0 spamscore=0 lowpriorityscore=0 impostorscore=0 clxscore=1015 mlxlogscore=999 suspectscore=0 mlxscore=0 phishscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2109230001 definitions=main-2110210079 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org This patch refactors the vfio_ap device driver to use the AP bus's ap_get_qdev() function to retrieve the vfio_ap_queue struct containing information about a queue that is bound to the vfio_ap device driver. The bus's ap_get_qdev() function retrieves the queue device from a hashtable keyed by APQN. This is much more efficient than looping over the list of devices attached to the AP bus by several orders of magnitude. Signed-off-by: Tony Krowiak Reviewed-by: Halil Pasic Reviewed-by: Jason J. Herne --- drivers/s390/crypto/vfio_ap_ops.c | 23 +++++++++-------------- 1 file changed, 9 insertions(+), 14 deletions(-) diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c index 02275d246b39..7a9d1b04ecd4 100644 --- a/drivers/s390/crypto/vfio_ap_ops.c +++ b/drivers/s390/crypto/vfio_ap_ops.c @@ -28,13 +28,6 @@ static int vfio_ap_mdev_reset_queues(struct ap_matrix_mdev *matrix_mdev); static struct vfio_ap_queue *vfio_ap_find_queue(int apqn); static const struct vfio_device_ops vfio_ap_matrix_dev_ops; -static int match_apqn(struct device *dev, const void *data) -{ - struct vfio_ap_queue *q = dev_get_drvdata(dev); - - return (q->apqn == *(int *)(data)) ? 1 : 0; -} - /** * vfio_ap_get_queue - retrieve a queue with a specific APQN from a list * @matrix_mdev: the associated mediated matrix @@ -1183,15 +1176,17 @@ static int vfio_ap_mdev_group_notifier(struct notifier_block *nb, static struct vfio_ap_queue *vfio_ap_find_queue(int apqn) { - struct device *dev; + struct ap_queue *queue; struct vfio_ap_queue *q = NULL; - dev = driver_find_device(&matrix_dev->vfio_ap_drv->driver, NULL, - &apqn, match_apqn); - if (dev) { - q = dev_get_drvdata(dev); - put_device(dev); - } + queue = ap_get_qdev(apqn); + if (!queue) + return NULL; + + if (queue->ap_dev.device.driver == &matrix_dev->vfio_ap_drv->driver) + q = dev_get_drvdata(&queue->ap_dev.device); + + put_device(&queue->ap_dev.device); return q; } From patchwork Thu Oct 21 15:23:20 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anthony Krowiak X-Patchwork-Id: 12575559 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 25141C433FE for ; Thu, 21 Oct 2021 15:24:08 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 0DED461130 for ; Thu, 21 Oct 2021 15:24:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231982AbhJUP0V (ORCPT ); Thu, 21 Oct 2021 11:26:21 -0400 Received: from mx0b-001b2d01.pphosted.com ([148.163.158.5]:52658 "EHLO mx0b-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231941AbhJUP0R (ORCPT ); Thu, 21 Oct 2021 11:26:17 -0400 Received: from pps.filterd (m0098421.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 19LEBKhi028384; Thu, 21 Oct 2021 11:23:59 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pp1; bh=1efW7U7mIAEiw5wxlXvRq+FFw7s4wdylizeIXJP9CrE=; b=JAcZK3QmHfXXTAu6t/yeXqfPwtGYmFbyumIbf7GNle1ft7bAhP0vrRGxizUDM5guYa0W lxmcgPm5BD8aivQi11ALAxEn4gLgePxNRvNxPJnyOWuufz1cG+GBcCDOquAVF+Zyv7wN kF+hdMpqRaqbainkP8vRdcOCZ5PccIo6lNXtoGypeaCjZSww4QK846WNpb+jN1BnToUH 4UICZGN2urfInVsKIklY1idm1y/WZ+JkyTcNBVoXml4ZcM5zIJDAw6RCWBe2Nf9vj/dG Q/k3sFjBqzoA1PD66VyklpY8qS7xZFkR18WT1sp8sVt4lttls2EYvXbYO0+QpkNWeNlO mQ== Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com with ESMTP id 3bu9fmtc97-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 21 Oct 2021 11:23:59 -0400 Received: from m0098421.ppops.net (m0098421.ppops.net [127.0.0.1]) by pps.reinject (8.16.0.43/8.16.0.43) with SMTP id 19LEaoKc031473; Thu, 21 Oct 2021 11:23:59 -0400 Received: from ppma01dal.us.ibm.com (83.d6.3fa9.ip4.static.sl-reverse.com [169.63.214.131]) by mx0a-001b2d01.pphosted.com with ESMTP id 3bu9fmtc8y-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 21 Oct 2021 11:23:59 -0400 Received: from pps.filterd (ppma01dal.us.ibm.com [127.0.0.1]) by ppma01dal.us.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 19LF2LZP000335; Thu, 21 Oct 2021 15:23:57 GMT Received: from b03cxnp07028.gho.boulder.ibm.com (b03cxnp07028.gho.boulder.ibm.com [9.17.130.15]) by ppma01dal.us.ibm.com with ESMTP id 3bqpcdr70e-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 21 Oct 2021 15:23:57 +0000 Received: from b03ledav005.gho.boulder.ibm.com (b03ledav005.gho.boulder.ibm.com [9.17.130.236]) by b03cxnp07028.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 19LFNt3j24052424 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 21 Oct 2021 15:23:55 GMT Received: from b03ledav005.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 488ADBE082; Thu, 21 Oct 2021 15:23:55 +0000 (GMT) Received: from b03ledav005.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id A61CDBE086; Thu, 21 Oct 2021 15:23:52 +0000 (GMT) Received: from cpe-172-100-181-211.stny.res.rr.com.com (unknown [9.160.98.118]) by b03ledav005.gho.boulder.ibm.com (Postfix) with ESMTP; Thu, 21 Oct 2021 15:23:52 +0000 (GMT) From: Tony Krowiak To: linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: jjherne@linux.ibm.com, freude@linux.ibm.com, borntraeger@de.ibm.com, cohuck@redhat.com, mjrosato@linux.ibm.com, pasic@linux.ibm.com, alex.williamson@redhat.com, kwankhede@nvidia.com, fiuczy@linux.ibm.com, Tony Krowiak Subject: [PATCH v17 03/15] s390/vfio-ap: move probe and remove callbacks to vfio_ap_ops.c Date: Thu, 21 Oct 2021 11:23:20 -0400 Message-Id: <20211021152332.70455-4-akrowiak@linux.ibm.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20211021152332.70455-1-akrowiak@linux.ibm.com> References: <20211021152332.70455-1-akrowiak@linux.ibm.com> MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: w3XTG6Ii5iQhkGUs3Qg28kOa1feZ08Nk X-Proofpoint-GUID: z994tMzCx4Lb2VyFw7KxdV46Q_c6wM-z X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.182.1,Aquarius:18.0.790,Hydra:6.0.425,FMLib:17.0.607.475 definitions=2021-10-21_04,2021-10-21_02,2020-04-07_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 lowpriorityscore=0 adultscore=0 clxscore=1015 suspectscore=0 malwarescore=0 spamscore=0 bulkscore=0 phishscore=0 impostorscore=0 priorityscore=1501 mlxscore=0 mlxlogscore=999 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2109230001 definitions=main-2110210079 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Let's move the probe and remove callbacks into the vfio_ap_ops.c file to keep all code related to managing queues in a single file. This way, all functions related to queue management can be removed from the vfio_ap_private.h header file defining the public interfaces for the vfio_ap device driver. Signed-off-by: Tony Krowiak Reviewed-by: Halil Pasic --- drivers/s390/crypto/vfio_ap_drv.c | 45 ++------------------------- drivers/s390/crypto/vfio_ap_ops.c | 31 ++++++++++++++++-- drivers/s390/crypto/vfio_ap_private.h | 5 +-- 3 files changed, 34 insertions(+), 47 deletions(-) diff --git a/drivers/s390/crypto/vfio_ap_drv.c b/drivers/s390/crypto/vfio_ap_drv.c index 03311a476366..5255e338591d 100644 --- a/drivers/s390/crypto/vfio_ap_drv.c +++ b/drivers/s390/crypto/vfio_ap_drv.c @@ -41,50 +41,9 @@ static struct ap_device_id ap_queue_ids[] = { MODULE_DEVICE_TABLE(vfio_ap, ap_queue_ids); -/** - * vfio_ap_queue_dev_probe: Allocate a vfio_ap_queue structure and associate it - * with the device as driver_data. - * - * @apdev: the AP device being probed - * - * Return: returns 0 if the probe succeeded; otherwise, returns -ENOMEM if - * storage could not be allocated for a vfio_ap_queue object. - */ -static int vfio_ap_queue_dev_probe(struct ap_device *apdev) -{ - struct vfio_ap_queue *q; - - q = kzalloc(sizeof(*q), GFP_KERNEL); - if (!q) - return -ENOMEM; - dev_set_drvdata(&apdev->device, q); - q->apqn = to_ap_queue(&apdev->device)->qid; - q->saved_isc = VFIO_AP_ISC_INVALID; - return 0; -} - -/** - * vfio_ap_queue_dev_remove: Free the associated vfio_ap_queue structure. - * - * @apdev: the AP device being removed - * - * Takes the matrix lock to avoid actions on this device while doing the remove. - */ -static void vfio_ap_queue_dev_remove(struct ap_device *apdev) -{ - struct vfio_ap_queue *q; - - mutex_lock(&matrix_dev->lock); - q = dev_get_drvdata(&apdev->device); - vfio_ap_mdev_reset_queue(q, 1); - dev_set_drvdata(&apdev->device, NULL); - kfree(q); - mutex_unlock(&matrix_dev->lock); -} - static struct ap_driver vfio_ap_drv = { - .probe = vfio_ap_queue_dev_probe, - .remove = vfio_ap_queue_dev_remove, + .probe = vfio_ap_mdev_probe_queue, + .remove = vfio_ap_mdev_remove_queue, .ids = ap_queue_ids, }; diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c index 7a9d1b04ecd4..cd2fe1586327 100644 --- a/drivers/s390/crypto/vfio_ap_ops.c +++ b/drivers/s390/crypto/vfio_ap_ops.c @@ -1191,8 +1191,7 @@ static struct vfio_ap_queue *vfio_ap_find_queue(int apqn) return q; } -int vfio_ap_mdev_reset_queue(struct vfio_ap_queue *q, - unsigned int retry) +static int vfio_ap_mdev_reset_queue(struct vfio_ap_queue *q, unsigned int retry) { struct ap_queue_status status; int ret; @@ -1410,3 +1409,31 @@ void vfio_ap_mdev_unregister(void) mdev_unregister_device(&matrix_dev->device); mdev_unregister_driver(&vfio_ap_matrix_driver); } + +int vfio_ap_mdev_probe_queue(struct ap_device *apdev) +{ + struct vfio_ap_queue *q; + + q = kzalloc(sizeof(*q), GFP_KERNEL); + if (!q) + return -ENOMEM; + mutex_lock(&matrix_dev->lock); + q->apqn = to_ap_queue(&apdev->device)->qid; + q->saved_isc = VFIO_AP_ISC_INVALID; + dev_set_drvdata(&apdev->device, q); + mutex_unlock(&matrix_dev->lock); + + return 0; +} + +void vfio_ap_mdev_remove_queue(struct ap_device *apdev) +{ + struct vfio_ap_queue *q; + + mutex_lock(&matrix_dev->lock); + q = dev_get_drvdata(&apdev->device); + vfio_ap_mdev_reset_queue(q, 1); + dev_set_drvdata(&apdev->device, NULL); + kfree(q); + mutex_unlock(&matrix_dev->lock); +} diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h index 907f41160de7..e3f0d42b094c 100644 --- a/drivers/s390/crypto/vfio_ap_private.h +++ b/drivers/s390/crypto/vfio_ap_private.h @@ -118,7 +118,8 @@ struct vfio_ap_queue { int vfio_ap_mdev_register(void); void vfio_ap_mdev_unregister(void); -int vfio_ap_mdev_reset_queue(struct vfio_ap_queue *q, - unsigned int retry); + +int vfio_ap_mdev_probe_queue(struct ap_device *queue); +void vfio_ap_mdev_remove_queue(struct ap_device *queue); #endif /* _VFIO_AP_PRIVATE_H_ */ From patchwork Thu Oct 21 15:23:21 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anthony Krowiak X-Patchwork-Id: 12575563 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DF09AC433FE for ; Thu, 21 Oct 2021 15:24:10 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id C87AB611C7 for ; Thu, 21 Oct 2021 15:24:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231739AbhJUP0Z (ORCPT ); Thu, 21 Oct 2021 11:26:25 -0400 Received: from mx0a-001b2d01.pphosted.com ([148.163.156.1]:29538 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231951AbhJUP0S (ORCPT ); Thu, 21 Oct 2021 11:26:18 -0400 Received: from pps.filterd (m0098396.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 19LEfcmn001118; Thu, 21 Oct 2021 11:24:01 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pp1; bh=guj7DQjYahMr7+pw95i+PDKe6q7WDeVkfGD87dfYR6E=; b=JsPJJ91Yny+fjutjQCu2xkzIbaaFjTuWbESbWScjcUsUzuYImmnvpt+rYqhMQ31o7kUD CDmwxrqGakn9kFjI9XvSG6mVs1iJtYGrkPdBtTDvFsDxIOb8MppokVzOpKm0PklLuk64 GsvzvzKwH6xnj1b6tJ0eS83z/8PySGovT7fQyfMnpQLaLNBOHhyXh7FWkH8qKq2QpJMV ZRnW5mU/2FvcQMaxNvz1IOib/UZkHzl7e+r+2AgWmT+273oUmUgUiNNN6AWmVy+/nRSS e7Hm/53R+DodSyUKgYdasiiI13e+JHpCTZZKtTsVjStrisq3ImT7afXqjAjAT1zinfyB Pw== Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com with ESMTP id 3bu7nedc97-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 21 Oct 2021 11:24:01 -0400 Received: from m0098396.ppops.net (m0098396.ppops.net [127.0.0.1]) by pps.reinject (8.16.0.43/8.16.0.43) with SMTP id 19LEBJxo015206; Thu, 21 Oct 2021 11:24:01 -0400 Received: from ppma01wdc.us.ibm.com (fd.55.37a9.ip4.static.sl-reverse.com [169.55.85.253]) by mx0a-001b2d01.pphosted.com with ESMTP id 3bu7nedc8r-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 21 Oct 2021 11:24:00 -0400 Received: from pps.filterd (ppma01wdc.us.ibm.com [127.0.0.1]) by ppma01wdc.us.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 19LF42n0009609; Thu, 21 Oct 2021 15:23:59 GMT Received: from b03cxnp08026.gho.boulder.ibm.com (b03cxnp08026.gho.boulder.ibm.com [9.17.130.18]) by ppma01wdc.us.ibm.com with ESMTP id 3bqpccfqmx-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 21 Oct 2021 15:23:59 +0000 Received: from b03ledav005.gho.boulder.ibm.com (b03ledav005.gho.boulder.ibm.com [9.17.130.236]) by b03cxnp08026.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 19LFNwxq36962768 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 21 Oct 2021 15:23:58 GMT Received: from b03ledav005.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id F38FFBE086; Thu, 21 Oct 2021 15:23:57 +0000 (GMT) Received: from b03ledav005.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 7C9D5BE065; Thu, 21 Oct 2021 15:23:55 +0000 (GMT) Received: from cpe-172-100-181-211.stny.res.rr.com.com (unknown [9.160.98.118]) by b03ledav005.gho.boulder.ibm.com (Postfix) with ESMTP; Thu, 21 Oct 2021 15:23:55 +0000 (GMT) From: Tony Krowiak To: linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: jjherne@linux.ibm.com, freude@linux.ibm.com, borntraeger@de.ibm.com, cohuck@redhat.com, mjrosato@linux.ibm.com, pasic@linux.ibm.com, alex.williamson@redhat.com, kwankhede@nvidia.com, fiuczy@linux.ibm.com, Tony Krowiak Subject: [PATCH v17 04/15] s390/vfio-ap: manage link between queue struct and matrix mdev Date: Thu, 21 Oct 2021 11:23:21 -0400 Message-Id: <20211021152332.70455-5-akrowiak@linux.ibm.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20211021152332.70455-1-akrowiak@linux.ibm.com> References: <20211021152332.70455-1-akrowiak@linux.ibm.com> MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: b3PtL-R6nRsb1hySEkEMxPTLUEbFxgVD X-Proofpoint-GUID: Ggep9fvyai4c-zwe7pWnANZ6cwoovQf7 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.182.1,Aquarius:18.0.790,Hydra:6.0.425,FMLib:17.0.607.475 definitions=2021-10-21_04,2021-10-21_02,2020-04-07_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 lowpriorityscore=0 suspectscore=0 bulkscore=0 spamscore=0 mlxlogscore=999 mlxscore=0 clxscore=1015 priorityscore=1501 impostorscore=0 malwarescore=0 phishscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2109230001 definitions=main-2110210079 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Let's create links between each queue device bound to the vfio_ap device driver and the matrix mdev to which the queue's APQN is assigned. The idea is to facilitate efficient retrieval of the objects representing the queue devices and matrix mdevs as well as to verify that a queue assigned to a matrix mdev is bound to the driver. The links will be created as follows: * When the queue device is probed, if its APQN is assigned to a matrix mdev, the structures representing the queue device and the matrix mdev will be linked. * When an adapter or domain is assigned to a matrix mdev, for each new APQN assigned that references a queue device bound to the vfio_ap device driver, the structures representing the queue device and the matrix mdev will be linked. The links will be removed as follows: * When the queue device is removed, if its APQN is assigned to a matrix mdev, the link from the structure representing the matrix mdev to the structure representing the queue will be removed. The link from the queue to the matrix mdev will be maintained because if the queue device is being removed due to a manual sysfs unbind, it may be needed after the queue is reset to clean up the IRQ resources allocated to enable AP interrupts for the KVM guest. Since the storage for the structure representing the queue device is ultimately freed by the remove callback, keeping the reference shouldn't be a problem. * When an adapter or domain is unassigned from a matrix mdev, for each APQN unassigned that references a queue device bound to the vfio_ap device driver, the structures representing the queue device and the matrix mdev will be unlinked. * When an mdev is removed, the link from any queues assigned to the mdev to the mdev will be removed. Signed-off-by: Tony Krowiak Reviewed-by: Halil Pasic --- drivers/s390/crypto/vfio_ap_ops.c | 192 +++++++++++++++++++++----- drivers/s390/crypto/vfio_ap_private.h | 14 ++ 2 files changed, 169 insertions(+), 37 deletions(-) diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c index cd2fe1586327..e2be29e9d310 100644 --- a/drivers/s390/crypto/vfio_ap_ops.c +++ b/drivers/s390/crypto/vfio_ap_ops.c @@ -29,32 +29,27 @@ static struct vfio_ap_queue *vfio_ap_find_queue(int apqn); static const struct vfio_device_ops vfio_ap_matrix_dev_ops; /** - * vfio_ap_get_queue - retrieve a queue with a specific APQN from a list - * @matrix_mdev: the associated mediated matrix - * @apqn: The queue APQN + * vfio_ap_mdev_get_queue - retrieve a queue with a specific APQN from a + * hash table of queues assigned to a matrix mdev + * @matrix_mdev: the matrix mdev + * @apqn: The APQN of a queue device * - * Retrieve a queue with a specific APQN from the list of the - * devices of the vfio_ap_drv. - * Verify that the APID and the APQI are set in the matrix. - * - * Return: the pointer to the associated vfio_ap_queue + * Return: the pointer to the vfio_ap_queue struct representing the queue or + * NULL if the queue is not assigned to @matrix_mdev */ -static struct vfio_ap_queue *vfio_ap_get_queue( +static struct vfio_ap_queue *vfio_ap_mdev_get_queue( struct ap_matrix_mdev *matrix_mdev, int apqn) { struct vfio_ap_queue *q; - if (!test_bit_inv(AP_QID_CARD(apqn), matrix_mdev->matrix.apm)) - return NULL; - if (!test_bit_inv(AP_QID_QUEUE(apqn), matrix_mdev->matrix.aqm)) - return NULL; - - q = vfio_ap_find_queue(apqn); - if (q) - q->matrix_mdev = matrix_mdev; + hash_for_each_possible(matrix_mdev->qtable.queues, q, mdev_qnode, + apqn) { + if (q && q->apqn == apqn) + return q; + } - return q; + return NULL; } /** @@ -172,7 +167,6 @@ static struct ap_queue_status vfio_ap_irq_disable(struct vfio_ap_queue *q) status.response_code); end_free: vfio_ap_free_aqic_resources(q); - q->matrix_mdev = NULL; return status; } @@ -292,7 +286,7 @@ static int handle_pqap(struct kvm_vcpu *vcpu) if (!matrix_mdev || !matrix_mdev->kvm) goto out_unlock; - q = vfio_ap_get_queue(matrix_mdev, apqn); + q = vfio_ap_mdev_get_queue(matrix_mdev, apqn); if (!q) goto out_unlock; @@ -338,6 +332,8 @@ static int vfio_ap_mdev_probe(struct mdev_device *mdev) matrix_mdev->mdev = mdev; vfio_ap_matrix_init(&matrix_dev->info, &matrix_mdev->matrix); + hash_init(matrix_mdev->qtable.queues); + mdev_set_drvdata(mdev, matrix_mdev); mutex_lock(&matrix_dev->lock); list_add(&matrix_mdev->node, &matrix_dev->mdev_list); mutex_unlock(&matrix_dev->lock); @@ -359,6 +355,55 @@ static int vfio_ap_mdev_probe(struct mdev_device *mdev) return ret; } +static void vfio_ap_mdev_link_queue(struct ap_matrix_mdev *matrix_mdev, + struct vfio_ap_queue *q) +{ + if (q) { + q->matrix_mdev = matrix_mdev; + hash_add(matrix_mdev->qtable.queues, &q->mdev_qnode, q->apqn); + } +} + +static void vfio_ap_mdev_link_apqn(struct ap_matrix_mdev *matrix_mdev, int apqn) +{ + struct vfio_ap_queue *q; + + q = vfio_ap_find_queue(apqn); + vfio_ap_mdev_link_queue(matrix_mdev, q); +} + +static void vfio_ap_unlink_queue_fr_mdev(struct vfio_ap_queue *q) +{ + hash_del(&q->mdev_qnode); +} + +static void vfio_ap_unlink_mdev_fr_queue(struct vfio_ap_queue *q) +{ + q->matrix_mdev = NULL; +} + +static void vfio_ap_mdev_unlink_queue(struct vfio_ap_queue *q) +{ + vfio_ap_unlink_queue_fr_mdev(q); + vfio_ap_unlink_mdev_fr_queue(q); +} + +static void vfio_ap_mdev_unlink_fr_queues(struct ap_matrix_mdev *matrix_mdev) +{ + struct vfio_ap_queue *q; + unsigned long apid, apqi; + + for_each_set_bit_inv(apid, matrix_mdev->matrix.apm, AP_DEVICES) { + for_each_set_bit_inv(apqi, matrix_mdev->matrix.aqm, + AP_DOMAINS) { + q = vfio_ap_mdev_get_queue(matrix_mdev, + AP_MKQID(apid, apqi)); + if (q) + q->matrix_mdev = NULL; + } + } +} + static void vfio_ap_mdev_remove(struct mdev_device *mdev) { struct ap_matrix_mdev *matrix_mdev = dev_get_drvdata(&mdev->dev); @@ -367,6 +412,7 @@ static void vfio_ap_mdev_remove(struct mdev_device *mdev) mutex_lock(&matrix_dev->lock); vfio_ap_mdev_reset_queues(matrix_mdev); + vfio_ap_mdev_unlink_fr_queues(matrix_mdev); list_del(&matrix_mdev->node); mutex_unlock(&matrix_dev->lock); vfio_uninit_group_dev(&matrix_mdev->vdev); @@ -575,6 +621,16 @@ static int vfio_ap_mdev_verify_no_sharing(struct ap_matrix_mdev *matrix_mdev) return 0; } +static void vfio_ap_mdev_link_adapter(struct ap_matrix_mdev *matrix_mdev, + unsigned long apid) +{ + unsigned long apqi; + + for_each_set_bit_inv(apqi, matrix_mdev->matrix.aqm, AP_DOMAINS) + vfio_ap_mdev_link_apqn(matrix_mdev, + AP_MKQID(apid, apqi)); +} + /** * assign_adapter_store - parses the APID from @buf and sets the * corresponding bit in the mediated matrix device's APM @@ -645,6 +701,7 @@ static ssize_t assign_adapter_store(struct device *dev, if (ret) goto share_err; + vfio_ap_mdev_link_adapter(matrix_mdev, apid); ret = count; goto done; @@ -657,6 +714,20 @@ static ssize_t assign_adapter_store(struct device *dev, } static DEVICE_ATTR_WO(assign_adapter); +static void vfio_ap_mdev_unlink_adapter(struct ap_matrix_mdev *matrix_mdev, + unsigned long apid) +{ + unsigned long apqi; + struct vfio_ap_queue *q; + + for_each_set_bit_inv(apqi, matrix_mdev->matrix.aqm, AP_DOMAINS) { + q = vfio_ap_mdev_get_queue(matrix_mdev, AP_MKQID(apid, apqi)); + + if (q) + vfio_ap_mdev_unlink_queue(q); + } +} + /** * unassign_adapter_store - parses the APID from @buf and clears the * corresponding bit in the mediated matrix device's APM @@ -698,6 +769,7 @@ static ssize_t unassign_adapter_store(struct device *dev, } clear_bit_inv((unsigned long)apid, matrix_mdev->matrix.apm); + vfio_ap_mdev_unlink_adapter(matrix_mdev, apid); ret = count; done: mutex_unlock(&matrix_dev->lock); @@ -725,6 +797,16 @@ vfio_ap_mdev_verify_queues_reserved_for_apqi(struct ap_matrix_mdev *matrix_mdev, return 0; } +static void vfio_ap_mdev_link_domain(struct ap_matrix_mdev *matrix_mdev, + unsigned long apqi) +{ + unsigned long apid; + + for_each_set_bit_inv(apid, matrix_mdev->matrix.apm, AP_DEVICES) + vfio_ap_mdev_link_apqn(matrix_mdev, + AP_MKQID(apid, apqi)); +} + /** * assign_domain_store - parses the APQI from @buf and sets the * corresponding bit in the mediated matrix device's AQM @@ -790,6 +872,7 @@ static ssize_t assign_domain_store(struct device *dev, if (ret) goto share_err; + vfio_ap_mdev_link_domain(matrix_mdev, apqi); ret = count; goto done; @@ -802,6 +885,19 @@ static ssize_t assign_domain_store(struct device *dev, } static DEVICE_ATTR_WO(assign_domain); +static void vfio_ap_mdev_unlink_domain(struct ap_matrix_mdev *matrix_mdev, + unsigned long apqi) +{ + unsigned long apid; + struct vfio_ap_queue *q; + + for_each_set_bit_inv(apid, matrix_mdev->matrix.apm, AP_DEVICES) { + q = vfio_ap_mdev_get_queue(matrix_mdev, AP_MKQID(apid, apqi)); + + if (q) + vfio_ap_mdev_unlink_queue(q); + } +} /** * unassign_domain_store - parses the APQI from @buf and clears the @@ -844,6 +940,7 @@ static ssize_t unassign_domain_store(struct device *dev, } clear_bit_inv((unsigned long)apqi, matrix_mdev->matrix.aqm); + vfio_ap_mdev_unlink_domain(matrix_mdev, apqi); ret = count; done: @@ -1243,25 +1340,18 @@ static int vfio_ap_mdev_reset_queue(struct vfio_ap_queue *q, unsigned int retry) static int vfio_ap_mdev_reset_queues(struct ap_matrix_mdev *matrix_mdev) { - int ret; - int rc = 0; - unsigned long apid, apqi; + int ret, bkt, rc = 0; struct vfio_ap_queue *q; - for_each_set_bit_inv(apid, matrix_mdev->matrix.apm, - matrix_mdev->matrix.apm_max + 1) { - for_each_set_bit_inv(apqi, matrix_mdev->matrix.aqm, - matrix_mdev->matrix.aqm_max + 1) { - q = vfio_ap_find_queue(AP_MKQID(apid, apqi)); - ret = vfio_ap_mdev_reset_queue(q, 1); - /* - * Regardless whether a queue turns out to be busy, or - * is not operational, we need to continue resetting - * the remaining queues. - */ - if (ret) - rc = ret; - } + hash_for_each(matrix_mdev->qtable.queues, bkt, q, mdev_qnode) { + ret = vfio_ap_mdev_reset_queue(q, 1); + /* + * Regardless whether a queue turns out to be busy, or + * is not operational, we need to continue resetting + * the remaining queues. + */ + if (ret) + rc = ret; } return rc; @@ -1410,6 +1500,28 @@ void vfio_ap_mdev_unregister(void) mdev_unregister_driver(&vfio_ap_matrix_driver); } +/* + * vfio_ap_queue_link_mdev + * + * @q: The queue to link with the matrix mdev. + * + * Links @q with the matrix mdev to which the queue's APQN is assigned. + */ +static void vfio_ap_queue_link_mdev(struct vfio_ap_queue *q) +{ + unsigned long apid = AP_QID_CARD(q->apqn); + unsigned long apqi = AP_QID_QUEUE(q->apqn); + struct ap_matrix_mdev *matrix_mdev; + + list_for_each_entry(matrix_mdev, &matrix_dev->mdev_list, node) { + if (test_bit_inv(apid, matrix_mdev->matrix.apm) && + test_bit_inv(apqi, matrix_mdev->matrix.aqm)) { + vfio_ap_mdev_link_queue(matrix_mdev, q); + break; + } + } +} + int vfio_ap_mdev_probe_queue(struct ap_device *apdev) { struct vfio_ap_queue *q; @@ -1417,9 +1529,11 @@ int vfio_ap_mdev_probe_queue(struct ap_device *apdev) q = kzalloc(sizeof(*q), GFP_KERNEL); if (!q) return -ENOMEM; + mutex_lock(&matrix_dev->lock); q->apqn = to_ap_queue(&apdev->device)->qid; q->saved_isc = VFIO_AP_ISC_INVALID; + vfio_ap_queue_link_mdev(q); dev_set_drvdata(&apdev->device, q); mutex_unlock(&matrix_dev->lock); @@ -1432,6 +1546,10 @@ void vfio_ap_mdev_remove_queue(struct ap_device *apdev) mutex_lock(&matrix_dev->lock); q = dev_get_drvdata(&apdev->device); + + if (q->matrix_mdev) + vfio_ap_unlink_queue_fr_mdev(q); + vfio_ap_mdev_reset_queue(q, 1); dev_set_drvdata(&apdev->device, NULL); kfree(q); diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h index e3f0d42b094c..c1f57f89973e 100644 --- a/drivers/s390/crypto/vfio_ap_private.h +++ b/drivers/s390/crypto/vfio_ap_private.h @@ -19,6 +19,7 @@ #include #include #include +#include #include "ap_bus.h" @@ -74,6 +75,15 @@ struct ap_matrix { DECLARE_BITMAP(adm, 256); }; +/** + * struct ap_queue_table - a table of queue objects. + * + * @queues: a hashtable of queues (struct vfio_ap_queue). + */ +struct ap_queue_table { + DECLARE_HASHTABLE(queues, 8); +}; + /** * struct ap_matrix_mdev - Contains the data associated with a matrix mediated * device. @@ -89,6 +99,7 @@ struct ap_matrix { * @pqap_hook: the function pointer to the interception handler for the * PQAP(AQIC) instruction. * @mdev: the mediated device + * @qtable: table of queues (struct vfio_ap_queue) assigned to the mdev */ struct ap_matrix_mdev { struct vfio_device vdev; @@ -98,6 +109,7 @@ struct ap_matrix_mdev { struct notifier_block iommu_notifier; struct kvm *kvm; struct mdev_device *mdev; + struct ap_queue_table qtable; }; /** @@ -107,6 +119,7 @@ struct ap_matrix_mdev { * @saved_pfn: the guest PFN pinned for the guest * @apqn: the APQN of the AP queue device * @saved_isc: the guest ISC registered with the GIB interface + * @mdev_qnode: allows the vfio_ap_queue struct to be added to a hashtable */ struct vfio_ap_queue { struct ap_matrix_mdev *matrix_mdev; @@ -114,6 +127,7 @@ struct vfio_ap_queue { int apqn; #define VFIO_AP_ISC_INVALID 0xff unsigned char saved_isc; + struct hlist_node mdev_qnode; }; int vfio_ap_mdev_register(void); From patchwork Thu Oct 21 15:23:22 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anthony Krowiak X-Patchwork-Id: 12575567 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 54CD2C4332F for ; Thu, 21 Oct 2021 15:24:13 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 3BB2460C49 for ; Thu, 21 Oct 2021 15:24:13 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232048AbhJUP02 (ORCPT ); Thu, 21 Oct 2021 11:26:28 -0400 Received: from mx0b-001b2d01.pphosted.com ([148.163.158.5]:46118 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S231937AbhJUP0V (ORCPT ); Thu, 21 Oct 2021 11:26:21 -0400 Received: from pps.filterd (m0098416.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 19LEDou0023002; Thu, 21 Oct 2021 11:24:03 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pp1; bh=lbRiA6vc7uHKGYpWya4nBy1XDXU8qo6ZxSO1b9DOtXM=; b=XPx9yP7Y7VE0dqzp41FaqU/VZTHMY/GHCcsqoZUWAXunpmWtxPwZ0MB6BMVISRZeaggd U8cOXUmQ3cD7QWuSkt6WYHtWnYfNTDbPbrJeGrh2EQ8lkNFWzo6WcR4dWGk14wKyBdYY gI0fwZsIfYSAc4BnE4ljeyZaFL3gDGEZsaWIcmnPSLSo8Wqf4c64UPWPJBeoHwDRHWr3 2nYIzX8FWa+mvk3Y+lDRHK/hlAKzIjtpSoUpRslgLc6y8UB8VSRrWFxjSjjS+R59Q5dX lrBn7NcsRr7bdGyaGVd4EatbxFuh6mwcOexDdDKnTGBFSP2DHVSFI+tsrXkwyptz4RD3 2Q== Received: from pps.reinject (localhost [127.0.0.1]) by mx0b-001b2d01.pphosted.com with ESMTP id 3bu8yhb0ru-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 21 Oct 2021 11:24:03 -0400 Received: from m0098416.ppops.net (m0098416.ppops.net [127.0.0.1]) by pps.reinject (8.16.0.43/8.16.0.43) with SMTP id 19LFIb9a001486; Thu, 21 Oct 2021 11:24:02 -0400 Received: from ppma02dal.us.ibm.com (a.bd.3ea9.ip4.static.sl-reverse.com [169.62.189.10]) by mx0b-001b2d01.pphosted.com with ESMTP id 3bu8yhb0rc-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 21 Oct 2021 11:24:02 -0400 Received: from pps.filterd (ppma02dal.us.ibm.com [127.0.0.1]) by ppma02dal.us.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 19LF3ifJ004818; Thu, 21 Oct 2021 15:24:01 GMT Received: from b03cxnp07027.gho.boulder.ibm.com (b03cxnp07027.gho.boulder.ibm.com [9.17.130.14]) by ppma02dal.us.ibm.com with ESMTP id 3bqpcd8hs7-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 21 Oct 2021 15:24:01 +0000 Received: from b03ledav005.gho.boulder.ibm.com (b03ledav005.gho.boulder.ibm.com [9.17.130.236]) by b03cxnp07027.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 19LFNxa338338950 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 21 Oct 2021 15:24:00 GMT Received: from b03ledav005.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id D1BAEBE061; Thu, 21 Oct 2021 15:23:59 +0000 (GMT) Received: from b03ledav005.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 32E2FBE07A; Thu, 21 Oct 2021 15:23:58 +0000 (GMT) Received: from cpe-172-100-181-211.stny.res.rr.com.com (unknown [9.160.98.118]) by b03ledav005.gho.boulder.ibm.com (Postfix) with ESMTP; Thu, 21 Oct 2021 15:23:58 +0000 (GMT) From: Tony Krowiak To: linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: jjherne@linux.ibm.com, freude@linux.ibm.com, borntraeger@de.ibm.com, cohuck@redhat.com, mjrosato@linux.ibm.com, pasic@linux.ibm.com, alex.williamson@redhat.com, kwankhede@nvidia.com, fiuczy@linux.ibm.com, Tony Krowiak Subject: [PATCH v17 05/15] s390/vfio-ap: introduce shadow APCB Date: Thu, 21 Oct 2021 11:23:22 -0400 Message-Id: <20211021152332.70455-6-akrowiak@linux.ibm.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20211021152332.70455-1-akrowiak@linux.ibm.com> References: <20211021152332.70455-1-akrowiak@linux.ibm.com> MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-GUID: zW7BF0VNjPDJ1vM_I40qy0mne0XO9kpe X-Proofpoint-ORIG-GUID: l85YTJyh-80jGSHM40NyCUSgQggBjdcr X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.182.1,Aquarius:18.0.790,Hydra:6.0.425,FMLib:17.0.607.475 definitions=2021-10-21_04,2021-10-21_02,2020-04-07_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 clxscore=1015 priorityscore=1501 malwarescore=0 adultscore=0 suspectscore=0 lowpriorityscore=0 bulkscore=0 mlxlogscore=999 phishscore=0 impostorscore=0 mlxscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2109230001 definitions=main-2110210079 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org The APCB is a field within the CRYCB that provides the AP configuration to a KVM guest. Let's introduce a shadow copy of the KVM guest's APCB and maintain it for the lifespan of the guest. Signed-off-by: Tony Krowiak Reviewed-by: Halil Pasic --- drivers/s390/crypto/vfio_ap_ops.c | 10 ++++++---- drivers/s390/crypto/vfio_ap_private.h | 2 ++ 2 files changed, 8 insertions(+), 4 deletions(-) diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c index e2be29e9d310..4305177029bf 100644 --- a/drivers/s390/crypto/vfio_ap_ops.c +++ b/drivers/s390/crypto/vfio_ap_ops.c @@ -332,6 +332,7 @@ static int vfio_ap_mdev_probe(struct mdev_device *mdev) matrix_mdev->mdev = mdev; vfio_ap_matrix_init(&matrix_dev->info, &matrix_mdev->matrix); + vfio_ap_matrix_init(&matrix_dev->info, &matrix_mdev->shadow_apcb); hash_init(matrix_mdev->qtable.queues); mdev_set_drvdata(mdev, matrix_mdev); mutex_lock(&matrix_dev->lock); @@ -1178,10 +1179,11 @@ static int vfio_ap_mdev_set_kvm(struct ap_matrix_mdev *matrix_mdev, kvm_get_kvm(kvm); matrix_mdev->kvm = kvm; kvm->arch.crypto.data = matrix_mdev; - kvm_arch_crypto_set_masks(kvm, - matrix_mdev->matrix.apm, - matrix_mdev->matrix.aqm, - matrix_mdev->matrix.adm); + memcpy(&matrix_mdev->shadow_apcb, &matrix_mdev->matrix, + sizeof(struct ap_matrix)); + kvm_arch_crypto_set_masks(kvm, matrix_mdev->shadow_apcb.apm, + matrix_mdev->shadow_apcb.aqm, + matrix_mdev->shadow_apcb.adm); mutex_unlock(&kvm->lock); mutex_unlock(&matrix_dev->lock); diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h index c1f57f89973e..6dc0ebbf7a06 100644 --- a/drivers/s390/crypto/vfio_ap_private.h +++ b/drivers/s390/crypto/vfio_ap_private.h @@ -91,6 +91,7 @@ struct ap_queue_table { * @node: allows the ap_matrix_mdev struct to be added to a list * @matrix: the adapters, usage domains and control domains assigned to the * mediated matrix device. + * @shadow_apcb: the shadow copy of the APCB field of the KVM guest's CRYCB * @group_notifier: notifier block used for specifying callback function for * handling the VFIO_GROUP_NOTIFY_SET_KVM event * @iommu_notifier: notifier block used for specifying callback function for @@ -105,6 +106,7 @@ struct ap_matrix_mdev { struct vfio_device vdev; struct list_head node; struct ap_matrix matrix; + struct ap_matrix shadow_apcb; struct notifier_block group_notifier; struct notifier_block iommu_notifier; struct kvm *kvm; From patchwork Thu Oct 21 15:23:23 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anthony Krowiak X-Patchwork-Id: 12575565 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A8B3CC433EF for ; Thu, 21 Oct 2021 15:24:12 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 977E3611F2 for ; Thu, 21 Oct 2021 15:24:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231624AbhJUP01 (ORCPT ); Thu, 21 Oct 2021 11:26:27 -0400 Received: from mx0b-001b2d01.pphosted.com ([148.163.158.5]:43986 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S231984AbhJUP0X (ORCPT ); Thu, 21 Oct 2021 11:26:23 -0400 Received: from pps.filterd (m0098420.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 19LEBMEu003778; Thu, 21 Oct 2021 11:24:05 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pp1; bh=h78lrCCArAduV+fJtApwLdWepDVufhFooh00AmPjLc8=; b=FwlHsv4wUNYcpQ+oPz8BiM1vlRSOgLjouf4LEkPih91Q1myP/T5RPUF/BgieqfmUhvUu FpoYOtwPxe3RdcDD3ckeu17bhXS1cK42ppJcJkRXVdAG64xMfY5eT36S8OI6RT1l2m4+ b7lovAgHNRsFSBArJY6Di4eE+DVtpG1hGxhs0p2D9zFdlXAVdKTOtWGIMkecn1zhm95z EyvB3j5Av3/JqiM5hnOpmW94rZ5D34ENWywEUOHIPwa8j0Rw3RwS6x58ROn3ioVHqWWj bVW614ArgDzUxNGVeKdV95Bpfx2Go4h6VG+r4Y6oDEszHKXC0NGWJ1VPJZJ4j+jHTR7M sw== Received: from pps.reinject (localhost [127.0.0.1]) by mx0b-001b2d01.pphosted.com with ESMTP id 3bu8kkkseh-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 21 Oct 2021 11:24:05 -0400 Received: from m0098420.ppops.net (m0098420.ppops.net [127.0.0.1]) by pps.reinject (8.16.0.43/8.16.0.43) with SMTP id 19LEZKD8005174; Thu, 21 Oct 2021 11:24:04 -0400 Received: from ppma02wdc.us.ibm.com (aa.5b.37a9.ip4.static.sl-reverse.com [169.55.91.170]) by mx0b-001b2d01.pphosted.com with ESMTP id 3bu8kkkse3-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 21 Oct 2021 11:24:04 -0400 Received: from pps.filterd (ppma02wdc.us.ibm.com [127.0.0.1]) by ppma02wdc.us.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 19LF427G018888; Thu, 21 Oct 2021 15:24:03 GMT Received: from b03cxnp07028.gho.boulder.ibm.com (b03cxnp07028.gho.boulder.ibm.com [9.17.130.15]) by ppma02wdc.us.ibm.com with ESMTP id 3bqpccfsg3-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 21 Oct 2021 15:24:03 +0000 Received: from b03ledav005.gho.boulder.ibm.com (b03ledav005.gho.boulder.ibm.com [9.17.130.236]) by b03cxnp07028.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 19LFO2TS32899810 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 21 Oct 2021 15:24:02 GMT Received: from b03ledav005.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 50C5CBE05D; Thu, 21 Oct 2021 15:24:02 +0000 (GMT) Received: from b03ledav005.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 16CEBBE061; Thu, 21 Oct 2021 15:24:00 +0000 (GMT) Received: from cpe-172-100-181-211.stny.res.rr.com.com (unknown [9.160.98.118]) by b03ledav005.gho.boulder.ibm.com (Postfix) with ESMTP; Thu, 21 Oct 2021 15:23:59 +0000 (GMT) From: Tony Krowiak To: linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: jjherne@linux.ibm.com, freude@linux.ibm.com, borntraeger@de.ibm.com, cohuck@redhat.com, mjrosato@linux.ibm.com, pasic@linux.ibm.com, alex.williamson@redhat.com, kwankhede@nvidia.com, fiuczy@linux.ibm.com, Tony Krowiak Subject: [PATCH v17 06/15] s390/vfio-ap: refresh guest's APCB by filtering APQNs assigned to mdev Date: Thu, 21 Oct 2021 11:23:23 -0400 Message-Id: <20211021152332.70455-7-akrowiak@linux.ibm.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20211021152332.70455-1-akrowiak@linux.ibm.com> References: <20211021152332.70455-1-akrowiak@linux.ibm.com> MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: c-uzxaaaWEvVQox1FYkuY40dNMRKsmsd X-Proofpoint-GUID: qNHpspt_Y6wgzdzeo2cdYjAjv8pQC3uV X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.182.1,Aquarius:18.0.790,Hydra:6.0.425,FMLib:17.0.607.475 definitions=2021-10-21_04,2021-10-21_02,2020-04-07_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 priorityscore=1501 bulkscore=0 malwarescore=0 spamscore=0 lowpriorityscore=0 impostorscore=0 clxscore=1015 mlxlogscore=999 suspectscore=0 mlxscore=0 phishscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2109230001 definitions=main-2110210079 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Refresh the guest's APCB by filtering the APQNs assigned to the matrix mdev that do not reference an AP queue device bound to the vfio_ap device driver. The mdev's APQNs will be filtered according to the following rules: * The APID of each adapter and the APQI of each domain that is not in the host's AP configuration is filtered out. * The APID of each adapter comprising an APQN that does not reference a queue device bound to the vfio_ap device driver is filtered. The APQNs are derived from the Cartesian product of the APID of each adapter and APQI of each domain assigned to the mdev. The control domains that are not assigned to the host's AP configuration will also be filtered before assigning them to the guest's APCB. Signed-off-by: Tony Krowiak --- drivers/s390/crypto/vfio_ap_ops.c | 66 ++++++++++++++++++++++++++++++- 1 file changed, 64 insertions(+), 2 deletions(-) diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c index 4305177029bf..46c179363aca 100644 --- a/drivers/s390/crypto/vfio_ap_ops.c +++ b/drivers/s390/crypto/vfio_ap_ops.c @@ -314,6 +314,62 @@ static void vfio_ap_matrix_init(struct ap_config_info *info, matrix->adm_max = info->apxa ? info->Nd : 15; } +static void vfio_ap_mdev_filter_cdoms(struct ap_matrix_mdev *matrix_mdev) +{ + bitmap_and(matrix_mdev->shadow_apcb.adm, matrix_mdev->matrix.adm, + (unsigned long *)matrix_dev->info.adm, AP_DOMAINS); +} + +/* + * vfio_ap_mdev_filter_matrix - copy the mdev's AP configuration to the KVM + * guest's APCB then filter the APIDs that do not + * comprise at least one APQN that references a + * queue device bound to the vfio_ap device driver. + * + * @matrix_mdev: the mdev whose AP configuration is to be filtered. + */ +static void vfio_ap_mdev_filter_matrix(struct ap_matrix_mdev *matrix_mdev) +{ + int ret; + unsigned long apid, apqi, apqn; + + ret = ap_qci(&matrix_dev->info); + if (ret) + return; + + vfio_ap_matrix_init(&matrix_dev->info, &matrix_mdev->shadow_apcb); + + /* + * Copy the adapters, domains and control domains to the shadow_apcb + * from the matrix mdev, but only those that are assigned to the host's + * AP configuration. + */ + bitmap_and(matrix_mdev->shadow_apcb.apm, matrix_mdev->matrix.apm, + (unsigned long *)matrix_dev->info.apm, AP_DEVICES); + bitmap_and(matrix_mdev->shadow_apcb.aqm, matrix_mdev->matrix.aqm, + (unsigned long *)matrix_dev->info.aqm, AP_DOMAINS); + + for_each_set_bit_inv(apid, matrix_mdev->shadow_apcb.apm, AP_DEVICES) { + for_each_set_bit_inv(apqi, matrix_mdev->shadow_apcb.aqm, + AP_DOMAINS) { + /* + * If the APQN is not bound to the vfio_ap device + * driver, then we can't assign it to the guest's + * AP configuration. The AP architecture won't + * allow filtering of a single APQN, so if we're + * filtering APIDs, then filter the APID; otherwise, + * filter the APQI. + */ + apqn = AP_MKQID(apid, apqi); + if (!vfio_ap_mdev_get_queue(matrix_mdev, apqn)) { + clear_bit_inv(apid, + matrix_mdev->shadow_apcb.apm); + break; + } + } + } +} + static int vfio_ap_mdev_probe(struct mdev_device *mdev) { struct ap_matrix_mdev *matrix_mdev; @@ -703,6 +759,7 @@ static ssize_t assign_adapter_store(struct device *dev, goto share_err; vfio_ap_mdev_link_adapter(matrix_mdev, apid); + vfio_ap_mdev_filter_matrix(matrix_mdev); ret = count; goto done; @@ -771,6 +828,7 @@ static ssize_t unassign_adapter_store(struct device *dev, clear_bit_inv((unsigned long)apid, matrix_mdev->matrix.apm); vfio_ap_mdev_unlink_adapter(matrix_mdev, apid); + vfio_ap_mdev_filter_matrix(matrix_mdev); ret = count; done: mutex_unlock(&matrix_dev->lock); @@ -874,6 +932,7 @@ static ssize_t assign_domain_store(struct device *dev, goto share_err; vfio_ap_mdev_link_domain(matrix_mdev, apqi); + vfio_ap_mdev_filter_matrix(matrix_mdev); ret = count; goto done; @@ -942,6 +1001,7 @@ static ssize_t unassign_domain_store(struct device *dev, clear_bit_inv((unsigned long)apqi, matrix_mdev->matrix.aqm); vfio_ap_mdev_unlink_domain(matrix_mdev, apqi); + vfio_ap_mdev_filter_matrix(matrix_mdev); ret = count; done: @@ -995,6 +1055,7 @@ static ssize_t assign_control_domain_store(struct device *dev, * number of control domains that can be assigned. */ set_bit_inv(id, matrix_mdev->matrix.adm); + vfio_ap_mdev_filter_cdoms(matrix_mdev); ret = count; done: mutex_unlock(&matrix_dev->lock); @@ -1042,6 +1103,7 @@ static ssize_t unassign_control_domain_store(struct device *dev, } clear_bit_inv(domid, matrix_mdev->matrix.adm); + clear_bit_inv(domid, matrix_mdev->shadow_apcb.adm); ret = count; done: mutex_unlock(&matrix_dev->lock); @@ -1179,8 +1241,6 @@ static int vfio_ap_mdev_set_kvm(struct ap_matrix_mdev *matrix_mdev, kvm_get_kvm(kvm); matrix_mdev->kvm = kvm; kvm->arch.crypto.data = matrix_mdev; - memcpy(&matrix_mdev->shadow_apcb, &matrix_mdev->matrix, - sizeof(struct ap_matrix)); kvm_arch_crypto_set_masks(kvm, matrix_mdev->shadow_apcb.apm, matrix_mdev->shadow_apcb.aqm, matrix_mdev->shadow_apcb.adm); @@ -1536,6 +1596,8 @@ int vfio_ap_mdev_probe_queue(struct ap_device *apdev) q->apqn = to_ap_queue(&apdev->device)->qid; q->saved_isc = VFIO_AP_ISC_INVALID; vfio_ap_queue_link_mdev(q); + if (q->matrix_mdev) + vfio_ap_mdev_filter_matrix(q->matrix_mdev); dev_set_drvdata(&apdev->device, q); mutex_unlock(&matrix_dev->lock); From patchwork Thu Oct 21 15:23:24 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anthony Krowiak X-Patchwork-Id: 12575571 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 008EBC433EF for ; Thu, 21 Oct 2021 15:24:20 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E10C661183 for ; Thu, 21 Oct 2021 15:24:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232113AbhJUP0e (ORCPT ); Thu, 21 Oct 2021 11:26:34 -0400 Received: from mx0b-001b2d01.pphosted.com ([148.163.158.5]:5282 "EHLO mx0b-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231992AbhJUP01 (ORCPT ); Thu, 21 Oct 2021 11:26:27 -0400 Received: from pps.filterd (m0127361.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 19LEI2Jp012432; Thu, 21 Oct 2021 11:24:08 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pp1; bh=D7hm2zAH0OQlC3ZqjKAngVZg5HevCVBT6DR2iUOtL8s=; b=REW/DkUO5eMhPkqEK06BMrR5QUjTMxvIJOS/nULEezXAVDOLdIgW34sjhxpdYnKSQfrc eLBel0QgZMXzfMXsf40Mkpv63zGEFdKfD+6fSqi847xr3+aQUXKRXsor7UzIPAeNc9wS LiCJ7a4ailBySLmeRS2qT+xmVy0goKe7zMd+1+pCDpkLjaAN40YQFt0pOI2k+wLm+Cha 5OgQPPC7Y/TJi35+CXQsgfU8wLlt1+sF3qsz2C0Cmcf2APoopgu0jeLShS4T2DN4glV/ ySqDUfF6DCzPAjXON9mFBAY1rhLe1y5UV4lucUho3SJ2XJlMa3Cpnxc3a4vw+lJ/SlU1 Lg== Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com with ESMTP id 3btxuu05j1-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 21 Oct 2021 11:24:07 -0400 Received: from m0127361.ppops.net (m0127361.ppops.net [127.0.0.1]) by pps.reinject (8.16.0.43/8.16.0.43) with SMTP id 19LEnQqh004170; Thu, 21 Oct 2021 11:24:07 -0400 Received: from ppma01wdc.us.ibm.com (fd.55.37a9.ip4.static.sl-reverse.com [169.55.85.253]) by mx0a-001b2d01.pphosted.com with ESMTP id 3btxuu05hb-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 21 Oct 2021 11:24:07 -0400 Received: from pps.filterd (ppma01wdc.us.ibm.com [127.0.0.1]) by ppma01wdc.us.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 19LF42Ip009635; Thu, 21 Oct 2021 15:24:06 GMT Received: from b03cxnp08025.gho.boulder.ibm.com (b03cxnp08025.gho.boulder.ibm.com [9.17.130.17]) by ppma01wdc.us.ibm.com with ESMTP id 3bqpccfqq2-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 21 Oct 2021 15:24:06 +0000 Received: from b03ledav005.gho.boulder.ibm.com (b03ledav005.gho.boulder.ibm.com [9.17.130.236]) by b03cxnp08025.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 19LFO4rB44892588 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 21 Oct 2021 15:24:04 GMT Received: from b03ledav005.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 98D9EBE065; Thu, 21 Oct 2021 15:24:04 +0000 (GMT) Received: from b03ledav005.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 7A904BE051; Thu, 21 Oct 2021 15:24:02 +0000 (GMT) Received: from cpe-172-100-181-211.stny.res.rr.com.com (unknown [9.160.98.118]) by b03ledav005.gho.boulder.ibm.com (Postfix) with ESMTP; Thu, 21 Oct 2021 15:24:02 +0000 (GMT) From: Tony Krowiak To: linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: jjherne@linux.ibm.com, freude@linux.ibm.com, borntraeger@de.ibm.com, cohuck@redhat.com, mjrosato@linux.ibm.com, pasic@linux.ibm.com, alex.williamson@redhat.com, kwankhede@nvidia.com, fiuczy@linux.ibm.com, Tony Krowiak Subject: [PATCH v17 07/15] s390/vfio-ap: allow assignment of unavailable AP queues to mdev device Date: Thu, 21 Oct 2021 11:23:24 -0400 Message-Id: <20211021152332.70455-8-akrowiak@linux.ibm.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20211021152332.70455-1-akrowiak@linux.ibm.com> References: <20211021152332.70455-1-akrowiak@linux.ibm.com> MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: I99A1cc_c8lPwoFyzKmeGy3ORMgHC9D7 X-Proofpoint-GUID: -pYj0KzExDa9yYJgf9ulseWAg7WW8hW2 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.182.1,Aquarius:18.0.790,Hydra:6.0.425,FMLib:17.0.607.475 definitions=2021-10-21_04,2021-10-21_02,2020-04-07_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 phishscore=0 adultscore=0 suspectscore=0 lowpriorityscore=0 priorityscore=1501 mlxscore=0 malwarescore=0 impostorscore=0 mlxlogscore=999 clxscore=1015 spamscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2109230001 definitions=main-2110210079 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org The current implementation does not allow assignment of an AP adapter or domain to an mdev device if each APQN resulting from the assignment does not reference an AP queue device that is bound to the vfio_ap device driver. This patch allows assignment of AP resources to the matrix mdev as long as the APQNs resulting from the assignment: 1. Are not reserved by the AP BUS for use by the zcrypt device drivers. 2. Are not assigned to another matrix mdev. The rationale behind this is that the AP architecture does not preclude assignment of APQNs to an AP configuration profile that are not available to the system. Signed-off-by: Tony Krowiak Reviewed-by: Halil Pasic --- drivers/s390/crypto/vfio_ap_ops.c | 224 +++++++----------------------- 1 file changed, 53 insertions(+), 171 deletions(-) diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c index 46c179363aca..6b40db6dab3c 100644 --- a/drivers/s390/crypto/vfio_ap_ops.c +++ b/drivers/s390/crypto/vfio_ap_ops.c @@ -520,141 +520,48 @@ static struct attribute_group *vfio_ap_mdev_type_groups[] = { NULL, }; -struct vfio_ap_queue_reserved { - unsigned long *apid; - unsigned long *apqi; - bool reserved; -}; +#define MDEV_SHARING_ERR "Userspace may not re-assign queue %02lx.%04lx " \ + "already assigned to %s" -/** - * vfio_ap_has_queue - determines if the AP queue containing the target in @data - * - * @dev: an AP queue device - * @data: a struct vfio_ap_queue_reserved reference - * - * Flags whether the AP queue device (@dev) has a queue ID containing the APQN, - * apid or apqi specified in @data: - * - * - If @data contains both an apid and apqi value, then @data will be flagged - * as reserved if the APID and APQI fields for the AP queue device matches - * - * - If @data contains only an apid value, @data will be flagged as - * reserved if the APID field in the AP queue device matches - * - * - If @data contains only an apqi value, @data will be flagged as - * reserved if the APQI field in the AP queue device matches - * - * Return: 0 to indicate the input to function succeeded. Returns -EINVAL if - * @data does not contain either an apid or apqi. - */ -static int vfio_ap_has_queue(struct device *dev, void *data) +static void vfio_ap_mdev_log_sharing_err(struct ap_matrix_mdev *matrix_mdev, + unsigned long *apm, + unsigned long *aqm) { - struct vfio_ap_queue_reserved *qres = data; - struct ap_queue *ap_queue = to_ap_queue(dev); - ap_qid_t qid; - unsigned long id; - - if (qres->apid && qres->apqi) { - qid = AP_MKQID(*qres->apid, *qres->apqi); - if (qid == ap_queue->qid) - qres->reserved = true; - } else if (qres->apid && !qres->apqi) { - id = AP_QID_CARD(ap_queue->qid); - if (id == *qres->apid) - qres->reserved = true; - } else if (!qres->apid && qres->apqi) { - id = AP_QID_QUEUE(ap_queue->qid); - if (id == *qres->apqi) - qres->reserved = true; - } else { - return -EINVAL; - } - - return 0; -} - -/** - * vfio_ap_verify_queue_reserved - verifies that the AP queue containing - * @apid or @aqpi is reserved - * - * @apid: an AP adapter ID - * @apqi: an AP queue index - * - * Verifies that the AP queue with @apid/@apqi is reserved by the VFIO AP device - * driver according to the following rules: - * - * - If both @apid and @apqi are not NULL, then there must be an AP queue - * device bound to the vfio_ap driver with the APQN identified by @apid and - * @apqi - * - * - If only @apid is not NULL, then there must be an AP queue device bound - * to the vfio_ap driver with an APQN containing @apid - * - * - If only @apqi is not NULL, then there must be an AP queue device bound - * to the vfio_ap driver with an APQN containing @apqi - * - * Return: 0 if the AP queue is reserved; otherwise, returns -EADDRNOTAVAIL. - */ -static int vfio_ap_verify_queue_reserved(unsigned long *apid, - unsigned long *apqi) -{ - int ret; - struct vfio_ap_queue_reserved qres; - - qres.apid = apid; - qres.apqi = apqi; - qres.reserved = false; - - ret = driver_for_each_device(&matrix_dev->vfio_ap_drv->driver, NULL, - &qres, vfio_ap_has_queue); - if (ret) - return ret; - - if (qres.reserved) - return 0; - - return -EADDRNOTAVAIL; -} - -static int -vfio_ap_mdev_verify_queues_reserved_for_apid(struct ap_matrix_mdev *matrix_mdev, - unsigned long apid) -{ - int ret; - unsigned long apqi; - unsigned long nbits = matrix_mdev->matrix.aqm_max + 1; - - if (find_first_bit_inv(matrix_mdev->matrix.aqm, nbits) >= nbits) - return vfio_ap_verify_queue_reserved(&apid, NULL); - - for_each_set_bit_inv(apqi, matrix_mdev->matrix.aqm, nbits) { - ret = vfio_ap_verify_queue_reserved(&apid, &apqi); - if (ret) - return ret; - } + unsigned long apid, apqi; + const struct device *dev = mdev_dev(matrix_mdev->mdev); + const char *mdev_name = dev_name(dev); - return 0; + for_each_set_bit_inv(apid, apm, AP_DEVICES) + for_each_set_bit_inv(apqi, aqm, AP_DOMAINS) + dev_warn(dev, MDEV_SHARING_ERR, apid, apqi, mdev_name); } /** - * vfio_ap_mdev_verify_no_sharing - verifies that the AP matrix is not configured + * vfio_ap_mdev_verify_no_sharing - verify APQNs are not shared by matrix mdevs * - * @matrix_mdev: the mediated matrix device + * @mdev_apm: mask indicating the APIDs of the APQNs to be verified + * @mdev_aqm: mask indicating the APQIs of the APQNs to be verified * - * Verifies that the APQNs derived from the cross product of the AP adapter IDs - * and AP queue indexes comprising the AP matrix are not configured for another + * Verifies that each APQN derived from the Cartesian product of a bitmap of + * AP adapter IDs and AP queue indexes is not configured for any matrix * mediated device. AP queue sharing is not allowed. * - * Return: 0 if the APQNs are not shared; otherwise returns -EADDRINUSE. + * Return: 0 if the APQNs are not shared; otherwise return -EADDRINUSE. */ -static int vfio_ap_mdev_verify_no_sharing(struct ap_matrix_mdev *matrix_mdev) +static int vfio_ap_mdev_verify_no_sharing(unsigned long *mdev_apm, + unsigned long *mdev_aqm) { - struct ap_matrix_mdev *lstdev; + struct ap_matrix_mdev *matrix_mdev; DECLARE_BITMAP(apm, AP_DEVICES); DECLARE_BITMAP(aqm, AP_DOMAINS); - list_for_each_entry(lstdev, &matrix_dev->mdev_list, node) { - if (matrix_mdev == lstdev) + list_for_each_entry(matrix_mdev, &matrix_dev->mdev_list, node) { + /* + * If the input apm and aqm belong to the matrix_mdev's matrix, + * then move on to the next. + */ + if (mdev_apm == matrix_mdev->matrix.apm && + mdev_aqm == matrix_mdev->matrix.aqm) continue; memset(apm, 0, sizeof(apm)); @@ -664,20 +571,32 @@ static int vfio_ap_mdev_verify_no_sharing(struct ap_matrix_mdev *matrix_mdev) * We work on full longs, as we can only exclude the leftover * bits in non-inverse order. The leftover is all zeros. */ - if (!bitmap_and(apm, matrix_mdev->matrix.apm, - lstdev->matrix.apm, AP_DEVICES)) + if (!bitmap_and(apm, mdev_apm, matrix_mdev->matrix.apm, + AP_DEVICES)) continue; - if (!bitmap_and(aqm, matrix_mdev->matrix.aqm, - lstdev->matrix.aqm, AP_DOMAINS)) + if (!bitmap_and(aqm, mdev_aqm, matrix_mdev->matrix.aqm, + AP_DOMAINS)) continue; + vfio_ap_mdev_log_sharing_err(matrix_mdev, apm, aqm); + return -EADDRINUSE; } return 0; } +static int vfio_ap_mdev_validate_masks(struct ap_matrix_mdev *matrix_mdev) +{ + if (ap_apqn_in_matrix_owned_by_def_drv(matrix_mdev->matrix.apm, + matrix_mdev->matrix.aqm)) + return -EADDRNOTAVAIL; + + return vfio_ap_mdev_verify_no_sharing(matrix_mdev->matrix.apm, + matrix_mdev->matrix.aqm); +} + static void vfio_ap_mdev_link_adapter(struct ap_matrix_mdev *matrix_mdev, unsigned long apid) { @@ -743,28 +662,17 @@ static ssize_t assign_adapter_store(struct device *dev, goto done; } - /* - * Set the bit in the AP mask (APM) corresponding to the AP adapter - * number (APID). The bits in the mask, from most significant to least - * significant bit, correspond to APIDs 0-255. - */ - ret = vfio_ap_mdev_verify_queues_reserved_for_apid(matrix_mdev, apid); - if (ret) - goto done; - set_bit_inv(apid, matrix_mdev->matrix.apm); - ret = vfio_ap_mdev_verify_no_sharing(matrix_mdev); - if (ret) - goto share_err; + ret = vfio_ap_mdev_validate_masks(matrix_mdev); + if (ret) { + clear_bit_inv(apid, matrix_mdev->matrix.apm); + goto done; + } vfio_ap_mdev_link_adapter(matrix_mdev, apid); vfio_ap_mdev_filter_matrix(matrix_mdev); ret = count; - goto done; - -share_err: - clear_bit_inv(apid, matrix_mdev->matrix.apm); done: mutex_unlock(&matrix_dev->lock); @@ -836,26 +744,6 @@ static ssize_t unassign_adapter_store(struct device *dev, } static DEVICE_ATTR_WO(unassign_adapter); -static int -vfio_ap_mdev_verify_queues_reserved_for_apqi(struct ap_matrix_mdev *matrix_mdev, - unsigned long apqi) -{ - int ret; - unsigned long apid; - unsigned long nbits = matrix_mdev->matrix.apm_max + 1; - - if (find_first_bit_inv(matrix_mdev->matrix.apm, nbits) >= nbits) - return vfio_ap_verify_queue_reserved(NULL, &apqi); - - for_each_set_bit_inv(apid, matrix_mdev->matrix.apm, nbits) { - ret = vfio_ap_verify_queue_reserved(&apid, &apqi); - if (ret) - return ret; - } - - return 0; -} - static void vfio_ap_mdev_link_domain(struct ap_matrix_mdev *matrix_mdev, unsigned long apqi) { @@ -921,23 +809,17 @@ static ssize_t assign_domain_store(struct device *dev, goto done; } - ret = vfio_ap_mdev_verify_queues_reserved_for_apqi(matrix_mdev, apqi); - if (ret) - goto done; - set_bit_inv(apqi, matrix_mdev->matrix.aqm); - ret = vfio_ap_mdev_verify_no_sharing(matrix_mdev); - if (ret) - goto share_err; + ret = vfio_ap_mdev_validate_masks(matrix_mdev); + if (ret) { + clear_bit_inv(apqi, matrix_mdev->matrix.aqm); + goto done; + } vfio_ap_mdev_link_domain(matrix_mdev, apqi); vfio_ap_mdev_filter_matrix(matrix_mdev); ret = count; - goto done; - -share_err: - clear_bit_inv(apqi, matrix_mdev->matrix.aqm); done: mutex_unlock(&matrix_dev->lock); From patchwork Thu Oct 21 15:23:25 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anthony Krowiak X-Patchwork-Id: 12575569 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E227EC433FE for ; Thu, 21 Oct 2021 15:24:17 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id CAFCE60C49 for ; Thu, 21 Oct 2021 15:24:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232089AbhJUP0c (ORCPT ); Thu, 21 Oct 2021 11:26:32 -0400 Received: from mx0a-001b2d01.pphosted.com ([148.163.156.1]:22984 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232042AbhJUP01 (ORCPT ); Thu, 21 Oct 2021 11:26:27 -0400 Received: from pps.filterd (m0098393.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 19LEBIgQ022324; Thu, 21 Oct 2021 11:24:10 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pp1; bh=VbQ/mJuCdGyC9racNtzwgMXvwCvv5nztjkVzM3JKDFc=; b=GilyY4YpUEDDbvknH3PakyEQHyzC39oNqZ+ZAAOXlunqg/qYzYdhob4F3h3jct0OYep0 Ij1xJsl2wiQjW/xW7xVsIBI61Z02luGsXc6eUDv6Aelb04RdeLCUUtM0sbh4YrS9Auic 9cGmR0oP/et5CdRncof4HQi1jG2RwfJ1+cJvxDh9ZNRyV5/WcvNDCICEuYeqX9mh5SCL dQdf5TZ4Vv0NrB8VdGqz8EEm5/k2mghiZsGtG8mwKYTMW8rtKqCiZ5IK3ge3Dy55OEfa Nwuv2ugMf5wJirnBPRrN52DaqqtQuJdJS9n04zhSAW3ic/LvX8Zz2Ppd+MtdmvzRIryo zg== Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com with ESMTP id 3bty9eqs4f-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 21 Oct 2021 11:24:10 -0400 Received: from m0098393.ppops.net (m0098393.ppops.net [127.0.0.1]) by pps.reinject (8.16.0.43/8.16.0.43) with SMTP id 19LFEsd0018289; Thu, 21 Oct 2021 11:24:09 -0400 Received: from ppma01wdc.us.ibm.com (fd.55.37a9.ip4.static.sl-reverse.com [169.55.85.253]) by mx0a-001b2d01.pphosted.com with ESMTP id 3bty9eqs42-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 21 Oct 2021 11:24:09 -0400 Received: from pps.filterd (ppma01wdc.us.ibm.com [127.0.0.1]) by ppma01wdc.us.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 19LF42IY009616; Thu, 21 Oct 2021 15:24:08 GMT Received: from b03cxnp08027.gho.boulder.ibm.com (b03cxnp08027.gho.boulder.ibm.com [9.17.130.19]) by ppma01wdc.us.ibm.com with ESMTP id 3bqpccfqqr-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 21 Oct 2021 15:24:08 +0000 Received: from b03ledav005.gho.boulder.ibm.com (b03ledav005.gho.boulder.ibm.com [9.17.130.236]) by b03cxnp08027.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 19LFO7d119989052 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 21 Oct 2021 15:24:07 GMT Received: from b03ledav005.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 0A229BE05B; Thu, 21 Oct 2021 15:24:07 +0000 (GMT) Received: from b03ledav005.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id CC32FBE051; Thu, 21 Oct 2021 15:24:04 +0000 (GMT) Received: from cpe-172-100-181-211.stny.res.rr.com.com (unknown [9.160.98.118]) by b03ledav005.gho.boulder.ibm.com (Postfix) with ESMTP; Thu, 21 Oct 2021 15:24:04 +0000 (GMT) From: Tony Krowiak To: linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: jjherne@linux.ibm.com, freude@linux.ibm.com, borntraeger@de.ibm.com, cohuck@redhat.com, mjrosato@linux.ibm.com, pasic@linux.ibm.com, alex.williamson@redhat.com, kwankhede@nvidia.com, fiuczy@linux.ibm.com, Tony Krowiak Subject: [PATCH v17 08/15] s390/vfio-ap: keep track of active guests Date: Thu, 21 Oct 2021 11:23:25 -0400 Message-Id: <20211021152332.70455-9-akrowiak@linux.ibm.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20211021152332.70455-1-akrowiak@linux.ibm.com> References: <20211021152332.70455-1-akrowiak@linux.ibm.com> MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: fOP7575zIzd1O7jp5y4EwFbM5yQ0QO9h X-Proofpoint-GUID: Wr_NL-PjEdJsapzS7D36c3J-pK9gWaE5 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.182.1,Aquarius:18.0.790,Hydra:6.0.425,FMLib:17.0.607.475 definitions=2021-10-21_04,2021-10-21_02,2020-04-07_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 mlxscore=0 priorityscore=1501 malwarescore=0 spamscore=0 phishscore=0 lowpriorityscore=0 mlxlogscore=999 bulkscore=0 impostorscore=0 clxscore=1015 adultscore=0 suspectscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2109230001 definitions=main-2110210079 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org The vfio_ap device driver registers for notification when the pointer to the KVM object for a guest is set. Let's store the KVM pointer as well as the pointer to the mediated device when the KVM pointer is set. The reason for storing the KVM and mediated device pointers is to facilitate hot plug/unplug of AP queues for a KVM guest when a queue device is probed or removed. When a guest's APCB is hot plugged into the guest, the kvm->lock must be taken prior to taking the matrix_dev->lock, or there is potential for a lockdep splat (see below). Unfortunately, when a queue is probed or removed, we have no idea whether it is assigned to a guest or which KVM object is associated with the guest. If we take the matrix_dev->lock to determine whether the APQN is assigned to a running guest then subsequently take the kvm->lock, in certain situations that will result in a lockdep splat: * see commit 0cc00c8d4050 ("Fix circular lockdep when setting/clearing crypto masks") * see commit 86956e70761b ("replace open coded locks for VFIO_GROUP_NOTIFY_SET_KVM notification") The reason a lockdep splat can occur has to do with the fact that the kvm->lock has to be taken before the vcpu->lock; so, for example, when a secure execution guest is started, you may end up with the following scenario: Interception of PQAP(AQIC) instruction executed on the guest: ------------------------------------------------------------ handle_pqap: matrix_dev->lock kvm_vcpu_ioctl: vcpu_mutex Start of secure execution guest: ------------------------------- kvm_s390_cpus_to_pv: vcpu->mutex kvm_arch_vm_ioctl: kvm->lock Queue is unbound from vfio_ap device driver: ------------------------------------------- kvm->lock vfio_ap_mdev_remove_queue: matrix_dev->lock This patch introduces a new ap_guest structure into which the pointers to the kvm and matrix_mdev can be stored. It also introduces two new fields in the struct ap_matrix_dev: struct ap_matrix_dev { ... struct rw_semaphore guests_lock; struct list_head guests; ... } The 'guests_lock' field is a r/w semaphore to control access to the 'guests' field. The 'guests' field is a list of ap_guest structures containing the KVM and matrix_mdev pointers for each active guest. An ap_guest structure will be stored into the list whenever the vfio_ap device driver is notified that the KVM pointer has been set and removed when notified that the KVM pointer has been cleared. Signed-off-by: Tony Krowiak --- drivers/s390/crypto/vfio_ap_drv.c | 2 ++ drivers/s390/crypto/vfio_ap_ops.c | 44 +++++++++++++++++++++++++-- drivers/s390/crypto/vfio_ap_private.h | 10 ++++++ 3 files changed, 53 insertions(+), 3 deletions(-) diff --git a/drivers/s390/crypto/vfio_ap_drv.c b/drivers/s390/crypto/vfio_ap_drv.c index 5255e338591d..1d1746fe50ea 100644 --- a/drivers/s390/crypto/vfio_ap_drv.c +++ b/drivers/s390/crypto/vfio_ap_drv.c @@ -98,6 +98,8 @@ static int vfio_ap_matrix_dev_create(void) mutex_init(&matrix_dev->lock); INIT_LIST_HEAD(&matrix_dev->mdev_list); + init_rwsem(&matrix_dev->guests_lock); + INIT_LIST_HEAD(&matrix_dev->guests); dev_set_name(&matrix_dev->device, "%s", VFIO_AP_DEV_NAME); matrix_dev->device.parent = root_device; diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c index 6b40db6dab3c..a2875cf79091 100644 --- a/drivers/s390/crypto/vfio_ap_ops.c +++ b/drivers/s390/crypto/vfio_ap_ops.c @@ -1086,6 +1086,20 @@ static const struct attribute_group *vfio_ap_mdev_attr_groups[] = { NULL }; +static int vfio_ap_mdev_create_guest(struct kvm *kvm, + struct ap_matrix_mdev *matrix_mdev) +{ + struct ap_guest *guest; + + guest = kzalloc(sizeof(*guest), GFP_KERNEL); + if (!guest) + return -ENOMEM; + + list_add(&guest->node, &matrix_dev->guests); + + return 0; +} + /** * vfio_ap_mdev_set_kvm - sets all data for @matrix_mdev that are needed * to manage AP resources for the guest whose state is represented by @kvm @@ -1106,16 +1120,23 @@ static const struct attribute_group *vfio_ap_mdev_attr_groups[] = { static int vfio_ap_mdev_set_kvm(struct ap_matrix_mdev *matrix_mdev, struct kvm *kvm) { + int ret; struct ap_matrix_mdev *m; if (kvm->arch.crypto.crycbd) { + down_write(&matrix_dev->guests_lock); + ret = vfio_ap_mdev_create_guest(kvm, matrix_mdev); + if (WARN_ON(ret)) + return ret; + mutex_lock(&kvm->lock); mutex_lock(&matrix_dev->lock); list_for_each_entry(m, &matrix_dev->mdev_list, node) { if (m != matrix_mdev && m->kvm == kvm) { - mutex_unlock(&kvm->lock); mutex_unlock(&matrix_dev->lock); + mutex_unlock(&kvm->lock); + up_write(&matrix_dev->guests_lock); return -EPERM; } } @@ -1127,8 +1148,9 @@ static int vfio_ap_mdev_set_kvm(struct ap_matrix_mdev *matrix_mdev, matrix_mdev->shadow_apcb.aqm, matrix_mdev->shadow_apcb.adm); - mutex_unlock(&kvm->lock); mutex_unlock(&matrix_dev->lock); + mutex_unlock(&kvm->lock); + up_write(&matrix_dev->guests_lock); } return 0; @@ -1164,6 +1186,18 @@ static int vfio_ap_mdev_iommu_notifier(struct notifier_block *nb, return NOTIFY_DONE; } +static void vfio_ap_mdev_remove_guest(struct kvm *kvm) +{ + struct ap_guest *guest, *tmp; + + list_for_each_entry_safe(guest, tmp, &matrix_dev->guests, node) { + if (guest->kvm == kvm) { + list_del(&guest->node); + break; + } + } +} + /** * vfio_ap_mdev_unset_kvm - performs clean-up of resources no longer needed * by @matrix_mdev. @@ -1182,6 +1216,9 @@ static void vfio_ap_mdev_unset_kvm(struct ap_matrix_mdev *matrix_mdev, struct kvm *kvm) { if (kvm && kvm->arch.crypto.crycbd) { + down_write(&matrix_dev->guests_lock); + vfio_ap_mdev_remove_guest(kvm); + mutex_lock(&kvm->lock); mutex_lock(&matrix_dev->lock); @@ -1191,8 +1228,9 @@ static void vfio_ap_mdev_unset_kvm(struct ap_matrix_mdev *matrix_mdev, kvm->arch.crypto.data = NULL; matrix_mdev->kvm = NULL; - mutex_unlock(&kvm->lock); mutex_unlock(&matrix_dev->lock); + mutex_unlock(&kvm->lock); + up_write(&matrix_dev->guests_lock); } } diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h index 6dc0ebbf7a06..6d28b287d7bf 100644 --- a/drivers/s390/crypto/vfio_ap_private.h +++ b/drivers/s390/crypto/vfio_ap_private.h @@ -26,6 +26,11 @@ #define VFIO_AP_MODULE_NAME "vfio_ap" #define VFIO_AP_DRV_NAME "vfio_ap" +struct ap_guest { + struct kvm *kvm; + struct list_head node; +}; + /** * struct ap_matrix_dev - Contains the data for the matrix device. * @@ -39,6 +44,9 @@ * single ap_matrix_mdev device. It's quite coarse but we don't * expect much contention. * @vfio_ap_drv: the vfio_ap device driver + * @guests_lock: r/w semaphore for protecting access to @guests + * @guests: list of guests (struct ap_guest) using AP devices bound to the + * vfio_ap device driver. */ struct ap_matrix_dev { struct device device; @@ -47,6 +55,8 @@ struct ap_matrix_dev { struct list_head mdev_list; struct mutex lock; struct ap_driver *vfio_ap_drv; + struct rw_semaphore guests_lock; + struct list_head guests; }; extern struct ap_matrix_dev *matrix_dev; From patchwork Thu Oct 21 15:23:26 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anthony Krowiak X-Patchwork-Id: 12575573 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9A677C4332F for ; Thu, 21 Oct 2021 15:24:21 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 7DBB86136F for ; Thu, 21 Oct 2021 15:24:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232138AbhJUP0g (ORCPT ); Thu, 21 Oct 2021 11:26:36 -0400 Received: from mx0a-001b2d01.pphosted.com ([148.163.156.1]:12428 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231833AbhJUP03 (ORCPT ); Thu, 21 Oct 2021 11:26:29 -0400 Received: from pps.filterd (m0098409.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 19LEBxZ2026868; Thu, 21 Oct 2021 11:24:12 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pp1; bh=Fwa+ixfhx945VSlyuzhpDZvI7tkAN13lmAFl4Vhr6r8=; b=OQyuPRmnU8OuYmIeK+aJmU4eJByJ/mz67yDpYouMEMavGEXGaAPRAxFbBfVCcEeo0k3e Yc14t+JhfN5tmvLRPoaqjPfyxkKD7WphehtoBCkA+viMRqssfxsUWdtprZ1oVu8b76gH YRxA23TPqP0WqcdDmeN7k9UWczFcIRlWeXNlQzk/Q9yE8VJXrWgUKMWcb8l6D9631aoI 4Za1SSqybuth07GSNp1q6mskuZ942wYY9DnbFisOd2qBVMowz7D8GyHagSIllxZqsAC2 cqR/uG+GauWb1CwlQjRx//CVV5q4hXtMjiXvy4B9OXU6rX9ojm2Bqxgq+0t2nEvA+pHF ug== Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com with ESMTP id 3btthkw4w8-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 21 Oct 2021 11:24:12 -0400 Received: from m0098409.ppops.net (m0098409.ppops.net [127.0.0.1]) by pps.reinject (8.16.0.43/8.16.0.43) with SMTP id 19LF0FsT029507; Thu, 21 Oct 2021 11:24:12 -0400 Received: from ppma02dal.us.ibm.com (a.bd.3ea9.ip4.static.sl-reverse.com [169.62.189.10]) by mx0a-001b2d01.pphosted.com with ESMTP id 3btthkw4vt-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 21 Oct 2021 11:24:11 -0400 Received: from pps.filterd (ppma02dal.us.ibm.com [127.0.0.1]) by ppma02dal.us.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 19LF3ifX004818; Thu, 21 Oct 2021 15:24:11 GMT Received: from b03cxnp07028.gho.boulder.ibm.com (b03cxnp07028.gho.boulder.ibm.com [9.17.130.15]) by ppma02dal.us.ibm.com with ESMTP id 3bqpcd8hux-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 21 Oct 2021 15:24:10 +0000 Received: from b03ledav005.gho.boulder.ibm.com (b03ledav005.gho.boulder.ibm.com [9.17.130.236]) by b03cxnp07028.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 19LFO9sb22282762 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 21 Oct 2021 15:24:09 GMT Received: from b03ledav005.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 2D58EBE05D; Thu, 21 Oct 2021 15:24:09 +0000 (GMT) Received: from b03ledav005.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 3DC18BE05B; Thu, 21 Oct 2021 15:24:07 +0000 (GMT) Received: from cpe-172-100-181-211.stny.res.rr.com.com (unknown [9.160.98.118]) by b03ledav005.gho.boulder.ibm.com (Postfix) with ESMTP; Thu, 21 Oct 2021 15:24:07 +0000 (GMT) From: Tony Krowiak To: linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: jjherne@linux.ibm.com, freude@linux.ibm.com, borntraeger@de.ibm.com, cohuck@redhat.com, mjrosato@linux.ibm.com, pasic@linux.ibm.com, alex.williamson@redhat.com, kwankhede@nvidia.com, fiuczy@linux.ibm.com, Tony Krowiak Subject: [PATCH v17 09/15] s390/vfio-ap: allow hot plug/unplug of AP resources using mdev device Date: Thu, 21 Oct 2021 11:23:26 -0400 Message-Id: <20211021152332.70455-10-akrowiak@linux.ibm.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20211021152332.70455-1-akrowiak@linux.ibm.com> References: <20211021152332.70455-1-akrowiak@linux.ibm.com> MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: qG8TzNi689NroA-AmlAGVNjDzzukGEpd X-Proofpoint-GUID: C86LBx8eYtpO1jHO5jjCP7OmD_evkxZb X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.182.1,Aquarius:18.0.790,Hydra:6.0.425,FMLib:17.0.607.475 definitions=2021-10-21_04,2021-10-21_02,2020-04-07_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 adultscore=0 clxscore=1015 mlxscore=0 suspectscore=0 bulkscore=0 phishscore=0 spamscore=0 impostorscore=0 malwarescore=0 priorityscore=1501 lowpriorityscore=0 mlxlogscore=999 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2109230001 definitions=main-2110210079 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Let's allow adapters, domains and control domains to be hot plugged into and hot unplugged from a KVM guest using a matrix mdev when: * The adapter, domain or control domain is assigned to or unassigned from the matrix mdev * A queue device with an APQN assigned to the matrix mdev is bound to or unbound from the vfio_ap device driver. Whenever an assignment or unassignment of an adapter, domain or control domain is performed as well as when a bind or unbind of a queue device is executed, the AP configuration assigned to the matrix mediated device will be filtered and assigned to the AP control block (APCB) that supplies the AP configuration to the guest so that no adapter, domain or control domain that is not in the host's AP configuration nor any APQN that does not reference a queue device bound to the vfio_ap device driver is assigned. After updating the APCB, if the mdev is in use by a KVM guest, it is hot plugged into the guest to dynamically provide access to the adapters, domains and control domains provided via the newly refreshed APCB. Keep in mind that the kvm->lock must be taken outside of the matrix_mdev->lock to avoid circular lock dependencies (i.e., a lockdep splat). This will necessitate taking the matrix_dev->guests_lock in order to find the guest(s) in the matrix_dev->guests list to which the affected APQN(s) may be assigned. The kvm->lock can then be taken prior to the matrix_dev->lock and the APCB plugged into the guest without any problem. Signed-off-by: Tony Krowiak --- drivers/s390/crypto/vfio_ap_ops.c | 388 ++++++++++++++++---------- drivers/s390/crypto/vfio_ap_private.h | 7 +- 2 files changed, 238 insertions(+), 157 deletions(-) diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c index a2875cf79091..5a484e7afbd0 100644 --- a/drivers/s390/crypto/vfio_ap_ops.c +++ b/drivers/s390/crypto/vfio_ap_ops.c @@ -105,8 +105,8 @@ static void vfio_ap_free_aqic_resources(struct vfio_ap_queue *q) if (!q) return; if (q->saved_isc != VFIO_AP_ISC_INVALID && - !WARN_ON(!(q->matrix_mdev && q->matrix_mdev->kvm))) { - kvm_s390_gisc_unregister(q->matrix_mdev->kvm, q->saved_isc); + !WARN_ON(!(q->matrix_mdev && q->matrix_mdev->guest->kvm))) { + kvm_s390_gisc_unregister(q->matrix_mdev->guest->kvm, q->saved_isc); q->saved_isc = VFIO_AP_ISC_INVALID; } if (q->saved_pfn && !WARN_ON(!q->matrix_mdev)) { @@ -211,7 +211,7 @@ static struct ap_queue_status vfio_ap_irq_enable(struct vfio_ap_queue *q, return status; } - kvm = q->matrix_mdev->kvm; + kvm = q->matrix_mdev->guest->kvm; gisa = kvm->arch.gisa_int.origin; h_nib = (h_pfn << PAGE_SHIFT) | (nib & ~PAGE_MASK); @@ -283,7 +283,7 @@ static int handle_pqap(struct kvm_vcpu *vcpu) matrix_mdev = vcpu->kvm->arch.crypto.data; /* If the there is no guest using the mdev, there is nothing to do */ - if (!matrix_mdev || !matrix_mdev->kvm) + if (!matrix_mdev || !matrix_mdev->guest->kvm) goto out_unlock; q = vfio_ap_mdev_get_queue(matrix_mdev, apqn); @@ -314,10 +314,25 @@ static void vfio_ap_matrix_init(struct ap_config_info *info, matrix->adm_max = info->apxa ? info->Nd : 15; } -static void vfio_ap_mdev_filter_cdoms(struct ap_matrix_mdev *matrix_mdev) +static void vfio_ap_mdev_hotplug_apcb(struct ap_matrix_mdev *matrix_mdev) { + if (matrix_mdev->guest->kvm) + kvm_arch_crypto_set_masks(matrix_mdev->guest->kvm, + matrix_mdev->shadow_apcb.apm, + matrix_mdev->shadow_apcb.aqm, + matrix_mdev->shadow_apcb.adm); +} + +static bool vfio_ap_mdev_filter_cdoms(struct ap_matrix_mdev *matrix_mdev) +{ + DECLARE_BITMAP(shadow_adm, AP_DOMAINS); + + bitmap_copy(shadow_adm, matrix_mdev->shadow_apcb.adm, AP_DOMAINS); bitmap_and(matrix_mdev->shadow_apcb.adm, matrix_mdev->matrix.adm, (unsigned long *)matrix_dev->info.adm, AP_DOMAINS); + + return !bitmap_equal(shadow_adm, matrix_mdev->shadow_apcb.adm, + AP_DOMAINS); } /* @@ -327,16 +342,23 @@ static void vfio_ap_mdev_filter_cdoms(struct ap_matrix_mdev *matrix_mdev) * queue device bound to the vfio_ap device driver. * * @matrix_mdev: the mdev whose AP configuration is to be filtered. + * + * Return: a boolean value indicating whether the KVM guest's APCB was changed + * by the filtering or not. */ -static void vfio_ap_mdev_filter_matrix(struct ap_matrix_mdev *matrix_mdev) +static bool vfio_ap_mdev_filter_matrix(struct ap_matrix_mdev *matrix_mdev) { int ret; unsigned long apid, apqi, apqn; + DECLARE_BITMAP(shadow_apm, AP_DEVICES); + DECLARE_BITMAP(shadow_aqm, AP_DOMAINS); ret = ap_qci(&matrix_dev->info); if (ret) - return; + return false; + bitmap_copy(shadow_apm, matrix_mdev->shadow_apcb.apm, AP_DEVICES); + bitmap_copy(shadow_aqm, matrix_mdev->shadow_apcb.aqm, AP_DOMAINS); vfio_ap_matrix_init(&matrix_dev->info, &matrix_mdev->shadow_apcb); /* @@ -368,6 +390,11 @@ static void vfio_ap_mdev_filter_matrix(struct ap_matrix_mdev *matrix_mdev) } } } + + return !bitmap_equal(shadow_apm, matrix_mdev->shadow_apcb.apm, + AP_DEVICES) || + !bitmap_equal(shadow_aqm, matrix_mdev->shadow_apcb.aqm, + AP_DOMAINS); } static int vfio_ap_mdev_probe(struct mdev_device *mdev) @@ -607,6 +634,37 @@ static void vfio_ap_mdev_link_adapter(struct ap_matrix_mdev *matrix_mdev, AP_MKQID(apid, apqi)); } +/** + * vfio_ap_mdev_get_locks - lock the kvm->lock and matrix_dev->lock mutexes + * + * @matrix_mdev: the matrix mediated device object + */ +static void vfio_ap_mdev_get_locks(struct ap_matrix_mdev *matrix_mdev) +{ + down_read(&matrix_dev->guests_lock); + + /* The kvm->lock must be must be taken before the matrix_dev->lock */ + if (matrix_mdev->guest) + mutex_lock(&matrix_mdev->guest->kvm->lock); + + mutex_lock(&matrix_dev->lock); +} + +/** + * vfio_ap_mdev_put_locks - release the kvm->lock and matrix_dev->lock mutexes + * + * @matrix_mdev: the matrix mediated device object + */ +static void vfio_ap_mdev_put_locks(struct ap_matrix_mdev *matrix_mdev) +{ + /* The kvm->lock must be must be taken before the matrix_dev->lock */ + if (matrix_mdev->guest) + mutex_unlock(&matrix_mdev->guest->kvm->lock); + + mutex_unlock(&matrix_dev->lock); + up_read(&matrix_dev->guests_lock); +} + /** * assign_adapter_store - parses the APID from @buf and sets the * corresponding bit in the mediated matrix device's APM @@ -645,23 +703,14 @@ static ssize_t assign_adapter_store(struct device *dev, unsigned long apid; struct ap_matrix_mdev *matrix_mdev = dev_get_drvdata(dev); - mutex_lock(&matrix_dev->lock); - - /* If the KVM guest is running, disallow assignment of adapter */ - if (matrix_mdev->kvm) { - ret = -EBUSY; - goto done; - } - ret = kstrtoul(buf, 0, &apid); if (ret) - goto done; + return ret; - if (apid > matrix_mdev->matrix.apm_max) { - ret = -ENODEV; - goto done; - } + if (apid > matrix_mdev->matrix.apm_max) + return -ENODEV; + vfio_ap_mdev_get_locks(matrix_mdev); set_bit_inv(apid, matrix_mdev->matrix.apm); ret = vfio_ap_mdev_validate_masks(matrix_mdev); @@ -671,10 +720,13 @@ static ssize_t assign_adapter_store(struct device *dev, } vfio_ap_mdev_link_adapter(matrix_mdev, apid); - vfio_ap_mdev_filter_matrix(matrix_mdev); + + if (vfio_ap_mdev_filter_matrix(matrix_mdev)) + vfio_ap_mdev_hotplug_apcb(matrix_mdev); + ret = count; done: - mutex_unlock(&matrix_dev->lock); + vfio_ap_mdev_put_locks(matrix_mdev); return ret; } @@ -717,30 +769,23 @@ static ssize_t unassign_adapter_store(struct device *dev, unsigned long apid; struct ap_matrix_mdev *matrix_mdev = dev_get_drvdata(dev); - mutex_lock(&matrix_dev->lock); - - /* If the KVM guest is running, disallow unassignment of adapter */ - if (matrix_mdev->kvm) { - ret = -EBUSY; - goto done; - } - ret = kstrtoul(buf, 0, &apid); if (ret) - goto done; + return ret; - if (apid > matrix_mdev->matrix.apm_max) { - ret = -ENODEV; - goto done; - } + if (apid > matrix_mdev->matrix.apm_max) + return -ENODEV; + vfio_ap_mdev_get_locks(matrix_mdev); clear_bit_inv((unsigned long)apid, matrix_mdev->matrix.apm); vfio_ap_mdev_unlink_adapter(matrix_mdev, apid); - vfio_ap_mdev_filter_matrix(matrix_mdev); - ret = count; -done: - mutex_unlock(&matrix_dev->lock); - return ret; + + if (vfio_ap_mdev_filter_matrix(matrix_mdev)) + vfio_ap_mdev_hotplug_apcb(matrix_mdev); + + vfio_ap_mdev_put_locks(matrix_mdev); + + return count; } static DEVICE_ATTR_WO(unassign_adapter); @@ -793,22 +838,13 @@ static ssize_t assign_domain_store(struct device *dev, struct ap_matrix_mdev *matrix_mdev = dev_get_drvdata(dev); unsigned long max_apqi = matrix_mdev->matrix.aqm_max; - mutex_lock(&matrix_dev->lock); - - /* If the KVM guest is running, disallow assignment of domain */ - if (matrix_mdev->kvm) { - ret = -EBUSY; - goto done; - } - ret = kstrtoul(buf, 0, &apqi); if (ret) - goto done; - if (apqi > max_apqi) { - ret = -ENODEV; - goto done; - } + return ret; + if (apqi > max_apqi) + return -ENODEV; + vfio_ap_mdev_get_locks(matrix_mdev); set_bit_inv(apqi, matrix_mdev->matrix.aqm); ret = vfio_ap_mdev_validate_masks(matrix_mdev); @@ -818,10 +854,13 @@ static ssize_t assign_domain_store(struct device *dev, } vfio_ap_mdev_link_domain(matrix_mdev, apqi); - vfio_ap_mdev_filter_matrix(matrix_mdev); + + if (vfio_ap_mdev_filter_matrix(matrix_mdev)) + vfio_ap_mdev_hotplug_apcb(matrix_mdev); + ret = count; done: - mutex_unlock(&matrix_dev->lock); + vfio_ap_mdev_put_locks(matrix_mdev); return ret; } @@ -864,31 +903,23 @@ static ssize_t unassign_domain_store(struct device *dev, unsigned long apqi; struct ap_matrix_mdev *matrix_mdev = dev_get_drvdata(dev); - mutex_lock(&matrix_dev->lock); - - /* If the KVM guest is running, disallow unassignment of domain */ - if (matrix_mdev->kvm) { - ret = -EBUSY; - goto done; - } - ret = kstrtoul(buf, 0, &apqi); if (ret) - goto done; + return ret; - if (apqi > matrix_mdev->matrix.aqm_max) { - ret = -ENODEV; - goto done; - } + if (apqi > matrix_mdev->matrix.aqm_max) + return -ENODEV; + vfio_ap_mdev_get_locks(matrix_mdev); clear_bit_inv((unsigned long)apqi, matrix_mdev->matrix.aqm); vfio_ap_mdev_unlink_domain(matrix_mdev, apqi); - vfio_ap_mdev_filter_matrix(matrix_mdev); - ret = count; -done: - mutex_unlock(&matrix_dev->lock); - return ret; + if (vfio_ap_mdev_filter_matrix(matrix_mdev)) + vfio_ap_mdev_hotplug_apcb(matrix_mdev); + + vfio_ap_mdev_put_locks(matrix_mdev); + + return count; } static DEVICE_ATTR_WO(unassign_domain); @@ -914,22 +945,14 @@ static ssize_t assign_control_domain_store(struct device *dev, unsigned long id; struct ap_matrix_mdev *matrix_mdev = dev_get_drvdata(dev); - mutex_lock(&matrix_dev->lock); - - /* If the KVM guest is running, disallow assignment of control domain */ - if (matrix_mdev->kvm) { - ret = -EBUSY; - goto done; - } - ret = kstrtoul(buf, 0, &id); if (ret) - goto done; + return ret; - if (id > matrix_mdev->matrix.adm_max) { - ret = -ENODEV; - goto done; - } + if (id > matrix_mdev->matrix.adm_max) + return -ENODEV; + + vfio_ap_mdev_get_locks(matrix_mdev); /* Set the bit in the ADM (bitmask) corresponding to the AP control * domain number (id). The bits in the mask, from most significant to @@ -937,11 +960,13 @@ static ssize_t assign_control_domain_store(struct device *dev, * number of control domains that can be assigned. */ set_bit_inv(id, matrix_mdev->matrix.adm); - vfio_ap_mdev_filter_cdoms(matrix_mdev); - ret = count; -done: - mutex_unlock(&matrix_dev->lock); - return ret; + + if (vfio_ap_mdev_filter_cdoms(matrix_mdev)) + vfio_ap_mdev_hotplug_apcb(matrix_mdev); + + vfio_ap_mdev_put_locks(matrix_mdev); + + return count; } static DEVICE_ATTR_WO(assign_control_domain); @@ -968,28 +993,21 @@ static ssize_t unassign_control_domain_store(struct device *dev, struct ap_matrix_mdev *matrix_mdev = dev_get_drvdata(dev); unsigned long max_domid = matrix_mdev->matrix.adm_max; - mutex_lock(&matrix_dev->lock); - - /* If a KVM guest is running, disallow unassignment of control domain */ - if (matrix_mdev->kvm) { - ret = -EBUSY; - goto done; - } - ret = kstrtoul(buf, 0, &domid); if (ret) - goto done; - if (domid > max_domid) { - ret = -ENODEV; - goto done; - } + return ret; + if (domid > max_domid) + return -ENODEV; + vfio_ap_mdev_get_locks(matrix_mdev); clear_bit_inv(domid, matrix_mdev->matrix.adm); - clear_bit_inv(domid, matrix_mdev->shadow_apcb.adm); - ret = count; -done: - mutex_unlock(&matrix_dev->lock); - return ret; + + if (vfio_ap_mdev_filter_cdoms(matrix_mdev)) + vfio_ap_mdev_hotplug_apcb(matrix_mdev); + + vfio_ap_mdev_put_locks(matrix_mdev); + + return count; } static DEVICE_ATTR_WO(unassign_control_domain); @@ -1095,6 +1113,9 @@ static int vfio_ap_mdev_create_guest(struct kvm *kvm, if (!guest) return -ENOMEM; + guest->kvm = kvm; + guest->matrix_mdev = matrix_mdev; + matrix_mdev->guest = guest; list_add(&guest->node, &matrix_dev->guests); return 0; @@ -1123,17 +1144,15 @@ static int vfio_ap_mdev_set_kvm(struct ap_matrix_mdev *matrix_mdev, int ret; struct ap_matrix_mdev *m; - if (kvm->arch.crypto.crycbd) { - down_write(&matrix_dev->guests_lock); - ret = vfio_ap_mdev_create_guest(kvm, matrix_mdev); - if (WARN_ON(ret)) - return ret; + down_write(&matrix_dev->guests_lock); + if (kvm->arch.crypto.crycbd) { mutex_lock(&kvm->lock); mutex_lock(&matrix_dev->lock); list_for_each_entry(m, &matrix_dev->mdev_list, node) { - if (m != matrix_mdev && m->kvm == kvm) { + if (m != matrix_mdev && m->guest && + m->guest->kvm == kvm) { mutex_unlock(&matrix_dev->lock); mutex_unlock(&kvm->lock); up_write(&matrix_dev->guests_lock); @@ -1141,18 +1160,27 @@ static int vfio_ap_mdev_set_kvm(struct ap_matrix_mdev *matrix_mdev, } } - kvm_get_kvm(kvm); - matrix_mdev->kvm = kvm; + ret = vfio_ap_mdev_create_guest(kvm, matrix_mdev); + if (WARN_ON(ret)) { + mutex_unlock(&matrix_dev->lock); + mutex_unlock(&kvm->lock); + up_write(&matrix_dev->guests_lock); + return ret; + } + + kvm_get_kvm(matrix_mdev->guest->kvm); kvm->arch.crypto.data = matrix_mdev; - kvm_arch_crypto_set_masks(kvm, matrix_mdev->shadow_apcb.apm, + kvm_arch_crypto_set_masks(matrix_mdev->guest->kvm, + matrix_mdev->shadow_apcb.apm, matrix_mdev->shadow_apcb.aqm, matrix_mdev->shadow_apcb.adm); mutex_unlock(&matrix_dev->lock); - mutex_unlock(&kvm->lock); - up_write(&matrix_dev->guests_lock); + mutex_unlock(&matrix_mdev->guest->kvm->lock); } + up_write(&matrix_dev->guests_lock); + return 0; } @@ -1186,16 +1214,11 @@ static int vfio_ap_mdev_iommu_notifier(struct notifier_block *nb, return NOTIFY_DONE; } -static void vfio_ap_mdev_remove_guest(struct kvm *kvm) +static void vfio_ap_mdev_remove_guest(struct ap_matrix_mdev *matrix_mdev) { - struct ap_guest *guest, *tmp; - - list_for_each_entry_safe(guest, tmp, &matrix_dev->guests, node) { - if (guest->kvm == kvm) { - list_del(&guest->node); - break; - } - } + list_del(&matrix_mdev->guest->node); + matrix_mdev->guest = NULL; + kfree(matrix_mdev->guest); } /** @@ -1203,7 +1226,6 @@ static void vfio_ap_mdev_remove_guest(struct kvm *kvm) * by @matrix_mdev. * * @matrix_mdev: a matrix mediated device - * @kvm: the pointer to the kvm structure being unset. * * Note: The matrix_dev->lock must be taken prior to calling * this function; however, the lock will be temporarily released while the @@ -1212,26 +1234,25 @@ static void vfio_ap_mdev_remove_guest(struct kvm *kvm) * certain circumstances, will result in a circular lock dependency if this is * done under the @matrix_mdev->lock. */ -static void vfio_ap_mdev_unset_kvm(struct ap_matrix_mdev *matrix_mdev, - struct kvm *kvm) +static void vfio_ap_mdev_unset_kvm(struct ap_matrix_mdev *matrix_mdev) { - if (kvm && kvm->arch.crypto.crycbd) { - down_write(&matrix_dev->guests_lock); - vfio_ap_mdev_remove_guest(kvm); + down_write(&matrix_dev->guests_lock); - mutex_lock(&kvm->lock); + if (matrix_mdev->guest) { + mutex_lock(&matrix_mdev->guest->kvm->lock); mutex_lock(&matrix_dev->lock); - kvm_arch_crypto_clear_masks(kvm); + kvm_arch_crypto_clear_masks(matrix_mdev->guest->kvm); vfio_ap_mdev_reset_queues(matrix_mdev); - kvm_put_kvm(kvm); - kvm->arch.crypto.data = NULL; - matrix_mdev->kvm = NULL; + matrix_mdev->guest->kvm->arch.crypto.data = NULL; + kvm_put_kvm(matrix_mdev->guest->kvm); mutex_unlock(&matrix_dev->lock); - mutex_unlock(&kvm->lock); - up_write(&matrix_dev->guests_lock); + mutex_unlock(&matrix_mdev->guest->kvm->lock); + vfio_ap_mdev_remove_guest(matrix_mdev); } + + up_write(&matrix_dev->guests_lock); } static int vfio_ap_mdev_group_notifier(struct notifier_block *nb, @@ -1246,7 +1267,7 @@ static int vfio_ap_mdev_group_notifier(struct notifier_block *nb, matrix_mdev = container_of(nb, struct ap_matrix_mdev, group_notifier); if (!data) - vfio_ap_mdev_unset_kvm(matrix_mdev, matrix_mdev->kvm); + vfio_ap_mdev_unset_kvm(matrix_mdev); else if (vfio_ap_mdev_set_kvm(matrix_mdev, data)) notify_rc = NOTIFY_DONE; @@ -1377,7 +1398,7 @@ static void vfio_ap_mdev_close_device(struct vfio_device *vdev) &matrix_mdev->iommu_notifier); vfio_unregister_notifier(vdev->dev, VFIO_GROUP_NOTIFY, &matrix_mdev->group_notifier); - vfio_ap_mdev_unset_kvm(matrix_mdev, matrix_mdev->kvm); + vfio_ap_mdev_unset_kvm(matrix_mdev); } static int vfio_ap_mdev_get_device_info(unsigned long arg) @@ -1504,38 +1525,99 @@ static void vfio_ap_queue_link_mdev(struct vfio_ap_queue *q) } } +/** + * vfio_ap_mdev_get_qlocks: lock all of the locks required for probe/remove + * callbacks. + * + * @apqn: the APQN of the queue device being probed or removed + * + * Return: the struct ap_guest object using the matrix mdev to which @apqn is + * assigned. + */ +static struct ap_guest *vfio_ap_mdev_get_qlocks(int apqn) +{ + struct ap_guest *guest; + unsigned long apid = AP_QID_CARD(apqn); + unsigned long apqi = AP_QID_QUEUE(apqn); + + down_read(&matrix_dev->guests_lock); + + list_for_each_entry(guest, &matrix_dev->guests, node) { + if (test_bit_inv(apid, guest->matrix_mdev->matrix.apm) && + test_bit_inv(apqi, guest->matrix_mdev->matrix.aqm)) { + mutex_lock(&guest->kvm->lock); + mutex_lock(&matrix_dev->lock); + + return guest; + } + } + + mutex_lock(&matrix_dev->lock); + + return NULL; +} + +/** + * vfio_ap_mdev_put_qlocks - unlock all of the locks required for probe/remove + * callbacks. + * + * @guest: the guest using the queue device being probed/removed + */ +static void vfio_ap_mdev_put_qlocks(struct ap_guest *guest) +{ + if (guest) + mutex_unlock(&guest->kvm->lock); + + mutex_unlock(&matrix_dev->lock); + up_read(&matrix_dev->guests_lock); +} + int vfio_ap_mdev_probe_queue(struct ap_device *apdev) { struct vfio_ap_queue *q; + struct ap_guest *guest; + int apqn = to_ap_queue(&apdev->device)->qid; q = kzalloc(sizeof(*q), GFP_KERNEL); if (!q) return -ENOMEM; - mutex_lock(&matrix_dev->lock); - q->apqn = to_ap_queue(&apdev->device)->qid; + q->apqn = apqn; q->saved_isc = VFIO_AP_ISC_INVALID; - vfio_ap_queue_link_mdev(q); - if (q->matrix_mdev) - vfio_ap_mdev_filter_matrix(q->matrix_mdev); + guest = vfio_ap_mdev_get_qlocks(apqn); + + if (guest) { + vfio_ap_mdev_link_queue(guest->matrix_mdev, q); + + if (vfio_ap_mdev_filter_matrix(guest->matrix_mdev)) + vfio_ap_mdev_hotplug_apcb(guest->matrix_mdev); + } else { + vfio_ap_queue_link_mdev(q); + } + dev_set_drvdata(&apdev->device, q); - mutex_unlock(&matrix_dev->lock); + vfio_ap_mdev_put_qlocks(guest); return 0; } void vfio_ap_mdev_remove_queue(struct ap_device *apdev) { + struct ap_guest *guest; struct vfio_ap_queue *q; + int apqn = to_ap_queue(&apdev->device)->qid; - mutex_lock(&matrix_dev->lock); + guest = vfio_ap_mdev_get_qlocks(apqn); q = dev_get_drvdata(&apdev->device); - if (q->matrix_mdev) + if (q->matrix_mdev) { vfio_ap_unlink_queue_fr_mdev(q); + if (vfio_ap_mdev_filter_matrix(q->matrix_mdev)) + vfio_ap_mdev_hotplug_apcb(q->matrix_mdev); + } vfio_ap_mdev_reset_queue(q, 1); dev_set_drvdata(&apdev->device, NULL); + vfio_ap_mdev_put_qlocks(guest); kfree(q); - mutex_unlock(&matrix_dev->lock); } diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h index 6d28b287d7bf..0e825ffbd0cc 100644 --- a/drivers/s390/crypto/vfio_ap_private.h +++ b/drivers/s390/crypto/vfio_ap_private.h @@ -28,6 +28,7 @@ struct ap_guest { struct kvm *kvm; + struct ap_matrix_mdev *matrix_mdev; struct list_head node; }; @@ -106,11 +107,9 @@ struct ap_queue_table { * handling the VFIO_GROUP_NOTIFY_SET_KVM event * @iommu_notifier: notifier block used for specifying callback function for * handling the VFIO_IOMMU_NOTIFY_DMA_UNMAP even - * @kvm: the struct holding guest's state - * @pqap_hook: the function pointer to the interception handler for the - * PQAP(AQIC) instruction. * @mdev: the mediated device * @qtable: table of queues (struct vfio_ap_queue) assigned to the mdev + * @guest: the KVM guest using the matrix mdev */ struct ap_matrix_mdev { struct vfio_device vdev; @@ -119,9 +118,9 @@ struct ap_matrix_mdev { struct ap_matrix shadow_apcb; struct notifier_block group_notifier; struct notifier_block iommu_notifier; - struct kvm *kvm; struct mdev_device *mdev; struct ap_queue_table qtable; + struct ap_guest *guest; }; /** From patchwork Thu Oct 21 15:23:27 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anthony Krowiak X-Patchwork-Id: 12575575 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5F395C433FE for ; Thu, 21 Oct 2021 15:24:25 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4DD806128B for ; Thu, 21 Oct 2021 15:24:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232067AbhJUP0j (ORCPT ); Thu, 21 Oct 2021 11:26:39 -0400 Received: from mx0a-001b2d01.pphosted.com ([148.163.156.1]:26782 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231984AbhJUP0c (ORCPT ); Thu, 21 Oct 2021 11:26:32 -0400 Received: from pps.filterd (m0187473.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 19LFN031013708; Thu, 21 Oct 2021 11:24:15 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pp1; bh=sZHANLSp089XwmJ39HHGPasnOPwHzdyAISXjYIE4RsI=; b=moOblz9INYaFHXuabe1gALkjiJU0+boyP15YkqU5L+dxK/KoTiUObJSov9HMlJcyilra qI/qoowHqyyFcuNvmV1PGa1pcOwC7bRoAbWPkRCw4COWTQKwUKmbqiRRRNnDm9cr0wjL sQOdbIlm6I0sPaC4C4qiQW3Lst9lXZ9AjUlm8VU70E6ZhPLNdSGHHtkqjJwjs61dXKLf MsArTRYTeywTLYciIXc3GFpwyl6Frjm3CfpZfAosiXs4XqAruB+fwlECHDSmzcKuc+nP b5NwVimY84KXEdvwGcs9pVUzao+7OaA2lO4H2CYJUXsraoeyKVjh4ki41vaapsLkdeyB tg== Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com with ESMTP id 3buat480tk-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 21 Oct 2021 11:24:15 -0400 Received: from m0187473.ppops.net (m0187473.ppops.net [127.0.0.1]) by pps.reinject (8.16.0.43/8.16.0.43) with SMTP id 19LFOAGB016063; Thu, 21 Oct 2021 11:24:14 -0400 Received: from ppma01dal.us.ibm.com (83.d6.3fa9.ip4.static.sl-reverse.com [169.63.214.131]) by mx0a-001b2d01.pphosted.com with ESMTP id 3buat480t6-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 21 Oct 2021 11:24:14 -0400 Received: from pps.filterd (ppma01dal.us.ibm.com [127.0.0.1]) by ppma01dal.us.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 19LF2LmB000337; Thu, 21 Oct 2021 15:24:13 GMT Received: from b03cxnp08025.gho.boulder.ibm.com (b03cxnp08025.gho.boulder.ibm.com [9.17.130.17]) by ppma01dal.us.ibm.com with ESMTP id 3bqpcdr763-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 21 Oct 2021 15:24:13 +0000 Received: from b03ledav005.gho.boulder.ibm.com (b03ledav005.gho.boulder.ibm.com [9.17.130.236]) by b03cxnp08025.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 19LFOBN629032708 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 21 Oct 2021 15:24:11 GMT Received: from b03ledav005.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 2B4EFBE065; Thu, 21 Oct 2021 15:24:11 +0000 (GMT) Received: from b03ledav005.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 5AC5BBE082; Thu, 21 Oct 2021 15:24:09 +0000 (GMT) Received: from cpe-172-100-181-211.stny.res.rr.com.com (unknown [9.160.98.118]) by b03ledav005.gho.boulder.ibm.com (Postfix) with ESMTP; Thu, 21 Oct 2021 15:24:09 +0000 (GMT) From: Tony Krowiak To: linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: jjherne@linux.ibm.com, freude@linux.ibm.com, borntraeger@de.ibm.com, cohuck@redhat.com, mjrosato@linux.ibm.com, pasic@linux.ibm.com, alex.williamson@redhat.com, kwankhede@nvidia.com, fiuczy@linux.ibm.com, Tony Krowiak Subject: [PATCH v17 10/15] s390/vfio-ap: reset queues after adapter/domain unassignment Date: Thu, 21 Oct 2021 11:23:27 -0400 Message-Id: <20211021152332.70455-11-akrowiak@linux.ibm.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20211021152332.70455-1-akrowiak@linux.ibm.com> References: <20211021152332.70455-1-akrowiak@linux.ibm.com> MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-GUID: CBBaJuzkiGzdd1ylRBLK61omss01zYsi X-Proofpoint-ORIG-GUID: b-wI5zaAzor42pd-UGuGQsjVHM7DXUPU X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.182.1,Aquarius:18.0.790,Hydra:6.0.425,FMLib:17.0.607.475 definitions=2021-10-21_04,2021-10-21_02,2020-04-07_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 malwarescore=0 adultscore=0 priorityscore=1501 mlxscore=0 spamscore=0 clxscore=1015 phishscore=0 mlxlogscore=999 bulkscore=0 suspectscore=0 lowpriorityscore=0 impostorscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2109230001 definitions=main-2110210079 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org When an adapter or domain is unassigned from an mdev providing the AP configuration to a running KVM guest, one or more of the guest's queues may get dynamically removed. Since the removed queues could get re-assigned to another mdev, they need to be reset. So, when an adapter or domain is unassigned from the mdev, the queues that are removed from the guest's AP configuration will be reset. Signed-off-by: Tony Krowiak --- drivers/s390/crypto/vfio_ap_ops.c | 136 +++++++++++++++++++------- drivers/s390/crypto/vfio_ap_private.h | 2 + 2 files changed, 100 insertions(+), 38 deletions(-) diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c index 5a484e7afbd0..6b292ed30ada 100644 --- a/drivers/s390/crypto/vfio_ap_ops.c +++ b/drivers/s390/crypto/vfio_ap_ops.c @@ -24,9 +24,10 @@ #define VFIO_AP_MDEV_TYPE_HWVIRT "passthrough" #define VFIO_AP_MDEV_NAME_HWVIRT "VFIO AP Passthrough Device" -static int vfio_ap_mdev_reset_queues(struct ap_matrix_mdev *matrix_mdev); +static int vfio_ap_mdev_reset_queues(struct ap_queue_table *qtable); static struct vfio_ap_queue *vfio_ap_find_queue(int apqn); static const struct vfio_device_ops vfio_ap_matrix_dev_ops; +static int vfio_ap_mdev_reset_queue(struct vfio_ap_queue *q, unsigned int retry); /** * vfio_ap_mdev_get_queue - retrieve a queue with a specific APQN from a @@ -352,6 +353,7 @@ static bool vfio_ap_mdev_filter_matrix(struct ap_matrix_mdev *matrix_mdev) unsigned long apid, apqi, apqn; DECLARE_BITMAP(shadow_apm, AP_DEVICES); DECLARE_BITMAP(shadow_aqm, AP_DOMAINS); + struct vfio_ap_queue *q; ret = ap_qci(&matrix_dev->info); if (ret) @@ -383,7 +385,8 @@ static bool vfio_ap_mdev_filter_matrix(struct ap_matrix_mdev *matrix_mdev) * filter the APQI. */ apqn = AP_MKQID(apid, apqi); - if (!vfio_ap_mdev_get_queue(matrix_mdev, apqn)) { + q = vfio_ap_mdev_get_queue(matrix_mdev, apqn); + if (!q || q->reset_rc) { clear_bit_inv(apid, matrix_mdev->shadow_apcb.apm); break; @@ -466,12 +469,6 @@ static void vfio_ap_unlink_mdev_fr_queue(struct vfio_ap_queue *q) q->matrix_mdev = NULL; } -static void vfio_ap_mdev_unlink_queue(struct vfio_ap_queue *q) -{ - vfio_ap_unlink_queue_fr_mdev(q); - vfio_ap_unlink_mdev_fr_queue(q); -} - static void vfio_ap_mdev_unlink_fr_queues(struct ap_matrix_mdev *matrix_mdev) { struct vfio_ap_queue *q; @@ -495,7 +492,7 @@ static void vfio_ap_mdev_remove(struct mdev_device *mdev) vfio_unregister_group_dev(&matrix_mdev->vdev); mutex_lock(&matrix_dev->lock); - vfio_ap_mdev_reset_queues(matrix_mdev); + vfio_ap_mdev_reset_queues(&matrix_mdev->qtable); vfio_ap_mdev_unlink_fr_queues(matrix_mdev); list_del(&matrix_mdev->node); mutex_unlock(&matrix_dev->lock); @@ -732,17 +729,59 @@ static ssize_t assign_adapter_store(struct device *dev, } static DEVICE_ATTR_WO(assign_adapter); +static void vfio_ap_unlink_apqn_fr_mdev(struct ap_matrix_mdev *matrix_mdev, + unsigned long apid, unsigned long apqi, + struct ap_queue_table *qtable) +{ + struct vfio_ap_queue *q; + + q = vfio_ap_mdev_get_queue(matrix_mdev, AP_MKQID(apid, apqi)); + /* If the queue is assigned to the matrix mdev, unlink it. */ + if (q) + vfio_ap_unlink_queue_fr_mdev(q); + + /* If the queue is assigned to the APCB, store it in @qtable. */ + if (test_bit_inv(apid, matrix_mdev->shadow_apcb.apm) && + test_bit_inv(apqi, matrix_mdev->shadow_apcb.aqm)) + hash_add(qtable->queues, &q->mdev_qnode, q->apqn); +} + +/** + * vfio_ap_mdev_unlink_adapter - unlink all queues associated with unassigned + * adapter from the matrix mdev to which the + * adapter was assigned. + * @matrix_mdev: the matrix mediated device to which the adapter was assigned. + * @apid: the APID of the unassigned adapter. + * @qtable: table for storing queues associated with unassigned adapter. + */ static void vfio_ap_mdev_unlink_adapter(struct ap_matrix_mdev *matrix_mdev, - unsigned long apid) + unsigned long apid, + struct ap_queue_table *qtable) { unsigned long apqi; + + for_each_set_bit_inv(apqi, matrix_mdev->matrix.aqm, AP_DOMAINS) + vfio_ap_unlink_apqn_fr_mdev(matrix_mdev, apid, apqi, qtable); +} + +static void vfio_ap_mdev_hot_unplug_adapter(struct ap_matrix_mdev *matrix_mdev, + unsigned long apid) +{ + int bkt; struct vfio_ap_queue *q; + struct ap_queue_table qtable; + + hash_init(qtable.queues); + vfio_ap_mdev_unlink_adapter(matrix_mdev, apid, &qtable); - for_each_set_bit_inv(apqi, matrix_mdev->matrix.aqm, AP_DOMAINS) { - q = vfio_ap_mdev_get_queue(matrix_mdev, AP_MKQID(apid, apqi)); + if (vfio_ap_mdev_filter_matrix(matrix_mdev)) + vfio_ap_mdev_hotplug_apcb(matrix_mdev); - if (q) - vfio_ap_mdev_unlink_queue(q); + vfio_ap_mdev_reset_queues(&qtable); + + hash_for_each(qtable.queues, bkt, q, mdev_qnode) { + vfio_ap_unlink_mdev_fr_queue(q); + hash_del(&q->mdev_qnode); } } @@ -778,11 +817,7 @@ static ssize_t unassign_adapter_store(struct device *dev, vfio_ap_mdev_get_locks(matrix_mdev); clear_bit_inv((unsigned long)apid, matrix_mdev->matrix.apm); - vfio_ap_mdev_unlink_adapter(matrix_mdev, apid); - - if (vfio_ap_mdev_filter_matrix(matrix_mdev)) - vfio_ap_mdev_hotplug_apcb(matrix_mdev); - + vfio_ap_mdev_hot_unplug_adapter(matrix_mdev, apid); vfio_ap_mdev_put_locks(matrix_mdev); return count; @@ -867,16 +902,33 @@ static ssize_t assign_domain_store(struct device *dev, static DEVICE_ATTR_WO(assign_domain); static void vfio_ap_mdev_unlink_domain(struct ap_matrix_mdev *matrix_mdev, - unsigned long apqi) + unsigned long apqi, + struct ap_queue_table *qtable) { unsigned long apid; + + for_each_set_bit_inv(apid, matrix_mdev->matrix.apm, AP_DEVICES) + vfio_ap_unlink_apqn_fr_mdev(matrix_mdev, apid, apqi, qtable); +} + +static void vfio_ap_mdev_hot_unplug_domain(struct ap_matrix_mdev *matrix_mdev, + unsigned long apqi) +{ + int bkt; struct vfio_ap_queue *q; + struct ap_queue_table qtable; - for_each_set_bit_inv(apid, matrix_mdev->matrix.apm, AP_DEVICES) { - q = vfio_ap_mdev_get_queue(matrix_mdev, AP_MKQID(apid, apqi)); + hash_init(qtable.queues); + vfio_ap_mdev_unlink_domain(matrix_mdev, apqi, &qtable); + + if (vfio_ap_mdev_filter_matrix(matrix_mdev)) + vfio_ap_mdev_hotplug_apcb(matrix_mdev); + + vfio_ap_mdev_reset_queues(&qtable); - if (q) - vfio_ap_mdev_unlink_queue(q); + hash_for_each(qtable.queues, bkt, q, mdev_qnode) { + vfio_ap_unlink_mdev_fr_queue(q); + hash_del(&q->mdev_qnode); } } @@ -912,11 +964,7 @@ static ssize_t unassign_domain_store(struct device *dev, vfio_ap_mdev_get_locks(matrix_mdev); clear_bit_inv((unsigned long)apqi, matrix_mdev->matrix.aqm); - vfio_ap_mdev_unlink_domain(matrix_mdev, apqi); - - if (vfio_ap_mdev_filter_matrix(matrix_mdev)) - vfio_ap_mdev_hotplug_apcb(matrix_mdev); - + vfio_ap_mdev_hot_unplug_domain(matrix_mdev, apqi); vfio_ap_mdev_put_locks(matrix_mdev); return count; @@ -1243,7 +1291,7 @@ static void vfio_ap_mdev_unset_kvm(struct ap_matrix_mdev *matrix_mdev) mutex_lock(&matrix_dev->lock); kvm_arch_crypto_clear_masks(matrix_mdev->guest->kvm); - vfio_ap_mdev_reset_queues(matrix_mdev); + vfio_ap_mdev_reset_queues(&matrix_mdev->qtable); matrix_mdev->guest->kvm->arch.crypto.data = NULL; kvm_put_kvm(matrix_mdev->guest->kvm); @@ -1299,12 +1347,14 @@ static int vfio_ap_mdev_reset_queue(struct vfio_ap_queue *q, unsigned int retry) if (!q) return 0; + q->reset_rc = 0; retry_zapq: status = ap_zapq(q->apqn); switch (status.response_code) { case AP_RESPONSE_NORMAL: ret = 0; + q->reset_rc = status.response_code; break; case AP_RESPONSE_RESET_IN_PROGRESS: if (retry--) { @@ -1316,13 +1366,20 @@ static int vfio_ap_mdev_reset_queue(struct vfio_ap_queue *q, unsigned int retry) case AP_RESPONSE_Q_NOT_AVAIL: case AP_RESPONSE_DECONFIGURED: case AP_RESPONSE_CHECKSTOPPED: - WARN_ON_ONCE(status.irq_enabled); + WARN_ONCE(status.irq_enabled, + "PQAP/ZAPQ for %02x.%04x failed with rc=%u while IRQ enabled", + AP_QID_CARD(q->apqn), AP_QID_QUEUE(q->apqn), + status.response_code); + q->reset_rc = status.response_code; ret = -EBUSY; goto free_resources; default: /* things are really broken, give up */ - WARN(true, "PQAP/ZAPQ completed with invalid rc (%x)\n", + WARN(true, + "PQAP/ZAPQ for %02x.%04x failed with invalid rc=%u\n", + AP_QID_CARD(q->apqn), AP_QID_QUEUE(q->apqn), status.response_code); + q->reset_rc = status.response_code; return -EIO; } @@ -1333,7 +1390,8 @@ static int vfio_ap_mdev_reset_queue(struct vfio_ap_queue *q, unsigned int retry) msleep(20); status = ap_tapq(q->apqn, NULL); } - WARN_ON_ONCE(retry2 <= 0); + WARN_ONCE(retry2 <= 0, "unable to verify reset of queue %02x.%04x", + AP_QID_CARD(q->apqn), AP_QID_QUEUE(q->apqn)); free_resources: vfio_ap_free_aqic_resources(q); @@ -1341,20 +1399,22 @@ static int vfio_ap_mdev_reset_queue(struct vfio_ap_queue *q, unsigned int retry) return ret; } -static int vfio_ap_mdev_reset_queues(struct ap_matrix_mdev *matrix_mdev) +static int vfio_ap_mdev_reset_queues(struct ap_queue_table *qtable) { - int ret, bkt, rc = 0; + int rc = 0, ret, bkt; struct vfio_ap_queue *q; - hash_for_each(matrix_mdev->qtable.queues, bkt, q, mdev_qnode) { + hash_for_each(qtable->queues, bkt, q, mdev_qnode) { ret = vfio_ap_mdev_reset_queue(q, 1); /* * Regardless whether a queue turns out to be busy, or * is not operational, we need to continue resetting * the remaining queues. */ - if (ret) + if (ret) { rc = ret; + q->reset_rc = ret; + } } return rc; @@ -1434,7 +1494,7 @@ static ssize_t vfio_ap_mdev_ioctl(struct vfio_device *vdev, ret = vfio_ap_mdev_get_device_info(arg); break; case VFIO_DEVICE_RESET: - ret = vfio_ap_mdev_reset_queues(matrix_mdev); + ret = vfio_ap_mdev_reset_queues(&matrix_mdev->qtable); break; default: ret = -EOPNOTSUPP; diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h index 0e825ffbd0cc..5d59bba8b153 100644 --- a/drivers/s390/crypto/vfio_ap_private.h +++ b/drivers/s390/crypto/vfio_ap_private.h @@ -131,6 +131,7 @@ struct ap_matrix_mdev { * @apqn: the APQN of the AP queue device * @saved_isc: the guest ISC registered with the GIB interface * @mdev_qnode: allows the vfio_ap_queue struct to be added to a hashtable + * @reset_rc: the status response code from the last reset of the queue */ struct vfio_ap_queue { struct ap_matrix_mdev *matrix_mdev; @@ -139,6 +140,7 @@ struct vfio_ap_queue { #define VFIO_AP_ISC_INVALID 0xff unsigned char saved_isc; struct hlist_node mdev_qnode; + unsigned int reset_rc; }; int vfio_ap_mdev_register(void); From patchwork Thu Oct 21 15:23:28 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anthony Krowiak X-Patchwork-Id: 12575577 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5B28AC433EF for ; Thu, 21 Oct 2021 15:24:34 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4424F60C49 for ; Thu, 21 Oct 2021 15:24:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232238AbhJUP0p (ORCPT ); Thu, 21 Oct 2021 11:26:45 -0400 Received: from mx0b-001b2d01.pphosted.com ([148.163.158.5]:12928 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S232034AbhJUP0e (ORCPT ); Thu, 21 Oct 2021 11:26:34 -0400 Received: from pps.filterd (m0098413.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 19LEBLXr023564; Thu, 21 Oct 2021 11:24:16 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pp1; bh=nVtDN4FBRgu+wtOuL4VgwJyxXWxxSGHB4qoSu0HMtfQ=; b=NEw0xRCjPzayTwv/fPdY0AvmbPQ4lrEPaLWEd2DNfbIk4ZSBx/rj+tSMcYjU++W+96zt eVVzeHDZtrKciquYFZdOMi0NnEwqlxZFGiaB9znK9cea6qiy+gCJalRF8WQ8f5RoDSi2 36nhtxI9IOC89GbLS4WZdYghETYAUD2M9ahC/WrY++ym7tTSlTKlQLE54q7TmCmKR3AS qTeboPovuZnHLyMm7oraikXrUD6rAXr2qFNAwMDilkWTfVbltSgrlEuByRxlWRQ8AaNa /MjXGIj2fPTYofD0eEDwX0AguBq7oOlZFEbnBq0z0spLl/zo0B8uT1h3VdmJp5yIE2Vn 6Q== Received: from pps.reinject (localhost [127.0.0.1]) by mx0b-001b2d01.pphosted.com with ESMTP id 3bu85pm921-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 21 Oct 2021 11:24:16 -0400 Received: from m0098413.ppops.net (m0098413.ppops.net [127.0.0.1]) by pps.reinject (8.16.0.43/8.16.0.43) with SMTP id 19LEBMrb023679; Thu, 21 Oct 2021 11:24:15 -0400 Received: from ppma05wdc.us.ibm.com (1b.90.2fa9.ip4.static.sl-reverse.com [169.47.144.27]) by mx0b-001b2d01.pphosted.com with ESMTP id 3bu85pm91p-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 21 Oct 2021 11:24:15 -0400 Received: from pps.filterd (ppma05wdc.us.ibm.com [127.0.0.1]) by ppma05wdc.us.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 19LF49lD028056; Thu, 21 Oct 2021 15:24:15 GMT Received: from b03cxnp07027.gho.boulder.ibm.com (b03cxnp07027.gho.boulder.ibm.com [9.17.130.14]) by ppma05wdc.us.ibm.com with ESMTP id 3bqpccyqeb-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 21 Oct 2021 15:24:15 +0000 Received: from b03ledav005.gho.boulder.ibm.com (b03ledav005.gho.boulder.ibm.com [9.17.130.236]) by b03cxnp07027.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 19LFOD8834603484 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 21 Oct 2021 15:24:13 GMT Received: from b03ledav005.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 9403ABE08E; Thu, 21 Oct 2021 15:24:13 +0000 (GMT) Received: from b03ledav005.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 6690CBE084; Thu, 21 Oct 2021 15:24:11 +0000 (GMT) Received: from cpe-172-100-181-211.stny.res.rr.com.com (unknown [9.160.98.118]) by b03ledav005.gho.boulder.ibm.com (Postfix) with ESMTP; Thu, 21 Oct 2021 15:24:11 +0000 (GMT) From: Tony Krowiak To: linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: jjherne@linux.ibm.com, freude@linux.ibm.com, borntraeger@de.ibm.com, cohuck@redhat.com, mjrosato@linux.ibm.com, pasic@linux.ibm.com, alex.williamson@redhat.com, kwankhede@nvidia.com, fiuczy@linux.ibm.com, Tony Krowiak Subject: [PATCH v17 11/15] s390/ap: driver callback to indicate resource in use Date: Thu, 21 Oct 2021 11:23:28 -0400 Message-Id: <20211021152332.70455-12-akrowiak@linux.ibm.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20211021152332.70455-1-akrowiak@linux.ibm.com> References: <20211021152332.70455-1-akrowiak@linux.ibm.com> MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: 5neirPk4Jwhm43LmxkvkSj0h1mWiPfWD X-Proofpoint-GUID: oh7ImjwGjbpPLJY3htkky9vkN_RZpaQs X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.182.1,Aquarius:18.0.790,Hydra:6.0.425,FMLib:17.0.607.475 definitions=2021-10-21_04,2021-10-21_02,2020-04-07_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 impostorscore=0 bulkscore=0 clxscore=1015 priorityscore=1501 suspectscore=0 phishscore=0 lowpriorityscore=0 malwarescore=0 spamscore=0 adultscore=0 mlxlogscore=999 mlxscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2109230001 definitions=main-2110210079 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Introduces a new driver callback to prevent a root user from re-assigning the APQN of a queue that is in use by a non-default host device driver to a default host device driver and vice versa. The callback will be invoked whenever a change to the AP bus's sysfs apmask or aqmask attributes would result in one or more APQNs being re-assigned. If the callback responds in the affirmative for any driver queried, the change to the apmask or aqmask will be rejected with a device busy error. For this patch, only non-default drivers will be queried. Currently, there is only one non-default driver, the vfio_ap device driver. The vfio_ap device driver facilitates pass-through of an AP queue to a guest. The idea here is that a guest may be administered by a different sysadmin than the host and we don't want AP resources to unexpectedly disappear from a guest's AP configuration (i.e., adapters and domains assigned to the matrix mdev). This will enforce the proper procedure for removing AP resources intended for guest usage which is to first unassign them from the matrix mdev, then unbind them from the vfio_ap device driver. Signed-off-by: Tony Krowiak Reviewed-by: Harald Freudenberger Reviewed-by: Halil Pasic --- drivers/s390/crypto/ap_bus.c | 160 ++++++++++++++++++++++++++++++++--- drivers/s390/crypto/ap_bus.h | 4 + 2 files changed, 154 insertions(+), 10 deletions(-) diff --git a/drivers/s390/crypto/ap_bus.c b/drivers/s390/crypto/ap_bus.c index d9b804943d19..15886610f61a 100644 --- a/drivers/s390/crypto/ap_bus.c +++ b/drivers/s390/crypto/ap_bus.c @@ -36,6 +36,7 @@ #include #include #include +#include #include "ap_bus.h" #include "ap_debug.h" @@ -1060,6 +1061,23 @@ static int modify_bitmap(const char *str, unsigned long *bitmap, int bits) return 0; } +static int ap_parse_bitmap_str(const char *str, unsigned long *bitmap, int bits, + unsigned long *newmap) +{ + unsigned long size; + int rc; + + size = BITS_TO_LONGS(bits) * sizeof(unsigned long); + if (*str == '+' || *str == '-') { + memcpy(newmap, bitmap, size); + rc = modify_bitmap(str, newmap, bits); + } else { + memset(newmap, 0, size); + rc = hex2bitmap(str, newmap, bits); + } + return rc; +} + int ap_parse_mask_str(const char *str, unsigned long *bitmap, int bits, struct mutex *lock) @@ -1079,14 +1097,7 @@ int ap_parse_mask_str(const char *str, kfree(newmap); return -ERESTARTSYS; } - - if (*str == '+' || *str == '-') { - memcpy(newmap, bitmap, size); - rc = modify_bitmap(str, newmap, bits); - } else { - memset(newmap, 0, size); - rc = hex2bitmap(str, newmap, bits); - } + rc = ap_parse_bitmap_str(str, bitmap, bits, newmap); if (rc == 0) memcpy(bitmap, newmap, size); mutex_unlock(lock); @@ -1278,12 +1289,76 @@ static ssize_t apmask_show(struct bus_type *bus, char *buf) return rc; } +static int __verify_card_reservations(struct device_driver *drv, void *data) +{ + int rc = 0; + struct ap_driver *ap_drv = to_ap_drv(drv); + unsigned long *newapm = (unsigned long *)data; + + /* + * No need to verify whether the driver is using the queues if it is the + * default driver. + */ + if (ap_drv->flags & AP_DRIVER_FLAG_DEFAULT) + return 0; + + /* + * increase the driver's module refcounter to be sure it is not + * going away when we invoke the callback function. + */ + if (!try_module_get(drv->owner)) + return 0; + + if (ap_drv->in_use) { + rc = ap_drv->in_use(newapm, ap_perms.aqm); + if (rc) + rc = -EBUSY; + } + + /* release the driver's module */ + module_put(drv->owner); + + return rc; +} + +static int apmask_commit(unsigned long *newapm) +{ + int rc; + unsigned long reserved[BITS_TO_LONGS(AP_DEVICES)]; + + /* + * Check if any bits in the apmask have been set which will + * result in queues being removed from non-default drivers + */ + if (bitmap_andnot(reserved, newapm, ap_perms.apm, AP_DEVICES)) { + rc = bus_for_each_drv(&ap_bus_type, NULL, reserved, + __verify_card_reservations); + if (rc) + return rc; + } + + memcpy(ap_perms.apm, newapm, APMASKSIZE); + + return 0; +} + static ssize_t apmask_store(struct bus_type *bus, const char *buf, size_t count) { int rc; + DECLARE_BITMAP(newapm, AP_DEVICES); - rc = ap_parse_mask_str(buf, ap_perms.apm, AP_DEVICES, &ap_perms_mutex); + if (mutex_lock_interruptible(&ap_perms_mutex)) + return -ERESTARTSYS; + + rc = ap_parse_bitmap_str(buf, ap_perms.apm, AP_DEVICES, newapm); + if (rc) + goto done; + + rc = apmask_commit(newapm); + +done: + mutex_unlock(&ap_perms_mutex); if (rc) return rc; @@ -1309,12 +1384,77 @@ static ssize_t aqmask_show(struct bus_type *bus, char *buf) return rc; } +static int __verify_queue_reservations(struct device_driver *drv, void *data) +{ + int rc = 0; + struct ap_driver *ap_drv = to_ap_drv(drv); + unsigned long *newaqm = (unsigned long *)data; + + /* + * If the reserved bits do not identify queues reserved for use by the + * non-default driver, there is no need to verify the driver is using + * the queues. + */ + if (ap_drv->flags & AP_DRIVER_FLAG_DEFAULT) + return 0; + + /* + * increase the driver's module refcounter to be sure it is not + * going away when we invoke the callback function. + */ + if (!try_module_get(drv->owner)) + return 0; + + if (ap_drv->in_use) { + rc = ap_drv->in_use(ap_perms.apm, newaqm); + if (rc) + return -EBUSY; + } + + /* release the driver's module */ + module_put(drv->owner); + + return rc; +} + +static int aqmask_commit(unsigned long *newaqm) +{ + int rc; + unsigned long reserved[BITS_TO_LONGS(AP_DOMAINS)]; + + /* + * Check if any bits in the aqmask have been set which will + * result in queues being removed from non-default drivers + */ + if (bitmap_andnot(reserved, newaqm, ap_perms.aqm, AP_DOMAINS)) { + rc = bus_for_each_drv(&ap_bus_type, NULL, reserved, + __verify_queue_reservations); + if (rc) + return rc; + } + + memcpy(ap_perms.aqm, newaqm, AQMASKSIZE); + + return 0; +} + static ssize_t aqmask_store(struct bus_type *bus, const char *buf, size_t count) { int rc; + DECLARE_BITMAP(newaqm, AP_DOMAINS); + + if (mutex_lock_interruptible(&ap_perms_mutex)) + return -ERESTARTSYS; + + rc = ap_parse_bitmap_str(buf, ap_perms.aqm, AP_DOMAINS, newaqm); + if (rc) + goto done; + + rc = aqmask_commit(newaqm); - rc = ap_parse_mask_str(buf, ap_perms.aqm, AP_DOMAINS, &ap_perms_mutex); +done: + mutex_unlock(&ap_perms_mutex); if (rc) return rc; diff --git a/drivers/s390/crypto/ap_bus.h b/drivers/s390/crypto/ap_bus.h index 95b577754b35..67c1bef60ad5 100644 --- a/drivers/s390/crypto/ap_bus.h +++ b/drivers/s390/crypto/ap_bus.h @@ -142,6 +142,7 @@ struct ap_driver { int (*probe)(struct ap_device *); void (*remove)(struct ap_device *); + int (*in_use)(unsigned long *apm, unsigned long *aqm); }; #define to_ap_drv(x) container_of((x), struct ap_driver, driver) @@ -289,6 +290,9 @@ void ap_queue_init_state(struct ap_queue *aq); struct ap_card *ap_card_create(int id, int queue_depth, int raw_type, int comp_type, unsigned int functions, int ml); +#define APMASKSIZE (BITS_TO_LONGS(AP_DEVICES) * sizeof(unsigned long)) +#define AQMASKSIZE (BITS_TO_LONGS(AP_DOMAINS) * sizeof(unsigned long)) + struct ap_perms { unsigned long ioctlm[BITS_TO_LONGS(AP_IOCTLS)]; unsigned long apm[BITS_TO_LONGS(AP_DEVICES)]; From patchwork Thu Oct 21 15:23:29 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anthony Krowiak X-Patchwork-Id: 12575579 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7A689C433FE for ; Thu, 21 Oct 2021 15:24:39 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 63FA2611C7 for ; Thu, 21 Oct 2021 15:24:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232082AbhJUP0y (ORCPT ); Thu, 21 Oct 2021 11:26:54 -0400 Received: from mx0b-001b2d01.pphosted.com ([148.163.158.5]:31774 "EHLO mx0b-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232134AbhJUP0g (ORCPT ); Thu, 21 Oct 2021 11:26:36 -0400 Received: from pps.filterd (m0098417.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 19LFI2bS008882; Thu, 21 Oct 2021 11:24:18 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pp1; bh=SfyWnHUABVEOF5NOrQQSuAsd1mYShw1katQzKV1fRkk=; b=RM2uBB91q//iXR6v8pnXaaxcK57+4L4pIhD4gHk+B50PhG0VogTccHk+PLlnXUg7TOMB ConhXJN+NvVVPAPoGchMESWkyUz27b7qjBndJQkKM7zc31CF/LMcHGkRKHz4GLVwOtwB jdtxl8EDvF8+OcSFNqNmWhPxqe5MdGDq5gD8sfWxQToe/8Klscmd6rMQVIMqqIyXZ5L/ iTww8iebFMpFRKbRYPVieUXYo7bCncvbk6YD39XKZ1pd75MAwoBDWL7a+BPUbyegxIqq TbN1bNc9HsLcvmQ/scYg6Q5zjk+M2mfs3kPJNDN0lHcicYK6IT9YU3vCxCGgdLmqmtYK aA== Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com with ESMTP id 3buaqs039g-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 21 Oct 2021 11:24:18 -0400 Received: from m0098417.ppops.net (m0098417.ppops.net [127.0.0.1]) by pps.reinject (8.16.0.43/8.16.0.43) with SMTP id 19LFJdbT012252; Thu, 21 Oct 2021 11:24:18 -0400 Received: from ppma01wdc.us.ibm.com (fd.55.37a9.ip4.static.sl-reverse.com [169.55.85.253]) by mx0a-001b2d01.pphosted.com with ESMTP id 3buaqs038x-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 21 Oct 2021 11:24:17 -0400 Received: from pps.filterd (ppma01wdc.us.ibm.com [127.0.0.1]) by ppma01wdc.us.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 19LF42dv009593; Thu, 21 Oct 2021 15:24:17 GMT Received: from b03cxnp08028.gho.boulder.ibm.com (b03cxnp08028.gho.boulder.ibm.com [9.17.130.20]) by ppma01wdc.us.ibm.com with ESMTP id 3bqpccfquc-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 21 Oct 2021 15:24:17 +0000 Received: from b03ledav005.gho.boulder.ibm.com (b03ledav005.gho.boulder.ibm.com [9.17.130.236]) by b03cxnp08028.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 19LFOF8s28246354 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 21 Oct 2021 15:24:15 GMT Received: from b03ledav005.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 8EC1ABE05B; Thu, 21 Oct 2021 15:24:15 +0000 (GMT) Received: from b03ledav005.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 2364BBE065; Thu, 21 Oct 2021 15:24:14 +0000 (GMT) Received: from cpe-172-100-181-211.stny.res.rr.com.com (unknown [9.160.98.118]) by b03ledav005.gho.boulder.ibm.com (Postfix) with ESMTP; Thu, 21 Oct 2021 15:24:13 +0000 (GMT) From: Tony Krowiak To: linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: jjherne@linux.ibm.com, freude@linux.ibm.com, borntraeger@de.ibm.com, cohuck@redhat.com, mjrosato@linux.ibm.com, pasic@linux.ibm.com, alex.williamson@redhat.com, kwankhede@nvidia.com, fiuczy@linux.ibm.com, Tony Krowiak Subject: [PATCH v17 12/15] s390/vfio-ap: implement in-use callback for vfio_ap driver Date: Thu, 21 Oct 2021 11:23:29 -0400 Message-Id: <20211021152332.70455-13-akrowiak@linux.ibm.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20211021152332.70455-1-akrowiak@linux.ibm.com> References: <20211021152332.70455-1-akrowiak@linux.ibm.com> MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-GUID: H5jqdUV8NPjUCHnVyKNLfdOszkQYh4yk X-Proofpoint-ORIG-GUID: mzmWfLPrkS_jGuouOC5ABhbpeOVXmnij X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.182.1,Aquarius:18.0.790,Hydra:6.0.425,FMLib:17.0.607.475 definitions=2021-10-21_04,2021-10-21_02,2020-04-07_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 priorityscore=1501 phishscore=0 clxscore=1015 lowpriorityscore=0 bulkscore=0 mlxlogscore=999 impostorscore=0 mlxscore=0 suspectscore=0 adultscore=0 spamscore=0 malwarescore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2109230001 definitions=main-2110210079 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Let's implement the callback to indicate when an APQN is in use by the vfio_ap device driver. The callback is invoked whenever a change to the apmask or aqmask would result in one or more queue devices being removed from the driver. The vfio_ap device driver will indicate a resource is in use if the APQN of any of the queue devices to be removed are assigned to any of the matrix mdevs under the driver's control. There is potential for a deadlock condition between the matrix_dev->lock used to lock the matrix device during assignment of adapters and domains and the ap_perms_mutex locked by the AP bus when changes are made to the sysfs apmask/aqmask attributes. Consider following scenario (courtesy of Halil Pasic): 1) apmask_store() takes ap_perms_mutex 2) assign_adapter_store() takes matrix_dev->lock 3) apmask_store() calls vfio_ap_mdev_resource_in_use() which tries to take matrix_dev->lock 4) assign_adapter_store() calls ap_apqn_in_matrix_owned_by_def_drv which tries to take ap_perms_mutex BANG! To resolve this issue, instead of using the mutex_lock(&matrix_dev->lock) function to lock the matrix device during assignment of an adapter or domain to a matrix_mdev as well as during the in_use callback, the mutex_trylock(&matrix_dev->lock) function will be used. If the lock is not obtained, then the assignment and in_use functions will terminate with -EAGAIN. Signed-off-by: Tony Krowiak --- drivers/s390/crypto/vfio_ap_drv.c | 1 + drivers/s390/crypto/vfio_ap_ops.c | 80 ++++++++++++++++++++++++--- drivers/s390/crypto/vfio_ap_private.h | 2 + 3 files changed, 74 insertions(+), 9 deletions(-) diff --git a/drivers/s390/crypto/vfio_ap_drv.c b/drivers/s390/crypto/vfio_ap_drv.c index 1d1746fe50ea..df7528dcf6ed 100644 --- a/drivers/s390/crypto/vfio_ap_drv.c +++ b/drivers/s390/crypto/vfio_ap_drv.c @@ -44,6 +44,7 @@ MODULE_DEVICE_TABLE(vfio_ap, ap_queue_ids); static struct ap_driver vfio_ap_drv = { .probe = vfio_ap_mdev_probe_queue, .remove = vfio_ap_mdev_remove_queue, + .in_use = vfio_ap_mdev_resource_in_use, .ids = ap_queue_ids, }; diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c index 6b292ed30ada..5386b8635bec 100644 --- a/drivers/s390/crypto/vfio_ap_ops.c +++ b/drivers/s390/crypto/vfio_ap_ops.c @@ -635,16 +635,45 @@ static void vfio_ap_mdev_link_adapter(struct ap_matrix_mdev *matrix_mdev, * vfio_ap_mdev_get_locks - lock the kvm->lock and matrix_dev->lock mutexes * * @matrix_mdev: the matrix mediated device object + * @check_mdev_lock: indicates whether to check that the matrix_dev->lock mutex + * is already locked (true = check, false = do not check). + * + * Return: + * -EAGAIN if the matrix_dev->lock mutex is already locked. + * 0 if both locks were acquired. */ -static void vfio_ap_mdev_get_locks(struct ap_matrix_mdev *matrix_mdev) +static int vfio_ap_mdev_get_locks(struct ap_matrix_mdev *matrix_mdev, + bool check_mdev_lock) { + /* + * If the matrix_dev->lock mutex is to be checked, then there's no + * sense in proceding if it is already locked. + */ + if (check_mdev_lock && mutex_is_locked(&matrix_dev->lock)) + return -EAGAIN; + down_read(&matrix_dev->guests_lock); /* The kvm->lock must be must be taken before the matrix_dev->lock */ if (matrix_mdev->guest) mutex_lock(&matrix_mdev->guest->kvm->lock); - mutex_lock(&matrix_dev->lock); + /* + * If the matrix_dev-> lock is to be checked, then let's try to acquire + * it. If it can't be acquired, then let's bail out and return + * a value indicating locking should be tried again. + */ + if (check_mdev_lock) { + if (!mutex_trylock(&matrix_dev->lock)) { + mutex_unlock(&matrix_mdev->guest->kvm->lock); + up_read(&matrix_dev->guests_lock); + return -EAGAIN; + } + } else { + mutex_lock(&matrix_dev->lock); + } + + return 0; } /** @@ -654,7 +683,6 @@ static void vfio_ap_mdev_get_locks(struct ap_matrix_mdev *matrix_mdev) */ static void vfio_ap_mdev_put_locks(struct ap_matrix_mdev *matrix_mdev) { - /* The kvm->lock must be must be taken before the matrix_dev->lock */ if (matrix_mdev->guest) mutex_unlock(&matrix_mdev->guest->kvm->lock); @@ -691,6 +719,10 @@ static void vfio_ap_mdev_put_locks(struct ap_matrix_mdev *matrix_mdev) * An APQN derived from the cross product of the APID being assigned * and the APQIs previously assigned is being used by another mediated * matrix device + * + * 5. -EAGAIN + * The mdev lock could not be acquired which is required in order to + * change the AP configuration for the mdev */ static ssize_t assign_adapter_store(struct device *dev, struct device_attribute *attr, @@ -707,7 +739,10 @@ static ssize_t assign_adapter_store(struct device *dev, if (apid > matrix_mdev->matrix.apm_max) return -ENODEV; - vfio_ap_mdev_get_locks(matrix_mdev); + ret = vfio_ap_mdev_get_locks(matrix_mdev, true); + if (ret) + return ret; + set_bit_inv(apid, matrix_mdev->matrix.apm); ret = vfio_ap_mdev_validate_masks(matrix_mdev); @@ -815,7 +850,10 @@ static ssize_t unassign_adapter_store(struct device *dev, if (apid > matrix_mdev->matrix.apm_max) return -ENODEV; - vfio_ap_mdev_get_locks(matrix_mdev); + ret = vfio_ap_mdev_get_locks(matrix_mdev, false); + if (ret) + return ret; + clear_bit_inv((unsigned long)apid, matrix_mdev->matrix.apm); vfio_ap_mdev_hot_unplug_adapter(matrix_mdev, apid); vfio_ap_mdev_put_locks(matrix_mdev); @@ -879,7 +917,10 @@ static ssize_t assign_domain_store(struct device *dev, if (apqi > max_apqi) return -ENODEV; - vfio_ap_mdev_get_locks(matrix_mdev); + ret = vfio_ap_mdev_get_locks(matrix_mdev, true); + if (ret) + return ret; + set_bit_inv(apqi, matrix_mdev->matrix.aqm); ret = vfio_ap_mdev_validate_masks(matrix_mdev); @@ -962,7 +1003,10 @@ static ssize_t unassign_domain_store(struct device *dev, if (apqi > matrix_mdev->matrix.aqm_max) return -ENODEV; - vfio_ap_mdev_get_locks(matrix_mdev); + ret = vfio_ap_mdev_get_locks(matrix_mdev, false); + if (ret) + return ret; + clear_bit_inv((unsigned long)apqi, matrix_mdev->matrix.aqm); vfio_ap_mdev_hot_unplug_domain(matrix_mdev, apqi); vfio_ap_mdev_put_locks(matrix_mdev); @@ -1000,7 +1044,9 @@ static ssize_t assign_control_domain_store(struct device *dev, if (id > matrix_mdev->matrix.adm_max) return -ENODEV; - vfio_ap_mdev_get_locks(matrix_mdev); + ret = vfio_ap_mdev_get_locks(matrix_mdev, false); + if (ret) + return ret; /* Set the bit in the ADM (bitmask) corresponding to the AP control * domain number (id). The bits in the mask, from most significant to @@ -1047,7 +1093,10 @@ static ssize_t unassign_control_domain_store(struct device *dev, if (domid > max_domid) return -ENODEV; - vfio_ap_mdev_get_locks(matrix_mdev); + ret = vfio_ap_mdev_get_locks(matrix_mdev, false); + if (ret) + return ret; + clear_bit_inv(domid, matrix_mdev->matrix.adm); if (vfio_ap_mdev_filter_cdoms(matrix_mdev)) @@ -1681,3 +1730,16 @@ void vfio_ap_mdev_remove_queue(struct ap_device *apdev) vfio_ap_mdev_put_qlocks(guest); kfree(q); } + +int vfio_ap_mdev_resource_in_use(unsigned long *apm, unsigned long *aqm) +{ + int ret; + + if (!mutex_trylock(&matrix_dev->lock)) + return -EBUSY; + + ret = vfio_ap_mdev_verify_no_sharing(apm, aqm); + mutex_unlock(&matrix_dev->lock); + + return ret; +} diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h index 5d59bba8b153..97da41f87c65 100644 --- a/drivers/s390/crypto/vfio_ap_private.h +++ b/drivers/s390/crypto/vfio_ap_private.h @@ -149,4 +149,6 @@ void vfio_ap_mdev_unregister(void); int vfio_ap_mdev_probe_queue(struct ap_device *queue); void vfio_ap_mdev_remove_queue(struct ap_device *queue); +int vfio_ap_mdev_resource_in_use(unsigned long *apm, unsigned long *aqm); + #endif /* _VFIO_AP_PRIVATE_H_ */ From patchwork Thu Oct 21 15:23:30 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anthony Krowiak X-Patchwork-Id: 12575581 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B8FD1C4332F for ; Thu, 21 Oct 2021 15:24:40 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A5470611F2 for ; Thu, 21 Oct 2021 15:24:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231956AbhJUP0z (ORCPT ); Thu, 21 Oct 2021 11:26:55 -0400 Received: from mx0b-001b2d01.pphosted.com ([148.163.158.5]:54182 "EHLO mx0b-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232069AbhJUP0j (ORCPT ); Thu, 21 Oct 2021 11:26:39 -0400 Received: from pps.filterd (m0098421.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 19LFGqnH016438; Thu, 21 Oct 2021 11:24:21 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pp1; bh=wvX4WsbOX06Hhey7g0fAGHdoujc44vtervt3gIIX2aw=; b=qUZCEso/eB8mOythsDKdEIiBcRUHnq2n6kpjMzJsueh/puDa3t3jjEKMT2Hv0ggZBzPH TZHKHHd9qcE/qLLP5EwUTSlCpEdsKUBPsqLQYG/JiLjU4bHFMMXqHBWqdVnfDGt2hv/V 7nRdBWUId6/z6oi1/elxxSMSgw1nSh58K73zvyThyjFroPzmNILRt2qNF3uouHKeeBDY BDDeA/r9jvJHn+P3FsxJKNpJHeaaaiL3VeqHhuAd/V9Ih0HPF4Y7YbFg9Fd6emlb/Foq meHX+Z8wX7p2FewF4EIqx9FlqrD7N2F0yGxm6u22kq8buIAdPqn5zy8quKiUAC3mBSIP Ng== Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com with ESMTP id 3bu9fmtch4-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 21 Oct 2021 11:24:21 -0400 Received: from m0098421.ppops.net (m0098421.ppops.net [127.0.0.1]) by pps.reinject (8.16.0.43/8.16.0.43) with SMTP id 19LFFHYW012236; Thu, 21 Oct 2021 11:24:20 -0400 Received: from ppma02wdc.us.ibm.com (aa.5b.37a9.ip4.static.sl-reverse.com [169.55.91.170]) by mx0a-001b2d01.pphosted.com with ESMTP id 3bu9fmtcgt-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 21 Oct 2021 11:24:20 -0400 Received: from pps.filterd (ppma02wdc.us.ibm.com [127.0.0.1]) by ppma02wdc.us.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 19LF427h018888; Thu, 21 Oct 2021 15:24:20 GMT Received: from b03cxnp08025.gho.boulder.ibm.com (b03cxnp08025.gho.boulder.ibm.com [9.17.130.17]) by ppma02wdc.us.ibm.com with ESMTP id 3bqpccfspf-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 21 Oct 2021 15:24:19 +0000 Received: from b03ledav005.gho.boulder.ibm.com (b03ledav005.gho.boulder.ibm.com [9.17.130.236]) by b03cxnp08025.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 19LFOIXb45089256 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 21 Oct 2021 15:24:18 GMT Received: from b03ledav005.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 729F6BE09D; Thu, 21 Oct 2021 15:24:18 +0000 (GMT) Received: from b03ledav005.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id C49BABE07A; Thu, 21 Oct 2021 15:24:15 +0000 (GMT) Received: from cpe-172-100-181-211.stny.res.rr.com.com (unknown [9.160.98.118]) by b03ledav005.gho.boulder.ibm.com (Postfix) with ESMTP; Thu, 21 Oct 2021 15:24:15 +0000 (GMT) From: Tony Krowiak To: linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: jjherne@linux.ibm.com, freude@linux.ibm.com, borntraeger@de.ibm.com, cohuck@redhat.com, mjrosato@linux.ibm.com, pasic@linux.ibm.com, alex.williamson@redhat.com, kwankhede@nvidia.com, fiuczy@linux.ibm.com, Tony Krowiak Subject: [PATCH v17 13/15] s390/vfio-ap: sysfs attribute to display the guest's matrix Date: Thu, 21 Oct 2021 11:23:30 -0400 Message-Id: <20211021152332.70455-14-akrowiak@linux.ibm.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20211021152332.70455-1-akrowiak@linux.ibm.com> References: <20211021152332.70455-1-akrowiak@linux.ibm.com> MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: gglYFe4sOFR9sAmxWk7EZJg4m9EcpV8X X-Proofpoint-GUID: UHKKc1aLj2Bm_A7cFFZazAjAh_jaaQUK X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.182.1,Aquarius:18.0.790,Hydra:6.0.425,FMLib:17.0.607.475 definitions=2021-10-21_04,2021-10-21_02,2020-04-07_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 lowpriorityscore=0 adultscore=0 clxscore=1015 suspectscore=0 malwarescore=0 spamscore=0 bulkscore=0 phishscore=0 impostorscore=0 priorityscore=1501 mlxscore=0 mlxlogscore=999 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2109230001 definitions=main-2110210079 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org The matrix of adapters and domains configured in a guest's APCB may differ from the matrix of adapters and domains assigned to the matrix mdev, so this patch introduces a sysfs attribute to display the matrix of adapters and domains that are or will be assigned to the APCB of a guest that is or will be using the matrix mdev. For a matrix mdev denoted by $uuid, the guest matrix can be displayed as follows: cat /sys/devices/vfio_ap/matrix/$uuid/guest_matrix Signed-off-by: Tony Krowiak --- drivers/s390/crypto/vfio_ap_ops.c | 50 +++++++++++++++++++++++-------- 1 file changed, 37 insertions(+), 13 deletions(-) diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c index 5386b8635bec..8075080ef2dd 100644 --- a/drivers/s390/crypto/vfio_ap_ops.c +++ b/drivers/s390/crypto/vfio_ap_ops.c @@ -1131,28 +1131,24 @@ static ssize_t control_domains_show(struct device *dev, } static DEVICE_ATTR_RO(control_domains); -static ssize_t matrix_show(struct device *dev, struct device_attribute *attr, - char *buf) +static ssize_t vfio_ap_mdev_matrix_show(struct ap_matrix *matrix, char *buf) { - struct ap_matrix_mdev *matrix_mdev = dev_get_drvdata(dev); char *bufpos = buf; unsigned long apid; unsigned long apqi; unsigned long apid1; unsigned long apqi1; - unsigned long napm_bits = matrix_mdev->matrix.apm_max + 1; - unsigned long naqm_bits = matrix_mdev->matrix.aqm_max + 1; + unsigned long napm_bits = matrix->apm_max + 1; + unsigned long naqm_bits = matrix->aqm_max + 1; int nchars = 0; int n; - apid1 = find_first_bit_inv(matrix_mdev->matrix.apm, napm_bits); - apqi1 = find_first_bit_inv(matrix_mdev->matrix.aqm, naqm_bits); - - mutex_lock(&matrix_dev->lock); + apid1 = find_first_bit_inv(matrix->apm, napm_bits); + apqi1 = find_first_bit_inv(matrix->aqm, naqm_bits); if ((apid1 < napm_bits) && (apqi1 < naqm_bits)) { - for_each_set_bit_inv(apid, matrix_mdev->matrix.apm, napm_bits) { - for_each_set_bit_inv(apqi, matrix_mdev->matrix.aqm, + for_each_set_bit_inv(apid, matrix->apm, napm_bits) { + for_each_set_bit_inv(apqi, matrix->aqm, naqm_bits) { n = sprintf(bufpos, "%02lx.%04lx\n", apid, apqi); @@ -1161,25 +1157,52 @@ static ssize_t matrix_show(struct device *dev, struct device_attribute *attr, } } } else if (apid1 < napm_bits) { - for_each_set_bit_inv(apid, matrix_mdev->matrix.apm, napm_bits) { + for_each_set_bit_inv(apid, matrix->apm, napm_bits) { n = sprintf(bufpos, "%02lx.\n", apid); bufpos += n; nchars += n; } } else if (apqi1 < naqm_bits) { - for_each_set_bit_inv(apqi, matrix_mdev->matrix.aqm, naqm_bits) { + for_each_set_bit_inv(apqi, matrix->aqm, naqm_bits) { n = sprintf(bufpos, ".%04lx\n", apqi); bufpos += n; nchars += n; } } + return nchars; +} + +static ssize_t matrix_show(struct device *dev, struct device_attribute *attr, + char *buf) +{ + ssize_t nchars; + struct mdev_device *mdev = mdev_from_dev(dev); + struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev); + + mutex_lock(&matrix_dev->lock); + nchars = vfio_ap_mdev_matrix_show(&matrix_mdev->matrix, buf); mutex_unlock(&matrix_dev->lock); return nchars; } static DEVICE_ATTR_RO(matrix); +static ssize_t guest_matrix_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + ssize_t nchars; + struct mdev_device *mdev = mdev_from_dev(dev); + struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev); + + mutex_lock(&matrix_dev->lock); + nchars = vfio_ap_mdev_matrix_show(&matrix_mdev->shadow_apcb, buf); + mutex_unlock(&matrix_dev->lock); + + return nchars; +} +static DEVICE_ATTR_RO(guest_matrix); + static struct attribute *vfio_ap_mdev_attrs[] = { &dev_attr_assign_adapter.attr, &dev_attr_unassign_adapter.attr, @@ -1189,6 +1212,7 @@ static struct attribute *vfio_ap_mdev_attrs[] = { &dev_attr_unassign_control_domain.attr, &dev_attr_control_domains.attr, &dev_attr_matrix.attr, + &dev_attr_guest_matrix.attr, NULL, }; From patchwork Thu Oct 21 15:23:31 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anthony Krowiak X-Patchwork-Id: 12575583 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id ACB20C433F5 for ; Thu, 21 Oct 2021 15:24:51 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 993CE6137C for ; Thu, 21 Oct 2021 15:24:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232135AbhJUP1D (ORCPT ); Thu, 21 Oct 2021 11:27:03 -0400 Received: from mx0a-001b2d01.pphosted.com ([148.163.156.1]:55096 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232221AbhJUP0o (ORCPT ); Thu, 21 Oct 2021 11:26:44 -0400 Received: from pps.filterd (m0098410.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 19LEOZxu021348; Thu, 21 Oct 2021 11:24:26 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pp1; bh=UK58jJkoMhdAgrXXtTr1/C7Y6We0PrFiCLEHqbyp5Do=; b=sNUg5UOvIlWgAxqUW7gjpAxrABByA/uBQJOe8SdXSMgVa+GMWxyID56M93b4XJ0/qtGt Hfr0oL8FVl8h6dhTeX1Jnck4SMz/1oRRa8ITETKAw77tNmk7Qa2vyRt6SzvbbnmRDICT 85fqosCL6xz1+ZgCfl4JV6R9ev+NXJvJp3YKE17g93ztygvHuCMorPEp1EHzdSkvIKvO tufBn9yLtEk6RvMm8wQHCZ8k+pPT/FBHv2omOxofz/Jl2C58YllakXJ/lGrgpYp1WDdF n+8N1h5vf2E3gbohbsxlrIUv7jSwpWsDROZvxhelZmBEIRPzuRPWFZVobVHDRFb6jVyD Kw== Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com with ESMTP id 3bu9xr1fb5-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 21 Oct 2021 11:24:26 -0400 Received: from m0098410.ppops.net (m0098410.ppops.net [127.0.0.1]) by pps.reinject (8.16.0.43/8.16.0.43) with SMTP id 19LEo5KR016964; Thu, 21 Oct 2021 11:24:26 -0400 Received: from ppma01dal.us.ibm.com (83.d6.3fa9.ip4.static.sl-reverse.com [169.63.214.131]) by mx0a-001b2d01.pphosted.com with ESMTP id 3bu9xr1fas-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 21 Oct 2021 11:24:26 -0400 Received: from pps.filterd (ppma01dal.us.ibm.com [127.0.0.1]) by ppma01dal.us.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 19LF2Li3000334; Thu, 21 Oct 2021 15:24:25 GMT Received: from b03cxnp07028.gho.boulder.ibm.com (b03cxnp07028.gho.boulder.ibm.com [9.17.130.15]) by ppma01dal.us.ibm.com with ESMTP id 3bqpcdr7by-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 21 Oct 2021 15:24:25 +0000 Received: from b03ledav005.gho.boulder.ibm.com (b03ledav005.gho.boulder.ibm.com [9.17.130.236]) by b03cxnp07028.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 19LFONJI31326966 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 21 Oct 2021 15:24:23 GMT Received: from b03ledav005.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id D9253BE08A; Thu, 21 Oct 2021 15:24:22 +0000 (GMT) Received: from b03ledav005.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id AB6C7BE082; Thu, 21 Oct 2021 15:24:18 +0000 (GMT) Received: from cpe-172-100-181-211.stny.res.rr.com.com (unknown [9.160.98.118]) by b03ledav005.gho.boulder.ibm.com (Postfix) with ESMTP; Thu, 21 Oct 2021 15:24:18 +0000 (GMT) From: Tony Krowiak To: linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: jjherne@linux.ibm.com, freude@linux.ibm.com, borntraeger@de.ibm.com, cohuck@redhat.com, mjrosato@linux.ibm.com, pasic@linux.ibm.com, alex.williamson@redhat.com, kwankhede@nvidia.com, fiuczy@linux.ibm.com, Tony Krowiak Subject: [PATCH v17 14/15] s390/ap: notify drivers on config changed and scan complete callbacks Date: Thu, 21 Oct 2021 11:23:31 -0400 Message-Id: <20211021152332.70455-15-akrowiak@linux.ibm.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20211021152332.70455-1-akrowiak@linux.ibm.com> References: <20211021152332.70455-1-akrowiak@linux.ibm.com> MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-GUID: 2nUfUTr1uUETixJl29OQtZMm7VmFMflL X-Proofpoint-ORIG-GUID: nerLRaDjDAYZ1IXkheo5WoyLR0EdUWUN X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.182.1,Aquarius:18.0.790,Hydra:6.0.425,FMLib:17.0.607.475 definitions=2021-10-21_04,2021-10-21_02,2020-04-07_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 adultscore=0 impostorscore=0 bulkscore=0 clxscore=1015 phishscore=0 malwarescore=0 suspectscore=0 mlxlogscore=999 spamscore=0 lowpriorityscore=0 priorityscore=1501 mlxscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2109230001 definitions=main-2110210079 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org This patch introduces an extension to the ap bus to notify device drivers when the host AP configuration changes - i.e., adapters, domains or control domains are added or removed. When an adapter or domain is added to the host's AP configuration, the AP bus will create the associated queue devices in the linux sysfs device model. Each new type 10 (i.e., CEX4) or newer queue device with an APQN that is not reserved for the default device driver will get bound to the vfio_ap device driver. Likewise, whan an adapter or domain is removed from the host's AP configuration, the AP bus will remove the associated queue devices from the sysfs device model. Each of the queues that is bound to the vfio_ap device driver will get unbound. With the introduction of hot plug support, binding or unbinding of a queue device will result in plugging or unplugging one or more queues from a guest that is using the queue. If there are multiple changes to the host's AP configuration, it could result in the probe and remove callbacks getting invoked multiple times. Each time queues are plugged into or unplugged from a guest, the guest's VCPUs must be taken out of SIE. If this occurs multiple times due to changes in the host's AP configuration, that can have an undesirable negative affect on the guest's performance. To alleviate this problem, this patch introduces two new callbacks: one to notify the vfio_ap device driver when the AP bus scan routine detects a change to the host's AP configuration; and, one to notify the driver when the AP bus is done scanning. This will allow the vfio_ap driver to do bulk processing of all affected adapters, domains and control domains for affected guests rather than plugging or unplugging them one at a time when the probe or remove callback is invoked. The two new callbacks are: void (*on_config_changed)(struct ap_config_info *new_config_info, struct ap_config_info *old_config_info); This callback is invoked at the start of the AP bus scan function when it determines that the host AP configuration information has changed since the previous scan. This is done by storing an old and current QCI info struct and comparing them. If there is any difference, the callback is invoked. The vfio_ap device driver registers a callback function for this callback that performs the following operations: 1. Unplugs the adapters, domains and control domains removed from the host's AP configuration from the guests to which they are assigned in a single operation. 2. Disconnects the links between each queue structure representing a queue that was unplugged from the structure representing the mediated device to which the queue is assigned. Thus, when the vfio_ap device driver's remove callback is invoked, the unplugging of the queue from the guest and the unlinking of the queue structure from the mediated device structure will be bypassed because the queues and control domains will have already been unplugged in bulk. 3. Stores bitmaps identifying the adapters, domains and control domains added to the host's AP configuration with the structure representing the mediated device. When the vfio_ap device driver's probe callback is subsequently invoked, the probe function will recognize that the queue is being probed due to a change in the host's AP configuration and the plugging of the queue into the guest will be bypassed. void (*on_scan_complete)(struct ap_config_info *new_config_info, struct ap_config_info *old_config_info); The on_scan_complete callback is invoked after the ap bus scan is completed if the host AP configuration data has changed. The vfio_ap device driver registers a callback function for this callback that hot plugs each queue and control domain added to the AP configuration for each guest using them in a single hot plug operation. Signed-off-by: Harald Freudenberger [akrowiak@linux.ibm.com: implemented callback functions in vfio_ap driver] Signed-off-by: Tony Krowiak --- drivers/s390/crypto/ap_bus.c | 81 ++++++- drivers/s390/crypto/ap_bus.h | 12 + drivers/s390/crypto/vfio_ap_drv.c | 4 +- drivers/s390/crypto/vfio_ap_ops.c | 332 ++++++++++++++++++++++++-- drivers/s390/crypto/vfio_ap_private.h | 23 +- 5 files changed, 429 insertions(+), 23 deletions(-) diff --git a/drivers/s390/crypto/ap_bus.c b/drivers/s390/crypto/ap_bus.c index 15886610f61a..b97149d02da6 100644 --- a/drivers/s390/crypto/ap_bus.c +++ b/drivers/s390/crypto/ap_bus.c @@ -88,6 +88,7 @@ static atomic64_t ap_bindings_complete_count = ATOMIC64_INIT(0); static DECLARE_COMPLETION(ap_init_apqn_bindings_complete); static struct ap_config_info *ap_qci_info; +static struct ap_config_info *ap_qci_info_old; /* * AP bus related debug feature things. @@ -225,9 +226,14 @@ static void __init ap_init_qci_info(void) ap_qci_info = kzalloc(sizeof(*ap_qci_info), GFP_KERNEL); if (!ap_qci_info) return; + ap_qci_info_old = kzalloc(sizeof(*ap_qci_info_old), GFP_KERNEL); + if (!ap_qci_info_old) + return; if (ap_fetch_qci_info(ap_qci_info) != 0) { kfree(ap_qci_info); + kfree(ap_qci_info_old); ap_qci_info = NULL; + ap_qci_info_old = NULL; return; } AP_DBF_INFO("%s successful fetched initial qci info\n", __func__); @@ -244,6 +250,8 @@ static void __init ap_init_qci_info(void) __func__, ap_max_domain_id); } } + + memcpy(ap_qci_info_old, ap_qci_info, sizeof(*ap_qci_info)); } /* @@ -1635,6 +1643,49 @@ static int __match_queue_device_with_queue_id(struct device *dev, const void *da && AP_QID_QUEUE(to_ap_queue(dev)->qid) == (int)(long) data; } +/* Helper function for notify_config_changed */ +static int __drv_notify_config_changed(struct device_driver *drv, void *data) +{ + struct ap_driver *ap_drv = to_ap_drv(drv); + + if (try_module_get(drv->owner)) { + if (ap_drv->on_config_changed) + ap_drv->on_config_changed(ap_qci_info, ap_qci_info_old); + module_put(drv->owner); + } + + return 0; +} + +/* Notify all drivers about an qci config change */ +static inline void notify_config_changed(void) +{ + bus_for_each_drv(&ap_bus_type, NULL, NULL, + __drv_notify_config_changed); +} + +/* Helper function for notify_scan_complete */ +static int __drv_notify_scan_complete(struct device_driver *drv, void *data) +{ + struct ap_driver *ap_drv = to_ap_drv(drv); + + if (try_module_get(drv->owner)) { + if (ap_drv->on_scan_complete) + ap_drv->on_scan_complete(ap_qci_info, + ap_qci_info_old); + module_put(drv->owner); + } + + return 0; +} + +/* Notify all drivers about bus scan complete */ +static inline void notify_scan_complete(void) +{ + bus_for_each_drv(&ap_bus_type, NULL, NULL, + __drv_notify_scan_complete); +} + /* * Helper function for ap_scan_bus(). * Remove card device and associated queue devices. @@ -1923,6 +1974,25 @@ static inline void ap_scan_adapter(int ap) put_device(&ac->ap_dev.device); } +/** + * ap_get_configuration - get the host AP configuration + * + * Stores the host AP configuration information returned from the previous call + * to Query Configuration Information (QCI), then retrieves and stores the + * current AP configuration returned from QCI. + * + * Return: true if the host AP configuration changed between calls to QCI; + * otherwise, return false. + */ +static bool ap_get_configuration(void) +{ + memcpy(ap_qci_info_old, ap_qci_info, sizeof(*ap_qci_info)); + ap_fetch_qci_info(ap_qci_info); + + return memcmp(ap_qci_info, ap_qci_info_old, + sizeof(struct ap_config_info)) != 0; +} + /** * ap_scan_bus(): Scan the AP bus for new devices * Runs periodically, workqueue timer (ap_config_time) @@ -1930,9 +2000,12 @@ static inline void ap_scan_adapter(int ap) */ static void ap_scan_bus(struct work_struct *unused) { - int ap; + int ap, config_changed = 0; - ap_fetch_qci_info(ap_qci_info); + /* config change notify */ + config_changed = ap_get_configuration(); + if (config_changed) + notify_config_changed(); ap_select_domain(); AP_DBF_DBG("%s running\n", __func__); @@ -1941,6 +2014,10 @@ static void ap_scan_bus(struct work_struct *unused) for (ap = 0; ap <= ap_max_adapter_id; ap++) ap_scan_adapter(ap); + /* scan complete notify */ + if (config_changed) + notify_scan_complete(); + /* check if there is at least one queue available with default domain */ if (ap_domain_index >= 0) { struct device *dev = diff --git a/drivers/s390/crypto/ap_bus.h b/drivers/s390/crypto/ap_bus.h index 67c1bef60ad5..4de062ea6b76 100644 --- a/drivers/s390/crypto/ap_bus.h +++ b/drivers/s390/crypto/ap_bus.h @@ -143,6 +143,18 @@ struct ap_driver { int (*probe)(struct ap_device *); void (*remove)(struct ap_device *); int (*in_use)(unsigned long *apm, unsigned long *aqm); + /* + * Called at the start of the ap bus scan function when + * the crypto config information (qci) has changed. + */ + void (*on_config_changed)(struct ap_config_info *new_config_info, + struct ap_config_info *old_config_info); + /* + * Called at the end of the ap bus scan function when + * the crypto config information (qci) has changed. + */ + void (*on_scan_complete)(struct ap_config_info *new_config_info, + struct ap_config_info *old_config_info); }; #define to_ap_drv(x) container_of((x), struct ap_driver, driver) diff --git a/drivers/s390/crypto/vfio_ap_drv.c b/drivers/s390/crypto/vfio_ap_drv.c index df7528dcf6ed..5edd45d4d2fc 100644 --- a/drivers/s390/crypto/vfio_ap_drv.c +++ b/drivers/s390/crypto/vfio_ap_drv.c @@ -45,6 +45,8 @@ static struct ap_driver vfio_ap_drv = { .probe = vfio_ap_mdev_probe_queue, .remove = vfio_ap_mdev_remove_queue, .in_use = vfio_ap_mdev_resource_in_use, + .on_config_changed = vfio_ap_on_cfg_changed, + .on_scan_complete = vfio_ap_on_scan_complete, .ids = ap_queue_ids, }; @@ -92,7 +94,7 @@ static int vfio_ap_matrix_dev_create(void) /* Fill in config info via PQAP(QCI), if available */ if (test_facility(12)) { - ret = ap_qci(&matrix_dev->info); + ret = ap_qci(&matrix_dev->config_info); if (ret) goto matrix_alloc_err; } diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c index 8075080ef2dd..cedf491c0df4 100644 --- a/drivers/s390/crypto/vfio_ap_ops.c +++ b/drivers/s390/crypto/vfio_ap_ops.c @@ -330,7 +330,7 @@ static bool vfio_ap_mdev_filter_cdoms(struct ap_matrix_mdev *matrix_mdev) bitmap_copy(shadow_adm, matrix_mdev->shadow_apcb.adm, AP_DOMAINS); bitmap_and(matrix_mdev->shadow_apcb.adm, matrix_mdev->matrix.adm, - (unsigned long *)matrix_dev->info.adm, AP_DOMAINS); + (unsigned long *)matrix_dev->config_info.adm, AP_DOMAINS); return !bitmap_equal(shadow_adm, matrix_mdev->shadow_apcb.adm, AP_DOMAINS); @@ -349,19 +349,15 @@ static bool vfio_ap_mdev_filter_cdoms(struct ap_matrix_mdev *matrix_mdev) */ static bool vfio_ap_mdev_filter_matrix(struct ap_matrix_mdev *matrix_mdev) { - int ret; unsigned long apid, apqi, apqn; DECLARE_BITMAP(shadow_apm, AP_DEVICES); DECLARE_BITMAP(shadow_aqm, AP_DOMAINS); struct vfio_ap_queue *q; - ret = ap_qci(&matrix_dev->info); - if (ret) - return false; - bitmap_copy(shadow_apm, matrix_mdev->shadow_apcb.apm, AP_DEVICES); bitmap_copy(shadow_aqm, matrix_mdev->shadow_apcb.aqm, AP_DOMAINS); - vfio_ap_matrix_init(&matrix_dev->info, &matrix_mdev->shadow_apcb); + vfio_ap_matrix_init(&matrix_dev->config_info, + &matrix_mdev->shadow_apcb); /* * Copy the adapters, domains and control domains to the shadow_apcb @@ -369,9 +365,9 @@ static bool vfio_ap_mdev_filter_matrix(struct ap_matrix_mdev *matrix_mdev) * AP configuration. */ bitmap_and(matrix_mdev->shadow_apcb.apm, matrix_mdev->matrix.apm, - (unsigned long *)matrix_dev->info.apm, AP_DEVICES); + (unsigned long *)matrix_dev->config_info.apm, AP_DEVICES); bitmap_and(matrix_mdev->shadow_apcb.aqm, matrix_mdev->matrix.aqm, - (unsigned long *)matrix_dev->info.aqm, AP_DOMAINS); + (unsigned long *)matrix_dev->config_info.aqm, AP_DOMAINS); for_each_set_bit_inv(apid, matrix_mdev->shadow_apcb.apm, AP_DEVICES) { for_each_set_bit_inv(apqi, matrix_mdev->shadow_apcb.aqm, @@ -417,8 +413,9 @@ static int vfio_ap_mdev_probe(struct mdev_device *mdev) &vfio_ap_matrix_dev_ops); matrix_mdev->mdev = mdev; - vfio_ap_matrix_init(&matrix_dev->info, &matrix_mdev->matrix); - vfio_ap_matrix_init(&matrix_dev->info, &matrix_mdev->shadow_apcb); + vfio_ap_matrix_init(&matrix_dev->config_info, &matrix_mdev->matrix); + vfio_ap_matrix_init(&matrix_dev->config_info, + &matrix_mdev->shadow_apcb); hash_init(matrix_mdev->qtable.queues); mdev_set_drvdata(mdev, matrix_mdev); mutex_lock(&matrix_dev->lock); @@ -772,13 +769,17 @@ static void vfio_ap_unlink_apqn_fr_mdev(struct ap_matrix_mdev *matrix_mdev, q = vfio_ap_mdev_get_queue(matrix_mdev, AP_MKQID(apid, apqi)); /* If the queue is assigned to the matrix mdev, unlink it. */ - if (q) + if (q) { vfio_ap_unlink_queue_fr_mdev(q); - /* If the queue is assigned to the APCB, store it in @qtable. */ - if (test_bit_inv(apid, matrix_mdev->shadow_apcb.apm) && - test_bit_inv(apqi, matrix_mdev->shadow_apcb.aqm)) - hash_add(qtable->queues, &q->mdev_qnode, q->apqn); + /* If the queue is assigned to the APCB, store it in @qtable. */ + if (qtable) { + if (test_bit_inv(apid, matrix_mdev->shadow_apcb.apm) && + test_bit_inv(apqi, matrix_mdev->shadow_apcb.aqm)) + hash_add(qtable->queues, &q->mdev_qnode, + q->apqn); + } + } } /** @@ -1702,9 +1703,31 @@ static void vfio_ap_mdev_put_qlocks(struct ap_guest *guest) mutex_unlock(&guest->kvm->lock); mutex_unlock(&matrix_dev->lock); + up_read(&matrix_dev->guests_lock); } +static bool vfio_ap_mdev_do_filter_matrix(struct ap_matrix_mdev *matrix_mdev, + struct vfio_ap_queue *q) +{ + unsigned long apid = AP_QID_CARD(q->apqn); + unsigned long apqi = AP_QID_QUEUE(q->apqn); + + /* + * If the queue is being probed because its APID or APQI is in the + * process of being added to the host's AP configuration, then we don't + * want to filter the matrix now as the filtering will be done after + * the driver is notified that the AP bus scan operation has completed + * (see the vfio_ap_on_scan_complete callback function). + */ + if (test_bit_inv(apid, matrix_mdev->apm_add) || + test_bit_inv(apqi, matrix_mdev->aqm_add)) + return false; + + + return true; +} + int vfio_ap_mdev_probe_queue(struct ap_device *apdev) { struct vfio_ap_queue *q; @@ -1722,8 +1745,10 @@ int vfio_ap_mdev_probe_queue(struct ap_device *apdev) if (guest) { vfio_ap_mdev_link_queue(guest->matrix_mdev, q); - if (vfio_ap_mdev_filter_matrix(guest->matrix_mdev)) - vfio_ap_mdev_hotplug_apcb(guest->matrix_mdev); + if (vfio_ap_mdev_do_filter_matrix(guest->matrix_mdev, q)) { + if (vfio_ap_mdev_filter_matrix(guest->matrix_mdev)) + vfio_ap_mdev_hotplug_apcb(guest->matrix_mdev); + } } else { vfio_ap_queue_link_mdev(q); } @@ -1767,3 +1792,274 @@ int vfio_ap_mdev_resource_in_use(unsigned long *apm, unsigned long *aqm) return ret; } + +/** + * vfio_ap_mdev_unlink_adapters - unlinks all queues from the matrix mdev with + * an APQI of a domain that has been removed from + * the host's AP configuration. + * + * @matrix_mdev: the matrix mdev from which the queues are to be unlinked + * @ap_unlink: a bitmap specifying the APIDs of the adapters removed from the + * host's AP configuration. + */ +static void vfio_ap_mdev_unlink_adapters(struct ap_matrix_mdev *matrix_mdev, + unsigned long *ap_unlink) +{ + unsigned long apid; + + for_each_set_bit_inv(apid, ap_unlink, AP_DEVICES) + vfio_ap_mdev_unlink_adapter(matrix_mdev, apid, NULL); +} + +/** + * vfio_ap_mdev_unlink_domains - unlinks all queues from the matrix mdev with an + * APQI of a domain that has been removed from the + * host's AP configuration. + * + * @matrix_mdev: the matrix mdev from which the queues are to be unlinked + * @aq_unlink: a bitmap specifying the APQIs of the domains removed from the + * host's AP configuration. + */ +static void vfio_ap_mdev_unlink_domains(struct ap_matrix_mdev *matrix_mdev, + unsigned long *aq_unlink) +{ + unsigned long apqi; + + for_each_set_bit_inv(apqi, aq_unlink, AP_DOMAINS) + vfio_ap_mdev_unlink_domain(matrix_mdev, apqi, NULL); +} + +/** + * vfio_ap_mdev_hot_unplug_cfg - hot unplug the adapters, domains and control + * domains that have been removed from the host's + * AP configuration from a guest. + * + * @guest: the guest + * @aprem: the adapters that have been removed from the host's AP configuration + * @aqrem: the domains that have been removed from the host's AP configuration + */ +static void vfio_ap_mdev_hot_unplug_cfg(struct ap_guest *guest, + unsigned long *aprem, + unsigned long *aqrem) +{ + vfio_ap_mdev_unlink_adapters(guest->matrix_mdev, aprem); + vfio_ap_mdev_unlink_domains(guest->matrix_mdev, aqrem); + + if (vfio_ap_mdev_filter_matrix(guest->matrix_mdev) || + vfio_ap_mdev_filter_cdoms(guest->matrix_mdev)) { + mutex_lock(&guest->kvm->lock); + mutex_lock(&matrix_dev->lock); + + vfio_ap_mdev_hotplug_apcb(guest->matrix_mdev); + + mutex_unlock(&guest->kvm->lock); + mutex_unlock(&matrix_dev->lock); + } +} + +/** + * vfio_ap_mdev_cfg_remove - determines which guests are using the adapters, + * domains and control domains that have been removed + * from the host AP configuration and unplugs them + * from those guests. + * + * @ap_remove: bitmap specifying which adapters have been removed from the host + * config. + * @aq_remove: bitmap specifying which domains have been removed from the host + * config. + * @cd_remove: bitmap specifying which control domains have been removed from + * the host config. + */ +static void vfio_ap_mdev_cfg_remove(unsigned long *ap_remove, + unsigned long *aq_remove, + unsigned long *cd_remove) +{ + struct ap_guest *guest; + DECLARE_BITMAP(aprem, AP_DEVICES); + DECLARE_BITMAP(aqrem, AP_DOMAINS); + int do_ap_remove, do_aq_remove, do_cd_remove; + + list_for_each_entry(guest, &matrix_dev->guests, node) { + do_ap_remove = bitmap_and(aprem, ap_remove, + guest->matrix_mdev->matrix.apm, + AP_DEVICES); + do_aq_remove = bitmap_and(aqrem, aq_remove, + guest->matrix_mdev->matrix.aqm, + AP_DOMAINS); + do_cd_remove = bitmap_and(aqrem, cd_remove, + guest->matrix_mdev->matrix.aqm, + AP_DOMAINS); + + if (!do_ap_remove && !do_aq_remove && !do_cd_remove) + continue; + + vfio_ap_mdev_hot_unplug_cfg(guest, aprem, aqrem); + } +} + +/** + * vfio_ap_mdev_on_cfg_remove - responds to the removal of adapters, domains and + * control domains from the host AP configuration + * by unplugging them from the guests that are + * using them. + */ +static void vfio_ap_mdev_on_cfg_remove(void) +{ + int ap_remove, aq_remove, cd_remove; + DECLARE_BITMAP(aprem, AP_DEVICES); + DECLARE_BITMAP(aqrem, AP_DOMAINS); + DECLARE_BITMAP(cdrem, AP_DOMAINS); + unsigned long *cur_apm, *cur_aqm, *cur_adm; + unsigned long *prev_apm, *prev_aqm, *prev_adm; + + cur_apm = (unsigned long *)matrix_dev->config_info.apm; + cur_aqm = (unsigned long *)matrix_dev->config_info.aqm; + cur_adm = (unsigned long *)matrix_dev->config_info.adm; + prev_apm = (unsigned long *)matrix_dev->config_info_prev.apm; + prev_aqm = (unsigned long *)matrix_dev->config_info_prev.aqm; + prev_adm = (unsigned long *)matrix_dev->config_info_prev.adm; + + ap_remove = bitmap_andnot(aprem, prev_apm, cur_apm, AP_DEVICES); + aq_remove = bitmap_andnot(aqrem, prev_aqm, cur_aqm, AP_DOMAINS); + cd_remove = bitmap_andnot(cdrem, prev_adm, cur_adm, AP_DOMAINS); + + if (ap_remove || aq_remove || cd_remove) + vfio_ap_mdev_cfg_remove(aprem, aqrem, cdrem); +} + +/** + * vfio_ap_mdev_cfg_add - store bitmaps specifying the adapters, domains and + * control domains that have been added to the host's + * AP configuration for each matrix mdev to which they + * are assigned. + * + * @apm_add: a bitmap specifying the adapters that have been added to the AP + * configuration. + * @aqm_add: a bitmap specifying the domains that have been added to the AP + * configuration. + * @adm_add: a bitmap specifying the control domains that have been added to the + * AP configuration. + */ +static void vfio_ap_mdev_cfg_add(unsigned long *apm_add, unsigned long *aqm_add, + unsigned long *adm_add) +{ + struct ap_guest *guest; + + list_for_each_entry(guest, &matrix_dev->guests, node) { + bitmap_and(guest->matrix_mdev->apm_add, + guest->matrix_mdev->matrix.apm, apm_add, AP_DEVICES); + bitmap_and(guest->matrix_mdev->aqm_add, + guest->matrix_mdev->matrix.aqm, aqm_add, AP_DOMAINS); + bitmap_and(guest->matrix_mdev->adm_add, + guest->matrix_mdev->matrix.adm, adm_add, AP_DEVICES); + } +} + +/** + * vfio_ap_mdev_on_cfg_add - responds to the addition of adapters, domains and + * control domains to the host AP configuration + * by updating the bitmaps that specify what adapters, + * domains and control domains have been added so they + * can be hot plugged into the guest when the AP bus + * scan completes (see vfio_ap_on_scan_complete + * function). + */ +static void vfio_ap_mdev_on_cfg_add(void) +{ + bool do_add; + DECLARE_BITMAP(apm_add, AP_DEVICES); + DECLARE_BITMAP(aqm_add, AP_DOMAINS); + DECLARE_BITMAP(adm_add, AP_DOMAINS); + unsigned long *cur_apm, *cur_aqm, *cur_adm; + unsigned long *prev_apm, *prev_aqm, *prev_adm; + + cur_apm = (unsigned long *)matrix_dev->config_info.apm; + cur_aqm = (unsigned long *)matrix_dev->config_info.aqm; + cur_adm = (unsigned long *)matrix_dev->config_info.adm; + + prev_apm = (unsigned long *)matrix_dev->config_info_prev.apm; + prev_aqm = (unsigned long *)matrix_dev->config_info_prev.aqm; + prev_adm = (unsigned long *)matrix_dev->config_info_prev.adm; + + do_add = bitmap_andnot(apm_add, cur_apm, prev_apm, AP_DEVICES); + do_add |= bitmap_andnot(aqm_add, cur_aqm, prev_aqm, AP_DOMAINS); + do_add |= bitmap_andnot(adm_add, cur_adm, prev_adm, AP_DOMAINS); + + if (do_add) + vfio_ap_mdev_cfg_add(apm_add, aqm_add, adm_add); +} + +/** + * vfio_ap_on_cfg_changed - handles notification of changes to the host AP + * configuration. + * + * @new_config_info: the new host AP configuration + * @old_config_info: the previous host AP configuration + */ +void vfio_ap_on_cfg_changed(struct ap_config_info *new_config_info, + struct ap_config_info *old_config_info) +{ + down_read(&matrix_dev->guests_lock); + + memcpy(&matrix_dev->config_info_prev, old_config_info, + sizeof(struct ap_config_info)); + memcpy(&matrix_dev->config_info, new_config_info, + sizeof(struct ap_config_info)); + vfio_ap_mdev_on_cfg_remove(); + vfio_ap_mdev_on_cfg_add(); + + up_read(&matrix_dev->guests_lock); +} + +static void vfio_ap_mdev_hot_plug_cfg(struct ap_guest *guest) +{ + bool filter_matrix, filter_cdoms, do_hotplug = false; + + filter_matrix = bitmap_intersects(guest->matrix_mdev->matrix.apm, + guest->matrix_mdev->apm_add, + AP_DEVICES) || + bitmap_intersects(guest->matrix_mdev->matrix.aqm, + guest->matrix_mdev->aqm_add, + AP_DOMAINS); + + filter_cdoms = bitmap_intersects(guest->matrix_mdev->matrix.adm, + guest->matrix_mdev->aqm_add, + AP_DOMAINS); + + mutex_lock(&guest->kvm->lock); + mutex_lock(&matrix_dev->lock); + + if (filter_matrix) + do_hotplug |= vfio_ap_mdev_filter_matrix(guest->matrix_mdev); + + if (filter_cdoms) + do_hotplug |= vfio_ap_mdev_filter_cdoms(guest->matrix_mdev); + + if (do_hotplug) + vfio_ap_mdev_hotplug_apcb(guest->matrix_mdev); + + mutex_unlock(&matrix_dev->lock); + mutex_unlock(&guest->kvm->lock); +} + +void vfio_ap_on_scan_complete(struct ap_config_info *new_config_info, + struct ap_config_info *old_config_info) +{ + struct ap_guest *guest; + + down_read(&matrix_dev->guests_lock); + + list_for_each_entry(guest, &matrix_dev->guests, node) { + if (bitmap_empty(guest->matrix_mdev->apm_add, AP_DEVICES) && + bitmap_empty(guest->matrix_mdev->aqm_add, AP_DOMAINS) && + bitmap_empty(guest->matrix_mdev->adm_add, AP_DOMAINS)) + continue; + + vfio_ap_mdev_hot_plug_cfg(guest); + bitmap_clear(guest->matrix_mdev->apm_add, 0, AP_DEVICES); + bitmap_clear(guest->matrix_mdev->aqm_add, 0, AP_DOMAINS); + bitmap_clear(guest->matrix_mdev->adm_add, 0, AP_DOMAINS); + } + + up_read(&matrix_dev->guests_lock); +} diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h index 97da41f87c65..affa63da7f88 100644 --- a/drivers/s390/crypto/vfio_ap_private.h +++ b/drivers/s390/crypto/vfio_ap_private.h @@ -37,7 +37,9 @@ struct ap_guest { * * @device: generic device structure associated with the AP matrix device * @available_instances: number of mediated matrix devices that can be created - * @info: the struct containing the output from the PQAP(QCI) instruction + * @config_info: the struct containing the output from the PQAP(QCI) instruction + * @config_info_prev: the struct containing the previous output from the + * PQAP(AQIC) instruction * @mdev_list: the list of mediated matrix devices created * @lock: mutex for locking the AP matrix device. This lock will be * taken every time we fiddle with state managed by the vfio_ap @@ -52,7 +54,8 @@ struct ap_guest { struct ap_matrix_dev { struct device device; atomic_t available_instances; - struct ap_config_info info; + struct ap_config_info config_info; + struct ap_config_info config_info_prev; struct list_head mdev_list; struct mutex lock; struct ap_driver *vfio_ap_drv; @@ -110,6 +113,13 @@ struct ap_queue_table { * @mdev: the mediated device * @qtable: table of queues (struct vfio_ap_queue) assigned to the mdev * @guest: the KVM guest using the matrix mdev + * @apm_add: adapters to be hot plugged into the guest when the vfio_ap + * device driver is notified that the AP bus scan has completed. + * @aqm_add: domains to be hot plugged into the guest when the vfio_ap + * device driver is notified that the AP bus scan has completed. + * @adm_add: control domains to be hot plugged into the guest when the + * vfio_ap device driver is notified that the AP bus scan has + * completed. */ struct ap_matrix_mdev { struct vfio_device vdev; @@ -121,6 +131,9 @@ struct ap_matrix_mdev { struct mdev_device *mdev; struct ap_queue_table qtable; struct ap_guest *guest; + DECLARE_BITMAP(apm_add, AP_DEVICES); + DECLARE_BITMAP(aqm_add, AP_DOMAINS); + DECLARE_BITMAP(adm_add, AP_DOMAINS); }; /** @@ -151,4 +164,10 @@ void vfio_ap_mdev_remove_queue(struct ap_device *queue); int vfio_ap_mdev_resource_in_use(unsigned long *apm, unsigned long *aqm); + +void vfio_ap_on_cfg_changed(struct ap_config_info *new_config_info, + struct ap_config_info *old_config_info); +void vfio_ap_on_scan_complete(struct ap_config_info *new_config_info, + struct ap_config_info *old_config_info); + #endif /* _VFIO_AP_PRIVATE_H_ */ From patchwork Thu Oct 21 15:23:32 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anthony Krowiak X-Patchwork-Id: 12575585 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C873BC433EF for ; Thu, 21 Oct 2021 15:24:55 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B514060C49 for ; Thu, 21 Oct 2021 15:24:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232069AbhJUP1K (ORCPT ); Thu, 21 Oct 2021 11:27:10 -0400 Received: from mx0b-001b2d01.pphosted.com ([148.163.158.5]:28616 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S232248AbhJUP0r (ORCPT ); Thu, 21 Oct 2021 11:26:47 -0400 Received: from pps.filterd (m0098414.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 19LEBKwn009389; Thu, 21 Oct 2021 11:24:29 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pp1; bh=AmsSE3nC4UClOY1jYZ52LkQpg1P/K6ExubPsUSNbQ7c=; b=B3M01U5cBana/jd6kH/tGxPL0osCArXMSeBpQTMYNmdVy4n8o6mIH92CaKWpwtFXDc2A LRk5t8GN4cuAXMoEzh2vMYRV3AYjvGK93HpxOVn7slZuU82x1g/9Hi8XS0TlGI2Gs2OA Txox363NKEMYv1dBA0eO7HjfVdiMNGVspQAPciS1a76dyU44+cY9KUlYgzN8BXeWZivG 0+MQkjlEKpeBeooIg1RWXDfvOacU7RDI51QM2HInoY/xNN4UypObP7XswoXHXxaRqOcI 3mN5SUXxUpxLplmN+LZ4pp4YdUgFFpLJnfffpNwkTAxQX0DQDrp3f4BAjP2nHdYsE6XL rQ== Received: from pps.reinject (localhost [127.0.0.1]) by mx0b-001b2d01.pphosted.com with ESMTP id 3bu772wps2-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 21 Oct 2021 11:24:29 -0400 Received: from m0098414.ppops.net (m0098414.ppops.net [127.0.0.1]) by pps.reinject (8.16.0.43/8.16.0.43) with SMTP id 19LF1tjw020644; Thu, 21 Oct 2021 11:24:28 -0400 Received: from ppma01dal.us.ibm.com (83.d6.3fa9.ip4.static.sl-reverse.com [169.63.214.131]) by mx0b-001b2d01.pphosted.com with ESMTP id 3bu772wprr-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 21 Oct 2021 11:24:28 -0400 Received: from pps.filterd (ppma01dal.us.ibm.com [127.0.0.1]) by ppma01dal.us.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 19LF2LU7000328; Thu, 21 Oct 2021 15:24:27 GMT Received: from b03cxnp08025.gho.boulder.ibm.com (b03cxnp08025.gho.boulder.ibm.com [9.17.130.17]) by ppma01dal.us.ibm.com with ESMTP id 3bqpcdr7ct-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 21 Oct 2021 15:24:27 +0000 Received: from b03ledav005.gho.boulder.ibm.com (b03ledav005.gho.boulder.ibm.com [9.17.130.236]) by b03cxnp08025.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 19LFOP5i48234812 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 21 Oct 2021 15:24:25 GMT Received: from b03ledav005.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id B7A82BE05B; Thu, 21 Oct 2021 15:24:25 +0000 (GMT) Received: from b03ledav005.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 1B1CABE086; Thu, 21 Oct 2021 15:24:23 +0000 (GMT) Received: from cpe-172-100-181-211.stny.res.rr.com.com (unknown [9.160.98.118]) by b03ledav005.gho.boulder.ibm.com (Postfix) with ESMTP; Thu, 21 Oct 2021 15:24:22 +0000 (GMT) From: Tony Krowiak To: linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: jjherne@linux.ibm.com, freude@linux.ibm.com, borntraeger@de.ibm.com, cohuck@redhat.com, mjrosato@linux.ibm.com, pasic@linux.ibm.com, alex.williamson@redhat.com, kwankhede@nvidia.com, fiuczy@linux.ibm.com, Tony Krowiak Subject: [PATCH v17 15/15] s390/vfio-ap: update docs to include dynamic config support Date: Thu, 21 Oct 2021 11:23:32 -0400 Message-Id: <20211021152332.70455-16-akrowiak@linux.ibm.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20211021152332.70455-1-akrowiak@linux.ibm.com> References: <20211021152332.70455-1-akrowiak@linux.ibm.com> MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: KPGXzP0tjDrwtp1l0t8-AnWcMdxrhoAh X-Proofpoint-GUID: FPAYY4p409AFQxUCLi9FkLtftlT6K9Yq X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.182.1,Aquarius:18.0.790,Hydra:6.0.425,FMLib:17.0.607.475 definitions=2021-10-21_04,2021-10-21_02,2020-04-07_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 clxscore=1015 impostorscore=0 bulkscore=0 spamscore=0 mlxlogscore=999 phishscore=0 suspectscore=0 adultscore=0 mlxscore=0 lowpriorityscore=0 malwarescore=0 priorityscore=1501 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2109230001 definitions=main-2110210079 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Update the documentation in vfio-ap.rst to include information about the AP dynamic configuration support (e.g., hot plug of adapters, domains and control domains via the matrix mediated device's sysfs assignment attributes). This patch also makes a few minor tweaks to make corrections and clarifications. Signed-off-by: Tony Krowiak --- Documentation/s390/vfio-ap.rst | 492 +++++++++++++++++++++++---------- 1 file changed, 346 insertions(+), 146 deletions(-) diff --git a/Documentation/s390/vfio-ap.rst b/Documentation/s390/vfio-ap.rst index f57ae621f33e..f4b8748ab9a8 100644 --- a/Documentation/s390/vfio-ap.rst +++ b/Documentation/s390/vfio-ap.rst @@ -123,27 +123,24 @@ Let's now take a look at how AP instructions executed on a guest are interpreted by the hardware. A satellite control block called the Crypto Control Block (CRYCB) is attached to -our main hardware virtualization control block. The CRYCB contains three fields -to identify the adapters, usage domains and control domains assigned to the KVM -guest: +our main hardware virtualization control block. The CRYCB contains an AP Control +Block (APCB) that has three fields to identify the adapters, usage domains and +control domains assigned to the KVM guest: * The AP Mask (APM) field is a bit mask that identifies the AP adapters assigned - to the KVM guest. Each bit in the mask, from left to right (i.e. from most - significant to least significant bit in big endian order), corresponds to + to the KVM guest. Each bit in the mask, from left to right, corresponds to an APID from 0-255. If a bit is set, the corresponding adapter is valid for use by the KVM guest. * The AP Queue Mask (AQM) field is a bit mask identifying the AP usage domains - assigned to the KVM guest. Each bit in the mask, from left to right (i.e. from - most significant to least significant bit in big endian order), corresponds to - an AP queue index (APQI) from 0-255. If a bit is set, the corresponding queue - is valid for use by the KVM guest. + assigned to the KVM guest. Each bit in the mask, from left to right, + corresponds to an AP queue index (APQI) from 0-255. If a bit is set, the + corresponding queue is valid for use by the KVM guest. * The AP Domain Mask field is a bit mask that identifies the AP control domains assigned to the KVM guest. The ADM bit mask controls which domains can be changed by an AP command-request message sent to a usage domain from the - guest. Each bit in the mask, from left to right (i.e. from most significant to - least significant bit in big endian order), corresponds to a domain from + guest. Each bit in the mask, from left to right, corresponds to a domain from 0-255. If a bit is set, the corresponding domain can be modified by an AP command-request message sent to a usage domain. @@ -151,10 +148,10 @@ If you recall from the description of an AP Queue, AP instructions include an APQN to identify the AP queue to which an AP command-request message is to be sent (NQAP and PQAP instructions), or from which a command-reply message is to be received (DQAP instruction). The validity of an APQN is defined by the matrix -calculated from the APM and AQM; it is the cross product of all assigned adapter -numbers (APM) with all assigned queue indexes (AQM). For example, if adapters 1 -and 2 and usage domains 5 and 6 are assigned to a guest, the APQNs (1,5), (1,6), -(2,5) and (2,6) will be valid for the guest. +calculated from the APM and AQM; it is the Cartesian product of all assigned +adapter numbers (APM) with all assigned queue indexes (AQM). For example, if +adapters 1 and 2 and usage domains 5 and 6 are assigned to a guest, the APQNs +(1,5), (1,6), (2,5) and (2,6) will be valid for the guest. The APQNs can provide secure key functionality - i.e., a private key is stored on the adapter card for each of its domains - so each APQN must be assigned to @@ -192,7 +189,7 @@ The design introduces three new objects: 1. AP matrix device 2. VFIO AP device driver (vfio_ap.ko) -3. VFIO AP mediated matrix pass-through device +3. VFIO AP mediated pass-through device The VFIO AP device driver ------------------------- @@ -200,12 +197,13 @@ The VFIO AP (vfio_ap) device driver serves the following purposes: 1. Provides the interfaces to secure APQNs for exclusive use of KVM guests. -2. Sets up the VFIO mediated device interfaces to manage a mediated matrix +2. Sets up the VFIO mediated device interfaces to manage a vfio_ap mediated device and creates the sysfs interfaces for assigning adapters, usage domains, and control domains comprising the matrix for a KVM guest. -3. Configures the APM, AQM and ADM in the CRYCB referenced by a KVM guest's - SIE state description to grant the guest access to a matrix of AP devices +3. Configures the APM, AQM and ADM in the APCB contained in the CRYCB referenced + by a KVM guest's SIE state description to grant the guest access to a matrix + of AP devices Reserve APQNs for exclusive use of KVM guests --------------------------------------------- @@ -235,10 +233,10 @@ reserved:: | | 8 probe | | +--------^---------+ +--^--^------------+ 6 edit | | | - apmask | +-----------------------------+ | 9 mdev create + apmask | +-----------------------------+ | 11 mdev create aqmask | | 1 modprobe | +--------+-----+---+ +----------------+-+ +----------------+ - | | | |8 create | mediated | + | | | |10 create| mediated | | admin | | VFIO device core |---------> matrix | | + | | | device | +------+-+---------+ +--------^---------+ +--------^-------+ @@ -246,14 +244,14 @@ reserved:: | | 9 create vfio_ap-passthrough | | | +------------------------------+ | +-------------------------------------------------------------+ - 10 assign adapter/domain/control domain + 12 assign adapter/domain/control domain The process for reserving an AP queue for use by a KVM guest is: 1. The administrator loads the vfio_ap device driver 2. The vfio-ap driver during its initialization will register a single 'matrix' device with the device core. This will serve as the parent device for - all mediated matrix devices used to configure an AP matrix for a guest. + all vfio_ap mediated devices used to configure an AP matrix for a guest. 3. The /sys/devices/vfio_ap/matrix device is created by the device core 4. The vfio_ap device driver will register with the AP bus for AP queue devices of type 10 and higher (CEX4 and newer). The driver will provide the vfio_ap @@ -269,24 +267,24 @@ The process for reserving an AP queue for use by a KVM guest is: default zcrypt cex4queue driver. 8. The AP bus probes the vfio_ap device driver to bind the queues reserved for it. -9. The administrator creates a passthrough type mediated matrix device to be +9. The administrator creates a passthrough type vfio_ap mediated device to be used by a guest 10. The administrator assigns the adapters, usage domains and control domains to be exclusively used by a guest. Set up the VFIO mediated device interfaces ------------------------------------------ -The VFIO AP device driver utilizes the common interface of the VFIO mediated +The VFIO AP device driver utilizes the common interfaces of the VFIO mediated device core driver to: -* Register an AP mediated bus driver to add a mediated matrix device to and +* Register an AP mediated bus driver to add a vfio_ap mediated device to and remove it from a VFIO group. -* Create and destroy a mediated matrix device -* Add a mediated matrix device to and remove it from the AP mediated bus driver -* Add a mediated matrix device to and remove it from an IOMMU group +* Create and destroy a vfio_ap mediated device +* Add a vfio_ap mediated device to and remove it from the AP mediated bus driver +* Add a vfio_ap mediated device to and remove it from an IOMMU group The following high-level block diagram shows the main components and interfaces -of the VFIO AP mediated matrix device driver:: +of the VFIO AP mediated device driver:: +-------------+ | | @@ -343,7 +341,7 @@ matrix device. * device_api: the mediated device type's API * available_instances: - the number of mediated matrix passthrough devices + the number of vfio_ap mediated passthrough devices that can be created * device_api: specifies the VFIO API @@ -351,29 +349,37 @@ matrix device. This attribute group identifies the user-defined sysfs attributes of the mediated device. When a device is registered with the VFIO mediated device framework, the sysfs attribute files identified in the 'mdev_attr_groups' - structure will be created in the mediated matrix device's directory. The - sysfs attributes for a mediated matrix device are: + structure will be created in the vfio_ap mediated device's directory. The + sysfs attributes for a vfio_ap mediated device are: assign_adapter / unassign_adapter: Write-only attributes for assigning/unassigning an AP adapter to/from the - mediated matrix device. To assign/unassign an adapter, the APID of the - adapter is echoed to the respective attribute file. + vfio_ap mediated device. To assign/unassign an adapter, the APID of the + adapter is echoed into the respective attribute file. assign_domain / unassign_domain: Write-only attributes for assigning/unassigning an AP usage domain to/from - the mediated matrix device. To assign/unassign a domain, the domain - number of the usage domain is echoed to the respective attribute + the vfio_ap mediated device. To assign/unassign a domain, the domain + number of the usage domain is echoed into the respective attribute file. matrix: - A read-only file for displaying the APQNs derived from the cross product - of the adapter and domain numbers assigned to the mediated matrix device. + A read-only file for displaying the APQNs derived from the Cartesian + product of the adapter and domain numbers assigned to the vfio_ap mediated + device. + guest_matrix: + A read-only file for displaying the APQNs derived from the Cartesian + product of the adapter and domain numbers assigned to the APM and AQM + fields respectively of the KVM guest's CRYCB. This may differ from the + the APQNs assigned to the vfio_ap mediated device if any APQN does not + reference a queue device bound to the vfio_ap device driver (i.e., the + queue is not in the host's AP configuration). assign_control_domain / unassign_control_domain: Write-only attributes for assigning/unassigning an AP control domain - to/from the mediated matrix device. To assign/unassign a control domain, - the ID of the domain to be assigned/unassigned is echoed to the respective - attribute file. + to/from the vfio_ap mediated device. To assign/unassign a control domain, + the ID of the domain to be assigned/unassigned is echoed into the + respective attribute file. control_domains: A read-only file for displaying the control domain numbers assigned to the - mediated matrix device. + vfio_ap mediated device. * functions: @@ -383,45 +389,75 @@ matrix device. * Store the reference to the KVM structure for the guest using the mdev * Store the AP matrix configuration for the adapters, domains, and control domains assigned via the corresponding sysfs attributes files + * Store the AP matrix configuration for the adapters, domains and control + domains available to a guest. A guest may not be provided access to APQNs + referencing queue devices that do not exist, or are not bound to the + vfio_ap device driver. remove: - deallocates the mediated matrix device's ap_matrix_mdev structure. This will - be allowed only if a running guest is not using the mdev. + deallocates the vfio_ap mediated device's ap_matrix_mdev structure. + This will be allowed only if a running guest is not using the mdev. * callback interfaces - open: + open_device: The vfio_ap driver uses this callback to register a - VFIO_GROUP_NOTIFY_SET_KVM notifier callback function for the mdev matrix - device. The open is invoked when QEMU connects the VFIO iommu group - for the mdev matrix device to the MDEV bus. Access to the KVM structure used - to configure the KVM guest is provided via this callback. The KVM structure, - is used to configure the guest's access to the AP matrix defined via the - mediated matrix device's sysfs attribute files. - release: + VFIO_GROUP_NOTIFY_SET_KVM notifier callback function for the matrix mdev + devices. The open_device callback is invoked by userspace to connect the + VFIO iommu group for the matrix mdev device to the MDEV bus. Access to the + KVM structure used to configure the KVM guest is provided via this callback. + The KVM structure, is used to configure the guest's access to the AP matrix + defined via the vfio_ap mediated device's sysfs attribute files. + + close_device: unregisters the VFIO_GROUP_NOTIFY_SET_KVM notifier callback function for the - mdev matrix device and deconfigures the guest's AP matrix. + matrix mdev device and deconfigures the guest's AP matrix. -Configure the APM, AQM and ADM in the CRYCB -------------------------------------------- -Configuring the AP matrix for a KVM guest will be performed when the + ioctl: + this callback handles the VFIO_DEVICE_GET_INFO and VFIO_DEVICE_RESET ioctls + defined by the vfio framework. + +Configure the guest's AP resources +---------------------------------- +Configuring the AP resources for a KVM guest will be performed when the VFIO_GROUP_NOTIFY_SET_KVM notifier callback is invoked. The notifier -function is called when QEMU connects to KVM. The guest's AP matrix is -configured via it's CRYCB by: +function is called when userspace connects to KVM. The guest's AP resources are +configured via it's APCB by: * Setting the bits in the APM corresponding to the APIDs assigned to the - mediated matrix device via its 'assign_adapter' interface. + vfio_ap mediated device via its 'assign_adapter' interface. * Setting the bits in the AQM corresponding to the domains assigned to the - mediated matrix device via its 'assign_domain' interface. + vfio_ap mediated device via its 'assign_domain' interface. * Setting the bits in the ADM corresponding to the domain dIDs assigned to the - mediated matrix device via its 'assign_control_domains' interface. + vfio_ap mediated device via its 'assign_control_domains' interface. + +The linux device model precludes passing a device through to a KVM guest that +is not bound to the device driver facilitating its pass-through. Consequently, +an APQN that does not reference a queue device bound to the vfio_ap device +driver will not be assigned to a KVM guest's matrix. The AP architecture, +however, does not provide a means to filter individual APQNs from the guest's +matrix, so the adapters, domains and control domains assigned to vfio_ap +mediated device via its sysfs 'assign_adapter', 'assign_domain' and +'assign_control_domain' interfaces will be filtered before providing the AP +configuration to a guest: + +* The APIDs of the adapters, the APQIs of the domains and the domain numbers of + the control domains assigned to the matrix mdev that are not also assigned to + the host's AP configuration will be filtered. + +* Each APQN derived from the Cartesian product of the APIDs and APQIs assigned + to the vfio_ap mdev is examined and if any one of them does not reference a + queue device bound to the vfio_ap device driver, the adapter will not be + plugged into the guest (i.e., the bit corresponding to its APID will not be + set in the APM of the guest's APCB). The CPU model features for AP ----------------------------- -The AP stack relies on the presence of the AP instructions as well as two -facilities: The AP Facilities Test (APFT) facility; and the AP Query -Configuration Information (QCI) facility. These features/facilities are made -available to a KVM guest via the following CPU model features: +The AP stack relies on the presence of the AP instructions as well as three +facilities: The AP Facilities Test (APFT) facility; the AP Query +Configuration Information (QCI) facility; and the AP Queue Interruption Control +facility. These features/facilities are made available to a KVM guest via the +following CPU model features: 1. ap: Indicates whether the AP instructions are installed on the guest. This feature will be enabled by KVM only if the AP instructions are installed @@ -435,24 +471,28 @@ available to a KVM guest via the following CPU model features: can be made available to the guest only if it is available on the host (i.e., facility bit 12 is set). +4. apqi: Indicates AP Queue Interruption Control faclity is available on the + guest. This facility can be made available to the guest only if it is + available on the host (i.e., facility bit 65 is set). + Note: If the user chooses to specify a CPU model different than the 'host' model to QEMU, the CPU model features and facilities need to be turned on explicitly; for example:: - /usr/bin/qemu-system-s390x ... -cpu z13,ap=on,apqci=on,apft=on + /usr/bin/qemu-system-s390x ... -cpu z13,ap=on,apqci=on,apft=on,apqi=on A guest can be precluded from using AP features/facilities by turning them off explicitly; for example:: - /usr/bin/qemu-system-s390x ... -cpu host,ap=off,apqci=off,apft=off + /usr/bin/qemu-system-s390x ... -cpu host,ap=off,apqci=off,apft=off,apqi=off Note: If the APFT facility is turned off (apft=off) for the guest, the guest -will not see any AP devices. The zcrypt device drivers that register for type 10 -and newer AP devices - i.e., the cex4card and cex4queue device drivers - need -the APFT facility to ascertain the facilities installed on a given AP device. If -the APFT facility is not installed on the guest, then the probe of device -drivers will fail since only type 10 and newer devices can be configured for -guest use. +will not see any AP devices. The zcrypt device drivers on the guest that +register for type 10 and newer AP devices - i.e., the cex4card and cex4queue +device drivers - need the APFT facility to ascertain the facilities installed on +a given AP device. If the APFT facility is not installed on the guest, then no +adapter or domain devices will get created by the AP bus running on the +guest because only type 10 and newer devices can be configured for guest use. Example ======= @@ -471,7 +511,7 @@ CARD.DOMAIN TYPE MODE 05.00ab CEX5C CCA-Coproc 06 CEX5A Accelerator 06.0004 CEX5A Accelerator -06.00ab CEX5C CCA-Coproc +06.00ab CEX5A Accelerator =========== ===== ============ Guest2 @@ -479,9 +519,9 @@ Guest2 =========== ===== ============ CARD.DOMAIN TYPE MODE =========== ===== ============ -05 CEX5A Accelerator -05.0047 CEX5A Accelerator -05.00ff CEX5A Accelerator +05 CEX5C CCA-Coproc +05.0047 CEX5C CCA-Coproc +05.00ff CEX5C CCA-Coproc =========== ===== ============ Guest3 @@ -529,40 +569,56 @@ These are the steps: 2. Secure the AP queues to be used by the three guests so that the host can not access them. To secure them, there are two sysfs files that specify - bitmasks marking a subset of the APQN range as 'usable by the default AP - queue device drivers' or 'not usable by the default device drivers' and thus - available for use by the vfio_ap device driver'. The location of the sysfs - files containing the masks are:: + bitmasks marking a subset of the APQN range as usable only by the default AP + queue device drivers. All remaining APQNs are available for use by + any other device driver. The vfio_ap device driver is currently the only + non-default device driver. The location of the sysfs files containing the + masks are:: /sys/bus/ap/apmask /sys/bus/ap/aqmask The 'apmask' is a 256-bit mask that identifies a set of AP adapter IDs - (APID). Each bit in the mask, from left to right (i.e., from most significant - to least significant bit in big endian order), corresponds to an APID from - 0-255. If a bit is set, the APID is marked as usable only by the default AP - queue device drivers; otherwise, the APID is usable by the vfio_ap - device driver. + (APID). Each bit in the mask, from left to right, corresponds to an APID from + 0-255. If a bit is set, the APID belongs to the subset of APQNs marked as + available only to the default AP queue device drivers. The 'aqmask' is a 256-bit mask that identifies a set of AP queue indexes - (APQI). Each bit in the mask, from left to right (i.e., from most significant - to least significant bit in big endian order), corresponds to an APQI from - 0-255. If a bit is set, the APQI is marked as usable only by the default AP - queue device drivers; otherwise, the APQI is usable by the vfio_ap device - driver. + (APQI). Each bit in the mask, from left to right, corresponds to an APQI from + 0-255. If a bit is set, the APQI belongs to the subset of APQNs marked as + available only to the default AP queue device drivers. + + The Cartesian product of the APIDs corresponding to the bits set in the + apmask and the APQIs corresponding to the bits set in the aqmask comprise + the subset of APQNs that can be used only by the host default device drivers. + All other APQNs are available to the non-default device drivers such as the + vfio_ap driver. + + Take, for example, the following masks:: + + apmask: + 0x7d00000000000000000000000000000000000000000000000000000000000000 - Take, for example, the following mask:: + aqmask: + 0x8000000000000000000000000000000000000000000000000000000000000000 - 0x7dffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff + The masks indicate: - It indicates: + * Adapters 1, 2, 3, 4, 5, and 7 are available for use by the host default + device drivers. - 1, 2, 3, 4, 5, and 7-255 belong to the default drivers' pool, and 0 and 6 - belong to the vfio_ap device driver's pool. + * Domain 0 is available for use by the host default device drivers + + * The subset of APQNs available for use only by the default host device + drivers are: + + (1,0), (2,0), (3,0), (4.0), (5,0) and (7,0) + + * All other APQNs are available for use by the non-default device drivers. The APQN of each AP queue device assigned to the linux host is checked by the - AP bus against the set of APQNs derived from the cross product of APIDs - and APQIs marked as usable only by the default AP queue device drivers. If a + AP bus against the set of APQNs derived from the Cartesian product of APIDs + and APQIs marked as available to the default AP queue device drivers. If a match is detected, only the default AP queue device drivers will be probed; otherwise, the vfio_ap device driver will be probed. @@ -579,8 +635,7 @@ These are the steps: 0x4100000000000000000000000000000000000000000000000000000000000000 - Keep in mind that the mask reads from left to right (i.e., most - significant to least significant bit in big endian order), so the mask + Keep in mind that the mask reads from left to right, so the mask above identifies device numbers 1 and 7 (01000001). If the string is longer than the mask, the operation is terminated with @@ -626,11 +681,22 @@ These are the steps: default drivers pool: adapter 0-15, domain 1 alternate drivers pool: adapter 16-255, domains 0, 2-255 + Note ***: + Changing a mask such that one or more APQNs will be taken from a vfio_ap + mediated device (see below) will fail with an error (EBUSY). A message + is logged to the kernel ring buffer which can be viewed with the 'dmesg' + command. The output identifies each APQN flagged as 'in use' and identifies + the vfio_ap mediated device to which it is assigned; for example: + + Userspace may not re-assign queue 05.0054 already assigned to 62177883-f1bb-47f0-914d-32a22e3a8804 + Userspace may not re-assign queue 04.0054 already assigned to cef03c3c-903d-4ecc-9a83-40694cb8aee4 + Securing the APQNs for our example ---------------------------------- To secure the AP queues 05.0004, 05.0047, 05.00ab, 05.00ff, 06.0004, 06.0047, 06.00ab, and 06.00ff for use by the vfio_ap device driver, the corresponding - APQNs can either be removed from the default masks:: + APQNs can be removed from the default masks using either of the following + commands:: echo -5,-6 > /sys/bus/ap/apmask @@ -683,7 +749,7 @@ Securing the APQNs for our example /sys/devices/vfio_ap/matrix/ --- [mdev_supported_types] - ------ [vfio_ap-passthrough] (passthrough mediated matrix device type) + ------ [vfio_ap-passthrough] (passthrough vfio_ap mediated device type) --------- create --------- [devices] @@ -734,6 +800,9 @@ Securing the APQNs for our example ----------------unassign_control_domain ----------------unassign_domain + Note *****: The vfio_ap mdevs do not persist across reboots unless the + mdevctl tool is used to create and persist them. + 4. The administrator now needs to configure the matrixes for the mediated devices $uuid1 (for Guest1), $uuid2 (for Guest2) and $uuid3 (for Guest3). @@ -755,6 +824,10 @@ Securing the APQNs for our example cat matrix + To display the matrix that is or will be assigned to Guest1:: + + cat guest_matrix + This is how the matrix is configured for Guest2:: echo 5 > assign_adapter @@ -774,17 +847,24 @@ Securing the APQNs for our example higher than the maximum is specified, the operation will terminate with an error (ENODEV). - * All APQNs that can be derived from the adapter ID and the IDs of - the previously assigned domains must be bound to the vfio_ap device - driver. If no domains have yet been assigned, then there must be at least - one APQN with the specified APID bound to the vfio_ap driver. If no such - APQNs are bound to the driver, the operation will terminate with an - error (EADDRNOTAVAIL). + Note: The maximum adapter number can be obtained via the sysfs + /sys/bus/ap/ap_max_adapter_id attribute file. + + * Each APQN derived from the Cartesian product of the APID of the adapter + being assigned and the APQIs of the domains previously assigned: - No APQN that can be derived from the adapter ID and the IDs of the - previously assigned domains can be assigned to another mediated matrix - device. If an APQN is assigned to another mediated matrix device, the - operation will terminate with an error (EADDRINUSE). + - Must only be available to the vfio_ap device driver as specified in the + sysfs /sys/bus/ap/apmask and /sys/bus/ap/aqmask attribute files. If even + one APQN is reserved for use by the host device driver, the operation + will terminate with an error (EADDRNOTAVAIL). + + - Must NOT be assigned to another vfio_ap mediated device. If even one APQN + is assigned to another vfio_ap mediated device, the operation will + terminate with an error (EBUSY). + + - Must NOT be assigned while the sysfs /sys/bus/ap/apmask and + sys/bus/ap/aqmask attribute files are being edited or the operation may + terminate with an error (EBUSY). In order to successfully assign a domain: @@ -793,41 +873,50 @@ Securing the APQNs for our example higher than the maximum is specified, the operation will terminate with an error (ENODEV). - * All APQNs that can be derived from the domain ID and the IDs of - the previously assigned adapters must be bound to the vfio_ap device - driver. If no domains have yet been assigned, then there must be at least - one APQN with the specified APQI bound to the vfio_ap driver. If no such - APQNs are bound to the driver, the operation will terminate with an - error (EADDRNOTAVAIL). + Note: The maximum domain number can be obtained via the sysfs + /sys/bus/ap/ap_max_domain_id attribute file. + + * Each APQN derived from the Cartesian product of the APQI of the domain + being assigned and the APIDs of the adapters previously assigned: - No APQN that can be derived from the domain ID and the IDs of the - previously assigned adapters can be assigned to another mediated matrix - device. If an APQN is assigned to another mediated matrix device, the - operation will terminate with an error (EADDRINUSE). + - Must only be available to the vfio_ap device driver as specified in the + sysfs /sys/bus/ap/apmask and /sys/bus/ap/aqmask attribute files. If even + one APQN is reserved for use by the host device driver, the operation + will terminate with an error (EADDRNOTAVAIL). - In order to successfully assign a control domain, the domain number - specified must represent a value from 0 up to the maximum domain number - configured for the system. If a control domain number higher than the maximum - is specified, the operation will terminate with an error (ENODEV). + - Must NOT be assigned to another vfio_ap mediated device. If even one APQN + is assigned to another vfio_ap mediated device, the operation will + terminate with an error (EBUSY). + + - Must NOT be assigned while the sysfs /sys/bus/ap/apmask and + sys/bus/ap/aqmask attribute files are being edited or the operation may + terminate with an error (EBUSY). + + In order to successfully assign a control domain: + + * The domain number specified must represent a value from 0 up to the maximum + domain number configured for the system. If a control domain number higher + than the maximum is specified, the operation will terminate with an + error (ENODEV). 5. Start Guest1:: - /usr/bin/qemu-system-s390x ... -cpu host,ap=on,apqci=on,apft=on \ + /usr/bin/qemu-system-s390x ... -cpu host,ap=on,apqci=on,apft=on,apqi=on \ -device vfio-ap,sysfsdev=/sys/devices/vfio_ap/matrix/$uuid1 ... 7. Start Guest2:: - /usr/bin/qemu-system-s390x ... -cpu host,ap=on,apqci=on,apft=on \ + /usr/bin/qemu-system-s390x ... -cpu host,ap=on,apqci=on,apft=on,apqi=on \ -device vfio-ap,sysfsdev=/sys/devices/vfio_ap/matrix/$uuid2 ... 7. Start Guest3:: - /usr/bin/qemu-system-s390x ... -cpu host,ap=on,apqci=on,apft=on \ + /usr/bin/qemu-system-s390x ... -cpu host,ap=on,apqci=on,apft=on,apqi=on \ -device vfio-ap,sysfsdev=/sys/devices/vfio_ap/matrix/$uuid3 ... -When the guest is shut down, the mediated matrix devices may be removed. +When the guest is shut down, the vfio_ap mediated devices may be removed. -Using our example again, to remove the mediated matrix device $uuid1:: +Using our example again, to remove the vfio_ap mediated device $uuid1:: /sys/devices/vfio_ap/matrix/ --- [mdev_supported_types] @@ -840,26 +929,137 @@ Using our example again, to remove the mediated matrix device $uuid1:: echo 1 > remove -This will remove all of the mdev matrix device's sysfs structures including -the mdev device itself. To recreate and reconfigure the mdev matrix device, +This will remove all of the matrix mdev device's sysfs structures including +the mdev device itself. To recreate and reconfigure the matrix mdev device, all of the steps starting with step 3 will have to be performed again. Note -that the remove will fail if a guest using the mdev is still running. +that the remove will fail if a guest using the vfio_ap mdev is still running. -It is not necessary to remove an mdev matrix device, but one may want to +It is not necessary to remove a vfio_ap mdev, but one may want to remove it if no guest will use it during the remaining lifetime of the linux -host. If the mdev matrix device is removed, one may want to also reconfigure +host. If the vfio_ap mdev is removed, one may want to also reconfigure the pool of adapters and queues reserved for use by the default drivers. +Hot plug/unplug support: +================ +An adapter, domain or control domain may be hot plugged into a running KVM +guest by assigning it to the vfio_ap mediated device being used by the guest if +the following conditions are met: + +* The adapter, domain or control domain must also be assigned to the host's + AP configuration. + +* Each APQN derived from the Cartesian product comprised of the APID of the + adapter being assigned and the APQIs of the domains assigned must reference a + queue device bound to the vfio_ap device driver. + +* To hot plug a domain, each APQN derived from the Cartesian product + comprised of the APQI of the domain being assigned and the APIDs of the + adapters assigned must reference a queue device bound to the vfio_ap device + driver. + +An adapter, domain or control domain may be hot unplugged from a running KVM +guest by unassigning it from the vfio_ap mediated device being used by the +guest. + +Over-provisioning of AP queues for a KVM guest: +============================================== +Over-provisioning is defined herein as the assignment of adapters or domains to +a vfio_ap mediated device that do not reference AP devices in the host's AP +configuration. The idea here is that when the adapter or domain becomes +available, it will be automatically hot-plugged into the KVM guest using +the vfio_ap mediated device to which it is assigned as long as each new APQN +resulting from plugging it in references a queue device bound to the vfio_ap +device driver. + Limitations =========== -* The KVM/kernel interfaces do not provide a way to prevent restoring an APQN - to the default drivers pool of a queue that is still assigned to a mediated - device in use by a guest. It is incumbent upon the administrator to - ensure there is no mediated device in use by a guest to which the APQN is - assigned lest the host be given access to the private data of the AP queue - device such as a private key configured specifically for the guest. +Live guest migration is not supported for guests using AP devices without +intervention by a system administrator. Before a KVM guest can be migrated, +the vfio_ap mediated device must be removed. Unfortunately, it can not be +removed manually (i.e., echo 1 > /sys/devices/vfio_ap/matrix/$UUID/remove) while +the mdev is in use by a KVM guest. If the guest is being emulated by QEMU, +its mdev can be hot unplugged from the guest in one of two ways: + +1. If the KVM guest was started with libvirt, you can hot unplug the mdev via + the following commands: + + virsh detach-device + + For example, to hot unplug mdev 62177883-f1bb-47f0-914d-32a22e3a8804 from + the guest named 'my-guest': + + virsh detach-device my-guest ~/config/my-guest-hostdev.xml + + The contents of my-guest-hostdev.xml: + + + +
+ + + + + virsh qemu-monitor-command --hmp "device-del " + + For example, to hot unplug the vfio_ap mediated device identified on the + qemu command line with 'id=hostdev0' from the guest named 'my-guest': + + virsh qemu-monitor-command my-guest --hmp "device_del hostdev0" + +2. A vfio_ap mediated device can be hot unplugged by attaching the qemu monitor + to the guest and using the following qemu monitor command: + + (QEMU) device-del id= + + For example, to hot unplug the vfio_ap mediated device that was specified + on the qemu command line with 'id=hostdev0' when the guest was started: + + (QEMU) device-del id=hostdev0 + +After live migration of the KVM guest completes, an AP configuration can be +restored to the KVM guest by hot plugging a vfio_ap mediated device on the target +system into the guest in one of two ways: + +1. If the KVM guest was started with libvirt, you can hot plug a matrix mediated + device into the guest via the following virsh commands: + + virsh attach-device + + For example, to hot plug mdev 62177883-f1bb-47f0-914d-32a22e3a8804 into + the guest named 'my-guest': + + virsh attach-device my-guest ~/config/my-guest-hostdev.xml + + The contents of my-guest-hostdev.xml: + + + +
+ + + + + virsh qemu-monitor-command --hmp \ + "device_add vfio-ap,sysfsdev=,id=" + + For example, to hot plug the vfio_ap mediated device + 62177883-f1bb-47f0-914d-32a22e3a8804 into the guest named 'my-guest' with + device-id hostdev0: + + virsh qemu-monitor-command my-guest --hmp \ + "device_add vfio-ap,\ + sysfsdev=/sys/devices/vfio_ap/matrix/62177883-f1bb-47f0-914d-32a22e3a8804,\ + id=hostdev0" + +2. A vfio_ap mediated device can be hot plugged by attaching the qemu monitor + to the guest and using the following qemu monitor command: + + (qemu) device_add "vfio-ap,sysfsdev=,id=" -* Dynamically modifying the AP matrix for a running guest (which would amount to - hot(un)plug of AP devices for the guest) is currently not supported + For example, to plug the vfio_ap mediated device + 62177883-f1bb-47f0-914d-32a22e3a8804 into the guest with the device-id + hostdev0: -* Live guest migration is not supported for guests using AP devices. + (QEMU) device-add "vfio-ap,\ + sysfsdev=/sys/devices/vfio_ap/matrix/62177883-f1bb-47f0-914d-32a22e3a8804,\ + id=hostdev0"