From patchwork Tue Nov 24 21:40:06 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anthony Krowiak X-Patchwork-Id: 11929901 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 488D2C64E7D for ; Tue, 24 Nov 2020 21:40:55 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 0B3D52083E for ; Tue, 24 Nov 2020 21:40:55 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=ibm.com header.i=@ibm.com header.b="QjE86BMV" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1733114AbgKXVkk (ORCPT ); Tue, 24 Nov 2020 16:40:40 -0500 Received: from mx0a-001b2d01.pphosted.com ([148.163.156.1]:39480 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1732876AbgKXVkj (ORCPT ); Tue, 24 Nov 2020 16:40:39 -0500 Received: from pps.filterd (m0098394.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.42/8.16.0.42) with SMTP id 0AOLXpjc059118; Tue, 24 Nov 2020 16:40:37 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pp1; bh=tZBLckoTsEXuuGE5/NNQ2q339Xl2PF7i5JIVmizJzZg=; b=QjE86BMVqyJSQaa/TCaPe5Q/BE041w7Zyd0LzahrN4y3FKvPy1oURxsOtC+vRbv9AG53 MVpBqrhiYtNlp3yE01AaX9qSxPMhHrXG5dwADt/t0Mqpq9jv87vcmvnzXm4FDb9TE+Ry DpLX+SJPrQN7IFDuxne5ZZNiHQ0OrjWerAton/6FemqmE3TZTm0WlDCzN149obohGSiH 0AxUJtGOMGELD9PZWGo/+5RD6PSf+6yh4IZaJjdymWhRpS8g2KbTyQx63k9pXMMu3pg6 msLLbihSMnlQYX2oGLmZ8uanucx01of0RAeT3lU2dJWUm/0WpNMugyi5cUdEMsY+FQnJ ng== Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com with ESMTP id 350ga3fwww-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 24 Nov 2020 16:40:37 -0500 Received: from m0098394.ppops.net (m0098394.ppops.net [127.0.0.1]) by pps.reinject (8.16.0.36/8.16.0.36) with SMTP id 0AOLZ4pG067325; Tue, 24 Nov 2020 16:40:36 -0500 Received: from ppma03dal.us.ibm.com (b.bd.3ea9.ip4.static.sl-reverse.com [169.62.189.11]) by mx0a-001b2d01.pphosted.com with ESMTP id 350ga3fwwe-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 24 Nov 2020 16:40:36 -0500 Received: from pps.filterd (ppma03dal.us.ibm.com [127.0.0.1]) by ppma03dal.us.ibm.com (8.16.0.42/8.16.0.42) with SMTP id 0AOLc6B4032740; Tue, 24 Nov 2020 21:40:35 GMT Received: from b01cxnp22033.gho.pok.ibm.com (b01cxnp22033.gho.pok.ibm.com [9.57.198.23]) by ppma03dal.us.ibm.com with ESMTP id 34xth9hb02-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 24 Nov 2020 21:40:35 +0000 Received: from b01ledav005.gho.pok.ibm.com (b01ledav005.gho.pok.ibm.com [9.57.199.110]) by b01cxnp22033.gho.pok.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 0AOLeYwI2884152 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Tue, 24 Nov 2020 21:40:34 GMT Received: from b01ledav005.gho.pok.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id ED5BAAE05F; Tue, 24 Nov 2020 21:40:33 +0000 (GMT) Received: from b01ledav005.gho.pok.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 287BDAE05C; Tue, 24 Nov 2020 21:40:33 +0000 (GMT) Received: from cpe-66-24-58-13.stny.res.rr.com.com (unknown [9.85.195.249]) by b01ledav005.gho.pok.ibm.com (Postfix) with ESMTP; Tue, 24 Nov 2020 21:40:33 +0000 (GMT) From: Tony Krowiak To: linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: freude@linux.ibm.com, borntraeger@de.ibm.com, cohuck@redhat.com, mjrosato@linux.ibm.com, pasic@linux.ibm.com, alex.williamson@redhat.com, kwankhede@nvidia.com, fiuczy@linux.ibm.com, frankja@linux.ibm.com, david@redhat.com, hca@linux.ibm.com, gor@linux.ibm.com, Tony Krowiak Subject: [PATCH v12 07/17] s390/vfio-ap: implement in-use callback for vfio_ap driver Date: Tue, 24 Nov 2020 16:40:06 -0500 Message-Id: <20201124214016.3013-8-akrowiak@linux.ibm.com> X-Mailer: git-send-email 2.21.1 In-Reply-To: <20201124214016.3013-1-akrowiak@linux.ibm.com> References: <20201124214016.3013-1-akrowiak@linux.ibm.com> MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.312,18.0.737 definitions=2020-11-24_08:2020-11-24,2020-11-24 signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 lowpriorityscore=0 malwarescore=0 impostorscore=0 suspectscore=3 mlxscore=0 phishscore=0 priorityscore=1501 clxscore=1015 adultscore=0 spamscore=0 bulkscore=0 mlxlogscore=999 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2009150000 definitions=main-2011240125 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Let's implement the callback to indicate when an APQN is in use by the vfio_ap device driver. The callback is invoked whenever a change to the apmask or aqmask would result in one or more queue devices being removed from the driver. The vfio_ap device driver will indicate a resource is in use if the APQN of any of the queue devices to be removed are assigned to any of the matrix mdevs under the driver's control. There is potential for a deadlock condition between the matrix_dev->lock used to lock the matrix device during assignment of adapters and domains and the ap_perms_mutex locked by the AP bus when changes are made to the sysfs apmask/aqmask attributes. Consider following scenario (courtesy of Halil Pasic): 1) apmask_store() takes ap_perms_mutex 2) assign_adapter_store() takes matrix_dev->lock 3) apmask_store() calls vfio_ap_mdev_resource_in_use() which tries to take matrix_dev->lock 4) assign_adapter_store() calls ap_apqn_in_matrix_owned_by_def_drv which tries to take ap_perms_mutex BANG! To resolve this issue, instead of using the mutex_lock(&matrix_dev->lock) function to lock the matrix device during assignment of an adapter or domain to a matrix_mdev as well as during the in_use callback, the mutex_trylock(&matrix_dev->lock) function will be used. If the lock is not obtained, then the assignment and in_use functions will terminate with -EBUSY. Signed-off-by: Tony Krowiak --- drivers/s390/crypto/vfio_ap_drv.c | 1 + drivers/s390/crypto/vfio_ap_ops.c | 96 +++++++++++++++++++-------- drivers/s390/crypto/vfio_ap_private.h | 2 + 3 files changed, 71 insertions(+), 28 deletions(-) diff --git a/drivers/s390/crypto/vfio_ap_drv.c b/drivers/s390/crypto/vfio_ap_drv.c index 73bd073fd5d3..8934471b7944 100644 --- a/drivers/s390/crypto/vfio_ap_drv.c +++ b/drivers/s390/crypto/vfio_ap_drv.c @@ -147,6 +147,7 @@ static int __init vfio_ap_init(void) memset(&vfio_ap_drv, 0, sizeof(vfio_ap_drv)); vfio_ap_drv.probe = vfio_ap_mdev_probe_queue; vfio_ap_drv.remove = vfio_ap_mdev_remove_queue; + vfio_ap_drv.in_use = vfio_ap_mdev_resource_in_use; vfio_ap_drv.ids = ap_queue_ids; ret = ap_driver_register(&vfio_ap_drv, THIS_MODULE, VFIO_AP_DRV_NAME); diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c index 07caf871943c..3c2479d7e674 100644 --- a/drivers/s390/crypto/vfio_ap_ops.c +++ b/drivers/s390/crypto/vfio_ap_ops.c @@ -520,18 +520,40 @@ vfio_ap_mdev_verify_queues_reserved_for_apid(struct ap_matrix_mdev *matrix_mdev, return 0; } +#define MDEV_SHARING_ERR "Userspace may not re-assign queue %02lx.%04lx " \ + "already assigned to %s" + +static void vfio_ap_mdev_log_sharing_err(const char *mdev_name, + unsigned long *apm, + unsigned long *aqm) +{ + unsigned long apid, apqi; + + for_each_set_bit_inv(apid, apm, AP_DEVICES) + for_each_set_bit_inv(apqi, aqm, AP_DOMAINS) + pr_warn(MDEV_SHARING_ERR, apid, apqi, mdev_name); +} + /** * vfio_ap_mdev_verify_no_sharing * - * Verifies that the APQNs derived from the cross product of the AP adapter IDs - * and AP queue indexes comprising the AP matrix are not configured for another + * Verifies that each APQN derived from the cross product of the AP adapter IDs + * and AP queue indexes comprising an AP matrix is not assigned to a * mediated device. AP queue sharing is not allowed. * - * @matrix_mdev: the mediated matrix device + * @matrix_mdev: the mediated matrix device to which the APQNs being verified + * are assigned. If the value is not NULL, then verification will + * proceed for all other matrix mediated devices; otherwise, all + * matrix mediated devices will be verified. + * @mdev_apm: mask indicating the APIDs of the APQNs to be verified + * @mdev_aqm: mask indicating the APQIs of the APQNs to be verified * - * Returns 0 if the APQNs are not shared, otherwise; returns -EADDRINUSE. + * Returns 0 if no APQNs are not shared, otherwise; returns -EBUSY if one + * or more APQNs are shared. */ -static int vfio_ap_mdev_verify_no_sharing(struct ap_matrix_mdev *matrix_mdev) +static int vfio_ap_mdev_verify_no_sharing(struct ap_matrix_mdev *matrix_mdev, + unsigned long *mdev_apm, + unsigned long *mdev_aqm) { struct ap_matrix_mdev *lstdev; DECLARE_BITMAP(apm, AP_DEVICES); @@ -548,15 +570,16 @@ static int vfio_ap_mdev_verify_no_sharing(struct ap_matrix_mdev *matrix_mdev) * We work on full longs, as we can only exclude the leftover * bits in non-inverse order. The leftover is all zeros. */ - if (!bitmap_and(apm, matrix_mdev->matrix.apm, - lstdev->matrix.apm, AP_DEVICES)) + if (!bitmap_and(apm, mdev_apm, lstdev->matrix.apm, AP_DEVICES)) continue; - if (!bitmap_and(aqm, matrix_mdev->matrix.aqm, - lstdev->matrix.aqm, AP_DOMAINS)) + if (!bitmap_and(aqm, mdev_aqm, lstdev->matrix.aqm, AP_DOMAINS)) continue; - return -EADDRINUSE; + vfio_ap_mdev_log_sharing_err(dev_name(mdev_dev(lstdev->mdev)), + apm, aqm); + + return -EBUSY; } return 0; @@ -670,10 +693,10 @@ static void vfio_ap_mdev_manage_qlinks(struct ap_matrix_mdev *matrix_mdev, * driver; or, if no APQIs have yet been assigned, the APID is not * contained in an APQN bound to the vfio_ap device driver. * - * 4. -EADDRINUSE + * 4. -EBUSY * An APQN derived from the cross product of the APID being assigned * and the APQIs previously assigned is being used by another mediated - * matrix device + * matrix device, or the matrix_dev-> lock could not be acquired. */ static ssize_t assign_adapter_store(struct device *dev, struct device_attribute *attr, @@ -681,6 +704,7 @@ static ssize_t assign_adapter_store(struct device *dev, { int ret; unsigned long apid; + DECLARE_BITMAP(apm, AP_DEVICES); struct mdev_device *mdev = mdev_from_dev(dev); struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev); @@ -700,24 +724,25 @@ static ssize_t assign_adapter_store(struct device *dev, * number (APID). The bits in the mask, from most significant to least * significant bit, correspond to APIDs 0-255. */ - mutex_lock(&matrix_dev->lock); + if (!mutex_trylock(&matrix_dev->lock)) + return -EBUSY; ret = vfio_ap_mdev_verify_queues_reserved_for_apid(matrix_mdev, apid); if (ret) goto done; - set_bit_inv(apid, matrix_mdev->matrix.apm); + memset(apm, 0, sizeof(apm)); + set_bit_inv(apid, apm); - ret = vfio_ap_mdev_verify_no_sharing(matrix_mdev); + ret = vfio_ap_mdev_verify_no_sharing(matrix_mdev, apm, + matrix_mdev->matrix.aqm); if (ret) - goto share_err; + goto done; + set_bit_inv(apid, matrix_mdev->matrix.apm); vfio_ap_mdev_manage_qlinks(matrix_mdev, LINK_APID, apid); ret = count; - goto done; -share_err: - clear_bit_inv(apid, matrix_mdev->matrix.apm); done: mutex_unlock(&matrix_dev->lock); @@ -821,7 +846,7 @@ vfio_ap_mdev_verify_queues_reserved_for_apqi(struct ap_matrix_mdev *matrix_mdev, * 4. -EADDRINUSE * An APQN derived from the cross product of the APQI being assigned * and the APIDs previously assigned is being used by another mediated - * matrix device + * matrix device, or the matrix_dev-> lock could not be acquired. */ static ssize_t assign_domain_store(struct device *dev, struct device_attribute *attr, @@ -829,6 +854,7 @@ static ssize_t assign_domain_store(struct device *dev, { int ret; unsigned long apqi; + DECLARE_BITMAP(aqm, AP_DOMAINS); struct mdev_device *mdev = mdev_from_dev(dev); struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev); unsigned long max_apqi = matrix_mdev->matrix.aqm_max; @@ -843,24 +869,25 @@ static ssize_t assign_domain_store(struct device *dev, if (apqi > max_apqi) return -ENODEV; - mutex_lock(&matrix_dev->lock); + if (!mutex_trylock(&matrix_dev->lock)) + return -EBUSY; ret = vfio_ap_mdev_verify_queues_reserved_for_apqi(matrix_mdev, apqi); if (ret) goto done; - set_bit_inv(apqi, matrix_mdev->matrix.aqm); + memset(aqm, 0, sizeof(aqm)); + set_bit_inv(apqi, aqm); - ret = vfio_ap_mdev_verify_no_sharing(matrix_mdev); + ret = vfio_ap_mdev_verify_no_sharing(matrix_mdev, + matrix_mdev->matrix.apm, aqm); if (ret) - goto share_err; + goto done; + set_bit_inv(apqi, matrix_mdev->matrix.aqm); vfio_ap_mdev_manage_qlinks(matrix_mdev, LINK_APQI, apqi); ret = count; - goto done; -share_err: - clear_bit_inv(apqi, matrix_mdev->matrix.aqm); done: mutex_unlock(&matrix_dev->lock); @@ -956,7 +983,8 @@ static ssize_t assign_control_domain_store(struct device *dev, * least significant, correspond to IDs 0 up to the one less than the * number of control domains that can be assigned. */ - mutex_lock(&matrix_dev->lock); + if (!mutex_trylock(&matrix_dev->lock)) + return -EBUSY; set_bit_inv(id, matrix_mdev->matrix.adm); mutex_unlock(&matrix_dev->lock); @@ -1457,3 +1485,15 @@ void vfio_ap_mdev_remove_queue(struct ap_device *apdev) kfree(q); mutex_unlock(&matrix_dev->lock); } + +int vfio_ap_mdev_resource_in_use(unsigned long *apm, unsigned long *aqm) +{ + int ret; + + if (!mutex_trylock(&matrix_dev->lock)) + return -EBUSY; + ret = vfio_ap_mdev_verify_no_sharing(NULL, apm, aqm); + mutex_unlock(&matrix_dev->lock); + + return ret; +} diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h index 4e5cc72fc0db..8b9b5255abfe 100644 --- a/drivers/s390/crypto/vfio_ap_private.h +++ b/drivers/s390/crypto/vfio_ap_private.h @@ -105,4 +105,6 @@ struct vfio_ap_queue { int vfio_ap_mdev_probe_queue(struct ap_device *queue); void vfio_ap_mdev_remove_queue(struct ap_device *queue); +int vfio_ap_mdev_resource_in_use(unsigned long *apm, unsigned long *aqm); + #endif /* _VFIO_AP_PRIVATE_H_ */