From patchwork Thu Oct 29 07:41:00 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: jackm
X-Patchwork-Id: 7516991
Date: Thu, 29 Oct 2015 09:41:00 +0200
From: Jack Morgenstein
To: Arnd Bergmann
Cc: tglx@linutronix.de, Roland Dreier, Eli Cohen, Yevgeny Petrilin,
 netdev@vger.kernel.org, linux-rdma@vger.kernel.org, matanb@mellanox.com,
 yishaih@mellanox.com, ogerlitz@mellanox.com, eranbe@mellanox.com,
 talal@mellanox.com
Subject: Re: [PATCH 02/25] IB/mthca, net/mlx4: remove counting semaphores
Message-ID: <20151029094100.4575e0c6@jpm-OptiPlex-GX620>
In-Reply-To: <1446000373-1823620-3-git-send-email-arnd@arndb.de>
References: <1446000373-1823620-1-git-send-email-arnd@arndb.de>
 <1446000373-1823620-3-git-send-email-arnd@arndb.de>
Organization: Mellanox
X-Mailer: Claws Mail 3.8.1 (GTK+ 2.24.17; x86_64-pc-linux-gnu)
Sender: linux-rdma-owner@vger.kernel.org
X-Mailing-List: linux-rdma@vger.kernel.org

On Wed, 28 Oct 2015 03:45:50 +0100
Arnd Bergmann wrote:

> As far as I can tell, there is a preexisting race condition
> regarding the cmd->use_events flag, which is not protected
> by any lock. When this flag is toggled while another command
> is being started, that command gets stuck until the mode is
> toggled back.

We fixed this issue in Mellanox OFED in a manner that allowed keeping
the same counting mechanism. IMHO, this is preferable to totally
changing the mechanism.

We will submit a similar patch to the upstream kernel shortly.

-Jack

net/mlx4: Switching between sending commands via polling and events may result in hung tasks

When switching between those methods of sending commands, it's possible
that a task will keep waiting for the polling semaphore, but may never
be able to acquire it. This is due to mlx4_cmd_use_events, which "down"s
the semaphore back to 0.

Reproducing it involves sending commands while changing between
mlx4_cmd_use_polling and mlx4_cmd_use_events.

Solve this by adding a read-write semaphore that is taken when
switching between modes.

issue: 402565
Change-Id: I19f0d40dbb327c49b39a9abbcb2bb002b0279b0b
Signed-off-by: Matan Barak
---
 drivers/net/ethernet/mellanox/mlx4/cmd.c  | 23 +++++++++++++++++------
 drivers/net/ethernet/mellanox/mlx4/mlx4.h |  2 ++
 2 files changed, 19 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/cmd.c b/drivers/net/ethernet/mellanox/mlx4/cmd.c
index def1338..f94a960 100644
--- a/drivers/net/ethernet/mellanox/mlx4/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx4/cmd.c
@@ -766,17 +766,23 @@ int __mlx4_cmd(struct mlx4_dev *dev, u64 in_param, u64 *out_param,
 		return mlx4_cmd_reset_flow(dev, op, op_modifier, -EIO);
 
 	if (!mlx4_is_mfunc(dev) || (native && mlx4_is_master(dev))) {
+		int ret;
+
 		if (dev->persist->state & MLX4_DEVICE_STATE_INTERNAL_ERROR)
 			return mlx4_internal_err_ret_value(dev, op,
 							  op_modifier);
+		down_read(&mlx4_priv(dev)->cmd.switch_sem);
 		if (mlx4_priv(dev)->cmd.use_events)
-			return mlx4_cmd_wait(dev, in_param, out_param,
-					     out_is_imm, in_modifier,
-					     op_modifier, op, timeout);
+			ret = mlx4_cmd_wait(dev, in_param, out_param,
+					    out_is_imm, in_modifier,
+					    op_modifier, op, timeout);
 		else
-			return mlx4_cmd_poll(dev, in_param, out_param,
-					     out_is_imm, in_modifier,
-					     op_modifier, op, timeout);
+			ret = mlx4_cmd_poll(dev, in_param, out_param,
+					    out_is_imm, in_modifier,
+					    op_modifier, op, timeout);
+
+		up_read(&mlx4_priv(dev)->cmd.switch_sem);
+		return ret;
 	}
 	return mlx4_slave_cmd(dev, in_param, out_param, out_is_imm,
 			      in_modifier, op_modifier, op, timeout);
@@ -2437,6 +2443,7 @@ int mlx4_cmd_init(struct mlx4_dev *dev)
 	int flags = 0;
 
 	if (!priv->cmd.initialized) {
+		init_rwsem(&priv->cmd.switch_sem);
 		mutex_init(&priv->cmd.slave_cmd_mutex);
 		sema_init(&priv->cmd.poll_sem, 1);
 		priv->cmd.use_events = 0;
@@ -2566,6 +2573,7 @@ int mlx4_cmd_use_events(struct mlx4_dev *dev)
 	if (!priv->cmd.context)
 		return -ENOMEM;
 
+	down_write(&priv->cmd.switch_sem);
 	for (i = 0; i < priv->cmd.max_cmds; ++i) {
 		priv->cmd.context[i].token = i;
 		priv->cmd.context[i].next = i + 1;
@@ -2590,6 +2598,7 @@ int mlx4_cmd_use_events(struct mlx4_dev *dev)
 
 	down(&priv->cmd.poll_sem);
 	priv->cmd.use_events = 1;
+	up_write(&priv->cmd.switch_sem);
 
 	return err;
 }
@@ -2602,6 +2611,7 @@ void mlx4_cmd_use_polling(struct mlx4_dev *dev)
 	struct mlx4_priv *priv = mlx4_priv(dev);
 	int i;
 
+	down_write(&priv->cmd.switch_sem);
 	priv->cmd.use_events = 0;
 
 	for (i = 0; i < priv->cmd.max_cmds; ++i)
@@ -2610,6 +2620,7 @@ void mlx4_cmd_use_polling(struct mlx4_dev *dev)
 	kfree(priv->cmd.context);
 
 	up(&priv->cmd.poll_sem);
+	up_write(&priv->cmd.switch_sem);
 }
 
 struct mlx4_cmd_mailbox *mlx4_alloc_cmd_mailbox(struct mlx4_dev *dev)
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4.h b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
index 6c58021..2f03e6e 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
@@ -45,6 +45,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 
@@ -626,6 +627,7 @@ struct mlx4_cmd {
 	struct mutex		slave_cmd_mutex;
 	struct semaphore	poll_sem;
 	struct semaphore	event_sem;
+	struct rw_semaphore	switch_sem;
 	int			max_cmds;
 	spinlock_t		context_lock;
 	int			free_head;