From patchwork Wed May 3 14:58:43 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Sagi Grimberg
X-Patchwork-Id: 9709959
Return-Path:
Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125])
	by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 628D960385
	for ; Wed, 3 May 2017 14:59:17 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1])
	by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5386526E56
	for ; Wed, 3 May 2017 14:59:17 +0000 (UTC)
Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486)
	id 4817A2862B; Wed, 3 May 2017 14:59:17 +0000 (UTC)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-6.4 required=2.0 tests=BAYES_00, RCVD_IN_DNSWL_HI,
	RCVD_IN_SORBS_SPAM autolearn=ham version=3.3.1
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C346B26E56
	for ; Wed, 3 May 2017 14:59:16 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754126AbdECO7O (ORCPT ); Wed, 3 May 2017 10:59:14 -0400
Received: from mail-wr0-f194.google.com ([209.85.128.194]:32890 "EHLO
	mail-wr0-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751275AbdECO7A (ORCPT ); Wed, 3 May 2017 10:59:00 -0400
Received: by mail-wr0-f194.google.com with SMTP id w50so23302549wrc.0
	for ; Wed, 03 May 2017 07:58:59 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=1e100.net; s=20161025;
	h=x-gm-message-state:subject:to:references:cc:from:message-id:date
	:user-agent:mime-version:in-reply-to:content-transfer-encoding;
	bh=4rP5IrragqM/dd2AHTAU3zgGyhm4nppU2so41M0Prz8=;
	b=AdUOutYcm1OJ2nO6VfWJzjiNM9lI1m5GqzglRNAIN2rV4FvrJVXL0UH6VTBORj2c5Q
	FDdTrAUeWTgUyNC8EiY4lIqLYSt6SyNW3XvYGIxI55LkwdLks8smZwOzRXUilopFT8+X
	fd9foNl4XWdyGVfA2jQEjj7PkWUQdnl9bFv5nsyfqZt0SVnZBXpJYaYynoEHtZNlDNMr
	2KnDYx50wLQzCEQC8G5+JNwFlsNnEV/JsNRPKkL81JlX/kO+sDuxgLVXP68IihIIhC7U
	cgTkgCOR7Tjk0GTV3dS1rKKgwqqGsPt6S1nW7meA86UVcZQo7UKQufxrFYnDs3NJOpfg
	TYMA==
X-Gm-Message-State: AN3rC/4f/Srmdm91zq/B3+e3bWhaZNqQbNTc7b5BRvAjahdL2nD+8ekg
	0F5dgvWPwSUZvQ==
X-Received: by 10.223.167.76 with SMTP id e12mr20429458wrd.177.1493823538686;
	Wed, 03 May 2017 07:58:58 -0700 (PDT)
Received: from [192.168.64.116] (bzq-82-81-101-184.red.bezeqint.net. [82.81.101.184])
	by smtp.gmail.com with ESMTPSA id i24sm13358747wrc.40.2017.05.03.07.58.51
	(version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
	Wed, 03 May 2017 07:58:57 -0700 (PDT)
Subject: Re: [PATCH, untested] mlx5: Avoid that mlx5_ib_sg_to_klms()
	overflows the klms[] array
To: Laurence Oberman
References: <8992bd28-667f-94b1-e582-106e6b41aa4b@sandisk.com>
	<20170425175849.GS14088@mtr-leonro.local>
	<438230391.2090966.1493152655709.JavaMail.zimbra@redhat.com>
	<20170426061640.GV14088@mtr-leonro.local>
	<501334895.4531615.1493820950718.JavaMail.zimbra@redhat.com>
Cc: Leon Romanovsky, Bart Van Assche, Doug Ledford, Max Gurtovoy,
	Israel Rukshin, linux-rdma@vger.kernel.org
From: Sagi Grimberg
Message-ID: <374fcc74-4b84-610b-b55e-d385563bef6f@grimberg.me>
Date: Wed, 3 May 2017 17:58:43 +0300
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.8.0
MIME-Version: 1.0
In-Reply-To: <501334895.4531615.1493820950718.JavaMail.zimbra@redhat.com>
Sender: linux-rdma-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-rdma@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

> Hello Sagi
> Against Bart's tree again
>
> a83e404 IB/srp: Reenable IB_MR_TYPE_SG_GAPS
> dfa5a2b mlx5: Avoid that mlx5_ib_sg_to_klms() overflows the klms[] array
> f759c80 mlx5: Fix mlx5_ib_map_mr_sg mr lengt
>
> Above are all in
> Added your most recent patch above
>
> Same behavior.
> [ 579.368733] scsi host1: ib_srp: failed RECV status WR flushed (5) for CQE ffff8817de9c57b0
> [ 579.369875] mlx5_1:dump_cqe:262:(pid 15140): dump error cqe
> [ 579.369877] 00000000 00000000 00000000 00000000
> [ 579.369877] 00000000 00000000 00000000 00000000
> [ 579.369878] 00000000 00000000 00000000 00000000
> [ 579.369878] 00000000 0f007806 2500002b 1c528dd0
> [ 579.369883] scsi host1: ib_srp: failed FAST REG status memory management operation error (6) for CQE ffff88179a460af8
> [ 594.814222] scsi host1: ib_srp: reconnect succeeded
> [ 594.916876] scsi host1: ib_srp: failed RECV status WR flushed (5) for CQE ffff8817e1d4a6b0
> [ 595.494532] mlx5_1:dump_cqe:262:(pid 15205): dump error cqe
> [ 595.525995] 00000000 00000000 00000000 00000000
> [ 595.552125] 00000000 00000000 00000000 00000000
> [ 595.578204] 00000000 00000000 00000000 00000000
> [ 595.603670] 00000000 0f007806 25000033 002d77d0
> ^C[ 610.821911] scsi host1: ib_srp: reconnect succeeded
> [ 610.933298] scsi host1: ib_srp: failed RECV status WR flushed (5) for CQE ffff8817e1d4a170
> [ 611.514234] mlx5_1:dump_cqe:262:(pid 15242): dump error cqe
> [ 611.543083] 00000000 00000000 00000000 00000000
> [ 611.568670] 00000000 00000000 00000000 00000000
> [ 611.594064] 00000000 00000000 00000000 00000000
> [ 611.620142] 00000000 0f007806 2500003b 003161d0
>
> I will capture the function traces with your patch applied and the additional logging asked for by Max.

Thanks, that would be helpful.

Can you try the following patch, just to see if there is an off-by-one case:
---
--
It's not a fix, but if it works it can give us a clue...
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index b8f9382a8b7d..3d6ef7bce7d9 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -1525,7 +1525,7 @@ struct ib_mr *mlx5_ib_alloc_mr(struct ib_pd *pd,
 {
 	struct mlx5_ib_dev *dev = to_mdev(pd->device);
 	int inlen = MLX5_ST_SZ_BYTES(create_mkey_in);
-	int ndescs = ALIGN(max_num_sg, 4);
+	int ndescs = ALIGN(max_num_sg + 1, 4);
 	struct mlx5_ib_mr *mr;
 	void *mkc;
 	u32 *in;
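
For anyone following along, the arithmetic behind the ndescs change can be checked outside the kernel. What follows is only a userspace sketch, not driver code: ALIGN_UP is a stand-in for the kernel's ALIGN() and assumes a power-of-two alignment, and the helper names ndescs_before(), ndescs_after() and show_ndescs() are made up for illustration.

```c
#include <stdio.h>

/*
 * Userspace stand-in for the kernel's ALIGN() macro: round x up to the
 * next multiple of a. Only valid when a is a power of two.
 */
#define ALIGN_UP(x, a) (((x) + ((a) - 1)) & ~((a) - 1))

/* Descriptor count as computed before the test patch (hypothetical helper). */
int ndescs_before(int max_num_sg)
{
	return ALIGN_UP(max_num_sg, 4);
}

/* Descriptor count with the "+ 1" from the test patch (hypothetical helper). */
int ndescs_after(int max_num_sg)
{
	return ALIGN_UP(max_num_sg + 1, 4);
}

/* Print both counts for one max_num_sg value chosen by the caller. */
void show_ndescs(int max_num_sg)
{
	printf("max_num_sg=%d before=%d after=%d\n",
	       max_num_sg, ndescs_before(max_num_sg),
	       ndescs_after(max_num_sg));
}
```

With these helpers, max_num_sg = 3 yields 4 descriptors either way, while max_num_sg = 4 goes from 4 to 8 and max_num_sg = 256 goes from 256 to 260. So the "+ 1" only enlarges the allocation when max_num_sg is already a multiple of 4, which is presumably the case an off-by-one test like this is meant to probe.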