From patchwork Sun Sep 24 07:38:39 2017
X-Patchwork-Submitter: Sagi Grimberg
X-Patchwork-Id: 9967817
Subject: Re: nvmeof rdma regression issue on 4.14.0-rc1 (or maybe mlx4?)
To: Christoph Hellwig, Yi Zhang
Cc: linux-rdma@vger.kernel.org, linux-nvme@lists.infradead.org
References: <1735134433.8514119.1505997532669.JavaMail.zimbra@redhat.com>
 <1215229914.8516804.1505998051674.JavaMail.zimbra@redhat.com>
 <20170921144421.GA15285@infradead.org>
From: Sagi Grimberg
Message-ID: <47493aa0-4cad-721b-4ea2-c3b2293340aa@grimberg.me>
Date: Sun, 24 Sep 2017 10:38:39 +0300
In-Reply-To: <20170921144421.GA15285@infradead.org>
X-Mailing-List: linux-rdma@vger.kernel.org

> Adding linux-rdma, the dma mappings happen in the mlx4 driver ...
>> [ 293.209662] DMAR: ERROR: DMA PTE for vPFN 0xe0f59 already set (to 10369a9001 not 10115ed001)
>> [ 293.219117] ------------[ cut here ]------------
>> [ 293.224284] WARNING: CPU: 14 PID: 751 at drivers/iommu/intel-iommu.c:2305 __domain_mapping+0x367/0x380
>> [ 293.234698] Modules linked in: nvme_rdma nvme_fabrics nvme_core sch_mqprio ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter bridge 8021q garp mrp stp llc rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm mlx4_ib ib_core intel_rapl ipmi_ssif sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel intel_cstate intel_uncore iTCO_wdt ipmi_si intel_rapl_perf iTCO_vendor_support ipmi_devintf dcdbas sg pcspkr ipmi_msghandler ioatdma mei_me mei dca shpchp lpc_ich acpi_pad acpi_power_meter wmi nfsd auth_rpcgss nfs_acl lockd grace sunrpc binfmt_misc ip_tables xfs libcrc32c mlx4_en sd_mod
>> [ 293.313884] mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm mlx4_core tg3 ahci libahci ptp libata i2c_core crc32c_intel devlink pps_core dm_mirror dm_region_hash dm_log dm_mod
>> [ 293.335583] CPU: 14 PID: 751 Comm: kworker/u369:7 Not tainted 4.14.0-rc1 #2
>> [ 293.343374] Hardware name: Dell Inc. PowerEdge R430/03XKDV, BIOS 1.6.2 01/08/2016
>> [ 293.351750] Workqueue: nvme-wq nvme_rdma_reconnect_ctrl_work [nvme_rdma]
>> [ 293.359249] task: ffff881032ecdd00 task.stack: ffffc900084d8000
>> [ 293.365873] RIP: 0010:__domain_mapping+0x367/0x380
>> [ 293.371230] RSP: 0018:ffffc900084dbc60 EFLAGS: 00010202
>> [ 293.377075] RAX: 0000000000000004 RBX: 00000010115ed001 RCX: 0000000000000000
>> [ 293.385056] RDX: 0000000000000000 RSI: ffff88103e7ce038 RDI: ffff88103e7ce038
>> [ 293.393040] RBP: ffffc900084dbcc0 R08: 0000000000000000 R09: 0000000000000000
>> [ 293.401024] R10: 00000000000002f7 R11: 00000000010115ed R12: ffff88103b9e1ac8
>> [ 293.409744] R13: 0000000000000001 R14: 0000000000000001 R15: 00000000000e0f59
>> [ 293.418456] FS: 0000000000000000(0000) GS:ffff88103e7c0000(0000) knlGS:0000000000000000
>> [ 293.428229] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [ 293.435391] CR2: 0000154ecabc9140 CR3: 0000001005709001 CR4: 00000000001606e0
>> [ 293.444112] Call Trace:
>> [ 293.447594] __intel_map_single+0xeb/0x180
>> [ 293.452918] intel_map_page+0x39/0x40
>> [ 293.457765] mlx4_ib_alloc_mr+0x141/0x220 [mlx4_ib]
>> [ 293.463965] ib_alloc_mr+0x26/0x50 [ib_core]
>> [ 293.469471] nvme_rdma_reinit_request+0x3a/0x70 [nvme_rdma]
>> [ 293.476433] ? nvme_rdma_free_ctrl+0xb0/0xb0 [nvme_rdma]
>> [ 293.483100] blk_mq_reinit_tagset+0x5c/0x90
>> [ 293.488508] nvme_rdma_configure_io_queues+0x211/0x290 [nvme_rdma]
>> [ 293.496152] nvme_rdma_reconnect_ctrl_work+0x5b/0xd0 [nvme_rdma]
>> [ 293.503598] process_one_work+0x149/0x360
>> [ 293.508815] worker_thread+0x4d/0x3c0
>> [ 293.513638] kthread+0x109/0x140
>> [ 293.517973] ? rescuer_thread+0x380/0x380
>> [ 293.523176] ? kthread_park+0x60/0x60
>> [ 293.527993] ret_from_fork+0x25/0x30

Is it possible that ib_dereg_mr failed? Can you please apply the
following patch and report if you see a warning?
---
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 92a03ff5fb4d..ef50b58b0bb6 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -274,7 +274,7 @@ static int nvme_rdma_reinit_request(void *data, struct request *rq)
 	struct nvme_rdma_request *req = blk_mq_rq_to_pdu(rq);
 	int ret = 0;
 
-	ib_dereg_mr(req->mr);
+	WARN_ON_ONCE(ib_dereg_mr(req->mr));
 	req->mr = ib_alloc_mr(dev->pd, IB_MR_TYPE_MEM_REG,
 			ctrl->max_fr_pages);
--