Message ID: 51127B3F.2090200@mellanox.com
State: Rejected
On 2/6/2013 9:48 AM, Yan Burman wrote:
> When I moved to commit 79f77bf9a4e3dd5ead006b8f17e7c4ff07d8374e I was
> no longer getting the server crashes,
> so the rest of my tests were done using that point (it is somewhere
> in the middle of 3.7.0-rc2).

+tom tucker

I'd try going back a few kernels, like to 3.5.x, and see if things are
more stable. If you find a point that works, then git bisect might help
identify the regression.
Hi. In case you're interested, I did the NFS/RDMA backports for OFED. I
tested that NFS/RDMA in OFED 3.5 works on kernel 3.5 and also on the RHEL
6.3 kernel. However, I did not test it with SR-IOV. If you test it
(OFED-3.5-rc6 was released last week), I'd like to know how it goes. Thanks.

Jeff Becker

On 02/06/2013 07:58 AM, Steve Wise wrote:
> On 2/6/2013 9:48 AM, Yan Burman wrote:
>> When I moved to commit 79f77bf9a4e3dd5ead006b8f17e7c4ff07d8374e I was
>> no longer getting the server crashes,
>> so the rest of my tests were done using that point (it is somewhere
>> in the middle of 3.7.0-rc2).
>
> +tom tucker
>
> I'd try going back a few kernels, like to 3.5.x, and see if things are
> more stable. If you find a point that works, then git bisect might help
> identify the regression.
On Wed, Feb 06, 2013 at 05:48:15PM +0200, Yan Burman wrote:
> When killing mount command that got stuck:
> -------------------------------------------
>
> BUG: unable to handle kernel paging request at ffff880324dc7ff8
> IP: [<ffffffffa05f3dfb>] rdma_read_xdr+0x8bb/0xd40 [svcrdma]
>
> [full oops trace snipped -- reproduced in the original report below]
>
> It seems that commit afc59400d6c65bad66d4ad0b2daf879cbff8e23e
> "nfsd4: cleanup: replace rq_resused count by rq_next_page pointer"
> is responsible for the crash (it seems to be crashing in
> net/sunrpc/xprtrdma/svc_rdma_recvfrom.c:527).
> It may be because I have CONFIG_DEBUG_SET_MODULE_RONX and
> CONFIG_DEBUG_RODATA enabled. I did not try to disable them yet.
>
> When I moved to commit 79f77bf9a4e3dd5ead006b8f17e7c4ff07d8374e I
> was no longer getting the server crashes,
> so the rest of my tests were done using that point (it is somewhere
> in the middle of 3.7.0-rc2).

OK, so this part's clearly my fault--I'll work on a patch, but the
rdma's use of the ->rq_pages array is pretty confusing.

--b.
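For context on the suspect commit: afc59400d6c65bad66d4ad0b2daf879cbff8e23e
replaced the rq_resused page count in struct svc_rqst with an rq_next_page
pointer into the same rq_pages[] array. A minimal sketch of the idiom change
(illustrative only -- not copied from the kernel sources):

	/* Before the commit: consumed pages tracked by a counter (sketch). */
	rqstp->rq_respages[rqstp->rq_resused++] = page;

	/* After the commit: the same state is a pointer into rq_pages[] (sketch). */
	*(rqstp->rq_next_page++) = page;

The faulting instruction in the oops decodes to "movq $0x0,-0x8(%rax)", and
CR2 == RAX - 8, which is consistent with a NULL store through such a pointer
after it has been walked one slot outside the array.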
On 2/6/2013 4:24 PM, J. Bruce Fields wrote:
> On Wed, Feb 06, 2013 at 05:48:15PM +0200, Yan Burman wrote:
>> When killing mount command that got stuck:
>>
>> [oops trace snipped]
>>
>> It seems that commit afc59400d6c65bad66d4ad0b2daf879cbff8e23e
>> "nfsd4: cleanup: replace rq_resused count by rq_next_page pointer"
>> is responsible for the crash (it seems to be crashing in
>> net/sunrpc/xprtrdma/svc_rdma_recvfrom.c:527).
> OK, so this part's clearly my fault--I'll work on a patch, but the
> rdma's use of the ->rq_pages array is pretty confusing.

Maybe Tom can shed some light?
> -----Original Message-----
> From: Jeff Becker [mailto:jeffrey.c.becker@nasa.gov]
> Sent: Wednesday, February 06, 2013 19:07
> To: Steve Wise
> Cc: Yan Burman; bfields@fieldses.org; linux-nfs@vger.kernel.org;
> linux-rdma@vger.kernel.org; Or Gerlitz; Tom Tucker
> Subject: Re: NFS over RDMA crashing
>
> Hi. In case you're interested, I did the NFS/RDMA backports for OFED. I
> tested that NFS/RDMA in OFED 3.5 works on kernel 3.5 and also on the RHEL
> 6.3 kernel. However, I did not test it with SR-IOV. If you test it
> (OFED-3.5-rc6 was released last week), I'd like to know how it goes. Thanks.
>
> Jeff Becker
>
> [quoted thread snipped]

Vanilla 3.5.7 seems to work OK out of the box. I will try 3.6 next week,
plus performance comparisons.

Yan
On 2/6/13 3:28 PM, Steve Wise wrote:
> On 2/6/2013 4:24 PM, J. Bruce Fields wrote:
>> On Wed, Feb 06, 2013 at 05:48:15PM +0200, Yan Burman wrote:
>>> When killing mount command that got stuck:
>>>
>>> [oops trace snipped]
>>>
>>> It seems that commit afc59400d6c65bad66d4ad0b2daf879cbff8e23e
>>> "nfsd4: cleanup: replace rq_resused count by rq_next_page pointer"
>>> is responsible for the crash (it seems to be crashing in
>>> net/sunrpc/xprtrdma/svc_rdma_recvfrom.c:527).
>> OK, so this part's clearly my fault--I'll work on a patch, but the
>> rdma's use of the ->rq_pages array is pretty confusing.
>
> Maybe Tom can shed some light?

Yes, the RDMA transport has two confusing tweaks on rq_pages.

Most transports (UDP/TCP) use the rq_pages allocated by the generic svc
code. For RDMA, however, the RQ already contains pre-allocated memory that
will receive inbound NFS requests from the client. Instead of copying this
data from the pre-registered receive buffer into the buffer in rq_pages, I
just replace the page in rq_pages with the one that already contains the
data.

The second somewhat strange thing is that the NFS request carries an
NFSRDMA header. This is just like TCP's record marker (i.e., the 4-byte
length); the difference is that, unlike TCP's, this header is needed for
the response, because it maps out where in the client's memory the response
data will be written.

Tom
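To make the first tweak concrete, here is a minimal sketch of the page swap
Tom describes, assuming a hypothetical rcv_pages[] array holding the pages
an RDMA receive completed into (illustrative only -- not the actual
svc_rdma_recvfrom() code):

	#include <linux/mm.h>
	#include <linux/sunrpc/svc.h>

	/*
	 * Swap the pre-registered receive pages into rq_pages[] instead of
	 * copying their contents (sketch; the function name and rcv_pages
	 * parameter are assumptions for illustration).
	 */
	static void swap_in_receive_pages(struct svc_rqst *rqstp,
					  struct page **rcv_pages, int count)
	{
		int i;

		for (i = 0; i < count; i++) {
			/* Release the generic page the svc layer allocated... */
			put_page(rqstp->rq_pages[i]);
			/* ...and install the page the data already landed in. */
			rqstp->rq_pages[i] = rcv_pages[i];
		}
	}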
=============================================
[ INFO: possible recursive locking detected ]
3.8.0-rc5+ #4 Not tainted
---------------------------------------------
kworker/6:0/49 is trying to acquire lock:
 (&id_priv->handler_mutex){+.+.+.}, at: [<ffffffffa05e7813>] rdma_destroy_id+0x33/0x250 [rdma_cm]

but task is already holding lock:
 (&id_priv->handler_mutex){+.+.+.}, at: [<ffffffffa05e317b>] cma_disable_callback+0x2b/0x60 [rdma_cm]

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&id_priv->handler_mutex);
  lock(&id_priv->handler_mutex);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

3 locks held by kworker/6:0/49:
 #0: (ib_cm){.+.+.+}, at: [<ffffffff81068f50>] process_one_work+0x160/0x720
 #1: ((&(&work->work)->work)){+.+.+.}, at: [<ffffffff81068f50>] process_one_work+0x160/0x720
 #2: (&id_priv->handler_mutex){+.+.+.}, at: [<ffffffffa05e317b>] cma_disable_callback+0x2b/0x60 [rdma_cm]

stack backtrace:
Pid: 49, comm: kworker/6:0 Not tainted 3.8.0-rc5+ #4
Call Trace:
 [<ffffffff8109f99c>] validate_chain+0xdcc/0x11f0
 [<ffffffff8109bdcf>] ? save_trace+0x3f/0xc0
 [<ffffffff810a0760>] __lock_acquire+0x440/0xc30
 [<ffffffff810a0760>] ? __lock_acquire+0x440/0xc30
 [<ffffffff810a0fe5>] lock_acquire+0x95/0x1e0
 [<ffffffffa05e7813>] ? rdma_destroy_id+0x33/0x250 [rdma_cm]
 [<ffffffffa05e7813>] ? rdma_destroy_id+0x33/0x250 [rdma_cm]
 [<ffffffff814a9aff>] mutex_lock_nested+0x5f/0x3b0
 [<ffffffffa05e7813>] ? rdma_destroy_id+0x33/0x250 [rdma_cm]
 [<ffffffff8109d68d>] ? trace_hardirqs_on_caller+0x10d/0x1a0
 [<ffffffff8109d72d>] ? trace_hardirqs_on+0xd/0x10
 [<ffffffff814aca3d>] ? _raw_spin_unlock_irqrestore+0x3d/0x80
 [<ffffffffa05e7813>] rdma_destroy_id+0x33/0x250 [rdma_cm]
 [<ffffffffa05e8f99>] cma_req_handler+0x719/0x730 [rdma_cm]
 [<ffffffff814aca04>] ? _raw_spin_unlock_irqrestore+0x4/0x80
 [<ffffffffa05d5772>] cm_process_work+0x22/0x170 [ib_cm]
 [<ffffffffa05d6acd>] cm_req_handler+0x67d/0xa70 [ib_cm]
 [<ffffffffa05d6fed>] cm_work_handler+0x12d/0x1218 [ib_cm]
 [<ffffffff81068fc2>] process_one_work+0x1d2/0x720
 [<ffffffff81068f50>] ? process_one_work+0x160/0x720
 [<ffffffffa05d6ec0>] ? cm_req_handler+0xa70/0xa70 [ib_cm]
 [<ffffffff81069930>] worker_thread+0x120/0x460
 [<ffffffff814ab4b4>] ? preempt_schedule+0x44/0x60
 [<ffffffff81069810>] ? manage_workers+0x300/0x300
 [<ffffffff81071df6>] kthread+0xd6/0xe0
 [<ffffffff81071d20>] ? __init_kthread_worker+0x70/0x70
 [<ffffffff814b462c>] ret_from_fork+0x7c/0xb0
 [<ffffffff81071d20>] ? __init_kthread_worker+0x70/0x70
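What lockdep is flagging here is re-acquisition of a mutex from the same
lock class while one is already held. A minimal illustration of the pattern
(hypothetical code -- the real paths are cma_disable_callback() and
rdma_destroy_id() above; as the splat itself notes, this may involve two
distinct id_priv instances and merely lack lock nesting annotations):

	#include <linux/mutex.h>

	struct rdma_id_private {		/* stripped-down stand-in */
		struct mutex handler_mutex;
	};

	static void req_handler(struct rdma_id_private *listen_id,
				struct rdma_id_private *new_id)
	{
		/* The callback path takes the listener's handler_mutex... */
		mutex_lock(&listen_id->handler_mutex);

		/* ...and an error path then destroys the new id, taking a
		 * second handler_mutex of the same lock class while the
		 * first is still held. Lockdep cannot prove these are
		 * distinct instances without mutex_lock_nested(). */
		mutex_lock(&new_id->handler_mutex);
		mutex_unlock(&new_id->handler_mutex);

		mutex_unlock(&listen_id->handler_mutex);
	}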
When killing mount command that got stuck:
-------------------------------------------

BUG: unable to handle kernel paging request at ffff880324dc7ff8
IP: [<ffffffffa05f3dfb>] rdma_read_xdr+0x8bb/0xd40 [svcrdma]
PGD 1a0c063 PUD 32f82e063 PMD 32f2fd063 PTE 8000000324dc7161
Oops: 0003 [#1] PREEMPT SMP
Modules linked in: md5 ib_ipoib xprtrdma svcrdma rdma_cm ib_cm iw_cm
 ib_addr nfsd exportfs netconsole ip6table_filter ip6_tables
 iptable_filter ip_tables ebtable_nat nfsv3 nfs_acl ebtables x_tables
 nfsv4 auth_rpcgss nfs lockd autofs4 sunrpc target_core_iblock
 target_core_file target_core_pscsi target_core_mod configfs 8021q
 bridge stp llc ipv6 dm_mirror dm_region_hash dm_log vhost_net
 macvtap macvlan tun uinput iTCO_wdt iTCO_vendor_support kvm_intel
 kvm crc32c_intel microcode pcspkr joydev i2c_i801 lpc_ich mfd_core
 ehci_pci ehci_hcd sg ioatdma ixgbe mdio mlx4_ib ib_sa ib_mad ib_core
 mlx4_en mlx4_core igb hwmon dca ptp pps_core button dm_mod ext3 jbd
 sd_mod ata_piix libata uhci_hcd megaraid_sas scsi_mod
CPU 6
Pid: 4744, comm: nfsd Not tainted 3.8.0-rc5+ #4 Supermicro X8DTH-i/6/iF/6F/X8DTH
RIP: 0010:[<ffffffffa05f3dfb>]  [<ffffffffa05f3dfb>] rdma_read_xdr+0x8bb/0xd40 [svcrdma]
RSP: 0018:ffff880324c3dbf8  EFLAGS: 00010297
RAX: ffff880324dc8000 RBX: 0000000000000001 RCX: ffff880324dd8428
RDX: ffff880324dc7ff8 RSI: ffff880324dd8428 RDI: ffffffff81149618
RBP: ffff880324c3dd78 R08: 000060f9c0000860 R09: 0000000000000001
R10: ffff880324dd8000 R11: 0000000000000001 R12: ffff8806299dcb10
R13: 0000000000000003 R14: 0000000000000001 R15: 0000000000000010
FS:  0000000000000000(0000) GS:ffff88063fc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: ffff880324dc7ff8 CR3: 0000000001a0b000 CR4: 00000000000007e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process nfsd (pid: 4744, threadinfo ffff880324c3c000, task ffff880330550000)
Stack:
 ffff880324c3dc78 ffff880324c3dcd8 0000000000000282 ffff880631cec000
 ffff880324dd8000 ffff88062ed33040 0000000124c3dc48 ffff880324dd8000
 ffff88062ed33058 ffff880630ce2b90 ffff8806299e8000 0000000000000003
Call Trace:
 [<ffffffffa05f466e>] svc_rdma_recvfrom+0x3ee/0xd80 [svcrdma]
 [<ffffffff81086540>] ? try_to_wake_up+0x2f0/0x2f0
 [<ffffffffa045963f>] svc_recv+0x3ef/0x4b0 [sunrpc]
 [<ffffffffa0571db0>] ? nfsd_svc+0x740/0x740 [nfsd]
 [<ffffffffa0571e5d>] nfsd+0xad/0x130 [nfsd]
 [<ffffffffa0571db0>] ? nfsd_svc+0x740/0x740 [nfsd]
 [<ffffffff81071df6>] kthread+0xd6/0xe0
 [<ffffffff81071d20>] ? __init_kthread_worker+0x70/0x70
 [<ffffffff814b462c>] ret_from_fork+0x7c/0xb0
 [<ffffffff81071d20>] ? __init_kthread_worker+0x70/0x70
Code: 63 c2 49 8d 8c c2 18 02 00 00 48 39 ce 77 e1 49 8b 82 40 0a 00 00 48 39 c6 0f 84 92 f7 ff ff 90 48 8d 50 f8 49 89 92 40 0a 00 00 <48> c7 40 f8 00 00 00 00 49 8b 82 40 0a 00 00 49 3b 82 30 0a 00
RIP  [<ffffffffa05f3dfb>] rdma_read_xdr+0x8bb/0xd40 [svcrdma]
 RSP <ffff880324c3dbf8>
CR2: ffff880324dc7ff8
---[ end trace 06d0384754e9609a ]---

It seems that commit afc59400d6c65bad66d4ad0b2daf879cbff8e23e
"nfsd4: cleanup: replace rq_resused count by rq_next_page pointer"
is responsible for the crash (it seems to be crashing in
net/sunrpc/xprtrdma/svc_rdma_recvfrom.c:527).
It may be because I have CONFIG_DEBUG_SET_MODULE_RONX and
CONFIG_DEBUG_RODATA enabled. I did not try to disable them yet.
When I moved to commit 79f77bf9a4e3dd5ead006b8f17e7c4ff07d8374e I was no
longer getting the server crashes, so the rest of my tests were done using
that point (it is somewhere in the middle of 3.7.0-rc2).

Now the client was getting stuck, printing the following messages over and
over again:

rpcrdma: connection to 192.168.20.210:2050 on mlx4_0, memreg 6 slots 32 ird 16
rpcrdma: connection to 192.168.20.210:2050 closed (-103)

The next step was to change the memory policy to RPCRDMA_ALLPHYSICAL, since
there were IB_WC_LOC_ACCESS_ERR errors. I found that doing
"echo 6 > /proc/sys/sunrpc/rdma_memreg_strategy" was not enough; we had to
change the code a little bit for it to work. Here's the change for the test:

diff --git a/net/sunrpc/xprtrdma/svc_rdma_transport.c b/net/sunrpc/xprtrdma/svc_rdma_transport.c
index 62e4f9b..c660f61 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_transport.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_transport.c
@@ -970,6 +970,8 @@ static struct svc_xprt *svc_rdma_accept(struct svc_xprt *xprt)
 	 * NB: iWARP requires remote write access for the data sink
 	 * of an RDMA_READ. IB does not.
 	 */
+	devattr.device_cap_flags &= ~(IB_DEVICE_MEM_MGT_EXTENSIONS|IB_DEVICE_LOCAL_DMA_LKEY);
+
 	if (devattr.device_cap_flags & IB_DEVICE_MEM_MGT_EXTENSIONS) {
 		newxprt->sc_frmr_pg_list_len =
 			devattr.max_fast_reg_page_list_len;
diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
index 745973b..0e3da28 100644
--- a/net/sunrpc/xprtrdma/verbs.c
+++ b/net/sunrpc/xprtrdma/verbs.c
@@ -488,10 +488,12 @@ rpcrdma_ia_open(struct rpcrdma_xprt *xprt, struct sockaddr *addr, int memreg)
 		goto out2;
 	}
 
+#if 0
 	if (devattr.device_cap_flags & IB_DEVICE_LOCAL_DMA_LKEY) {
 		ia->ri_have_dma_lkey = 1;
 		ia->ri_dma_lkey = ia->ri_id->device->local_dma_lkey;
 	}
+#endif
 
 	switch (memreg) {
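For reference, the value written to /proc/sys/sunrpc/rdma_memreg_strategy
indexes enum rpcrdma_memreg in include/linux/sunrpc/xprtrdma.h. The listing
below is quoted from memory for kernels of this era, so verify it against
your tree; "echo 6" selects RPCRDMA_ALLPHYSICAL, which matches the
"memreg 6" in the client log above:

	enum rpcrdma_memreg {
		RPCRDMA_BOUNCEBUFFERS = 0,
		RPCRDMA_REGISTER,
		RPCRDMA_MEMWINDOWS,
		RPCRDMA_MEMWINDOWS_ASYNC,
		RPCRDMA_MTHCAFMR,
		RPCRDMA_FRMR,
		RPCRDMA_ALLPHYSICAL,	/* 6: the value written above */
		RPCRDMA_LAST
	};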