Message ID | 20230221095708.29094-1-njavali@marvell.com (mailing list archive) |
---|---|
State | Not Applicable |
Series | [v2] nvme-fc: initialize nvme fc ctrl ops |
On Tue, Feb 21, 2023 at 01:57:08AM -0800, Nilesh Javali wrote:
> CPU: 61 PID: 6064 Comm: nvme Kdump: loaded Not tainted 6.2.0-rc1 #3
Well, that's a really old -rc.
This should be fixed by:
commit 98e3528012cd571c48bbae7c7c0f868823254b6c
Author: Ross Lagerwall <ross.lagerwall@citrix.com>
Date: Fri Jan 20 17:43:54 2023 +0000
nvme-fc: fix initialization order
Christoph,

Thanks for pointing to the commit.
I do not see this commit in Martin's tree 6.3/scsi-staging or 6.3/scsi-queue branches.

Martin,

The 6.3/scsi-staging or 6.3/scsi-queue branches are still at 6.2.0-rc1.
That could be the reason we hit the NVMe discovery NULL pointer dereference issue.
Any plans to pull the below commit to 6.3/scsi-staging or 6.3/scsi-queue branches?
Or am I missing something here?

Thanks,
Nilesh

> -----Original Message-----
> From: Christoph Hellwig <hch@lst.de>
> Sent: Tuesday, February 21, 2023 11:05 PM
> To: Nilesh Javali <njavali@marvell.com>
> Cc: linux-nvme@lists.infradead.org; martin.petersen@oracle.com; linux-scsi@vger.kernel.org; GR-QLogic-Storage-Upstream <GR-QLogic-Storage-Upstream@marvell.com>; Bikash Hazarika <bhazarika@marvell.com>; Anil Gurumurthy <agurumurthy@marvell.com>; Shreyas Deodhar <sdeodhar@marvell.com>; Christoph Hellwig <hch@lst.de>
> Subject: [EXT] Re: [PATCH v2] nvme-fc: initialize nvme fc ctrl ops
>
> On Tue, Feb 21, 2023 at 01:57:08AM -0800, Nilesh Javali wrote:
> > CPU: 61 PID: 6064 Comm: nvme Kdump: loaded Not tainted 6.2.0-rc1 #3
>
> Well, that's a really old -rc.
>
> This should be fixed by:
>
> commit 98e3528012cd571c48bbae7c7c0f868823254b6c
> Author: Ross Lagerwall <ross.lagerwall@citrix.com>
> Date: Fri Jan 20 17:43:54 2023 +0000
>
>     nvme-fc: fix initialization order
On Wed, Feb 22, 2023 at 05:59:54AM +0000, Nilesh Javali wrote:
> Christoph,
>
> Thanks for pointing to the commit.
> I do not see this commit in Martin's tree 6.3/scsi-staging or 6.3/scsi-queue branches.
>
> Martin,
>
> The 6.3/scsi-staging or 6.3/scsi-queue branches are still at 6.2.0-rc1.
> That could be the reason we hit the NVMe discovery NULL pointer dereference issue.
> Any plans to pull the below commit to 6.3/scsi-staging or 6.3/scsi-queue branches?
> Or am I missing something here?

Hello Nilesh,

What you are missing is that NVMe is not SCSI :)

Consult the MAINTAINERS file to see the correct (git) tree to climb:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/MAINTAINERS?h=v6.2#n14922

While SCSI sends pull requests straight to Linus, NVMe sends pull requests first to Jens, who then sends them to Linus.

Kind regards,
Niklas
Hi Nilesh!

> The 6.3/scsi-staging or 6.3/scsi-queue branches are still at
> 6.2.0-rc1. That could be the reason we hit the NVMe discovery NULL
> pointer dereference issue. Any plans to pull the below commit to
> 6.3/scsi-staging or 6.3/scsi-queue branches? Or am I missing
> something here?

Except in very rare circumstances, the SCSI submission trees stay at
-rc1 forever. I generally don't bring in stuff from other trees to avoid
problems if those trees subsequently have to rebase.

It sounds like you should be testing either linux-next or maybe a local
ephemeral integration branch featuring the various topic areas that are
important to you (SCSI fixes + staging, block, NVMe).
Niklas and Martin,

Thank you very much for the pointers.

Thanks,
Nilesh

> -----Original Message-----
> From: Martin K. Petersen <martin.petersen@oracle.com>
> Sent: Wednesday, February 22, 2023 10:34 PM
> To: Nilesh Javali <njavali@marvell.com>
> Cc: Christoph Hellwig <hch@lst.de>; martin.petersen@oracle.com; linux-nvme@lists.infradead.org; linux-scsi@vger.kernel.org; GR-QLogic-Storage-Upstream <GR-QLogic-Storage-Upstream@marvell.com>; Bikash Hazarika <bhazarika@marvell.com>; Anil Gurumurthy <agurumurthy@marvell.com>; Shreyas Deodhar <sdeodhar@marvell.com>; Shreya Jeurkar <sjeurkar@marvell.com>; Jeetendra Sonar <jsonar@marvell.com>
> Subject: Re: [EXT] Re: [PATCH v2] nvme-fc: initialize nvme fc ctrl ops
>
> Hi Nilesh!
>
> > The 6.3/scsi-staging or 6.3/scsi-queue branches are still at
> > 6.2.0-rc1. That could be the reason we hit the NVMe discovery NULL
> > pointer dereference issue. Any plans to pull the below commit to
> > 6.3/scsi-staging or 6.3/scsi-queue branches? Or am I missing
> > something here?
>
> Except in very rare circumstances, the SCSI submission trees stay at
> -rc1 forever. I generally don't bring in stuff from other trees to avoid
> problems if those trees subsequently have to rebase.
>
> It sounds like you should be testing either linux-next or maybe a local
> ephemeral integration branch featuring the various topic areas that are
> important to you (SCSI fixes + staging, block, NVMe).
>
> --
> Martin K. Petersen	Oracle Linux Engineering
diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 4564f16a0b20..53297cad49ea 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -3521,13 +3521,6 @@ nvme_fc_init_ctrl(struct device *dev, struct nvmf_ctrl_options *opts,
 
 	nvme_fc_init_queue(ctrl, 0);
 
-	ret = nvme_alloc_admin_tag_set(&ctrl->ctrl, &ctrl->admin_tag_set,
-			&nvme_fc_admin_mq_ops,
-			struct_size((struct nvme_fcp_op_w_sgl *)NULL, priv,
-				    ctrl->lport->ops->fcprqst_priv_sz));
-	if (ret)
-		goto out_free_queues;
-
 	/*
 	 * Would have been nice to init io queues tag set as well.
 	 * However, we require interaction from the controller
@@ -3537,7 +3530,14 @@ nvme_fc_init_ctrl(struct device *dev, struct nvmf_ctrl_options *opts,
 
 	ret = nvme_init_ctrl(&ctrl->ctrl, dev, &nvme_fc_ctrl_ops, 0);
 	if (ret)
-		goto out_cleanup_tagset;
+		goto out_free_queues;
+
+	ret = nvme_alloc_admin_tag_set(&ctrl->ctrl, &ctrl->admin_tag_set,
+			&nvme_fc_admin_mq_ops,
+			struct_size((struct nvme_fcp_op_w_sgl *)NULL, priv,
+				    ctrl->lport->ops->fcprqst_priv_sz));
+	if (ret)
+		goto out_free_queues;
 
 	/* at this point, teardown path changes to ref counting on nvme ctrl */
 
@@ -3592,8 +3592,6 @@ nvme_fc_init_ctrl(struct device *dev, struct nvmf_ctrl_options *opts,
 
 	return ERR_PTR(-EIO);
 
-out_cleanup_tagset:
-	nvme_remove_admin_tag_set(&ctrl->ctrl);
 out_free_queues:
 	kfree(ctrl->queues);
 out_free_ida:
The system crashed while performing qla2xxx nvme discovery, with the
call trace below:

qla2xxx [0000:21:00.0]-2102:12: qla_nvme_register_remote: traddr=nn-0x245e00a098f4684a:pn-0x245f00a098f4684a PortID:5a247a
qla2xxx [0000:21:00.0]-2102:12: qla_nvme_register_remote: traddr=nn-0x245e00a098f4684a:pn-0x246100a098f4684a PortID:5a2d6e
BUG: kernel NULL pointer dereference, address: 0000000000000010
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 61 PID: 6064 Comm: nvme Kdump: loaded Not tainted 6.2.0-rc1 #3
Hardware name: Dell Inc. PowerEdge R7525/0590KW, BIOS 2.5.6 10/06/2021
RIP: 0010:nvme_alloc_admin_tag_set+0x51/0x120 [nvme_core]
Code: 00 00 00 00 81 c1 b0 00 00 00 48 c7 86 a8 00 00 00 00 00 00 00 c1 e9 03 f3 48 ab 4c 89 46 38 c7 46 44 1e 00 00 00 48 8b 45 30 <f6> 40 10 01 74 07 c7 46 48 01 00 00 00 8b 45 5c c7 43 58 40 00 00
RSP: 0018:ffffafe6cd7cbd10 EFLAGS: 00010212
RAX: 0000000000000000 RBX: ffff898e0c39c050 RCX: 0000000000000000
RDX: 00000000000001d8 RSI: ffff898e0c39c050 RDI: ffff898e0c39c100
RBP: ffff898e0c39c398 R08: ffffffffc0afe1a0 R09: ffff896ed602a600
R10: 0000000000000010 R11: f000000000000000 R12: ffff898e06ea2600
R13: ffff898e070ffbc0 R14: ffff898e0c39c040 R15: ffff898e0c39c398
FS: 00007f9368279780(0000) GS:ffff89ad7fb40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000010 CR3: 000000210c570000 CR4: 0000000000350ee0
Call Trace:
 nvme_fc_init_ctrl+0x328/0x460 [nvme_fc]
 nvme_fc_create_ctrl+0x1b0/0x260 [nvme_fc]
 nvmf_create_ctrl+0x141/0x240 [nvme_fabrics]
 nvmf_dev_write+0x81/0xe0 [nvme_fabrics]
 vfs_write+0xc5/0x3b0
 ? syscall_exit_work+0x103/0x130
 ? syscall_exit_to_user_mode+0x12/0x30
 ksys_write+0x5f/0xe0
 do_syscall_64+0x5c/0x90
 ? exc_page_fault+0x62/0x150
 entry_SYSCALL_64_after_hwframe+0x72/0xdc
RIP: 0033:0x7f936813e967
Code: 0b 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
SP: 002b:00007fff4197a468 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000560862447d40 RCX: 00007f936813e967
RDX: 000000000000012b RSI: 0000560862447d40 RDI: 0000000000000003
RBP: 0000000000000003 R08: 000000000000012b R09: 0000560862447d40
R10: 0000000000000000 R11: 0000000000000246 R12: 00005608624473f0
R13: 000000000000012b R14: 00007f93682e6100 R15: 00007f93682e613d

Initialize the nvme_fc_ctrl_ops before allocating the nvme admin tag set.

Fixes: 6dfba1c09c10 ("nvme-fc: use the tagset alloc/free helpers")
Cc: stable@vger.kernel.org
Signed-off-by: Nilesh Javali <njavali@marvell.com>
---
v2:
- correct the cleanup path
- Add Cc tag

 drivers/nvme/host/fc.c | 18 ++++++++----------
 1 file changed, 8 insertions(+), 10 deletions(-)
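
For readers who want to see the ordering problem in isolation, here is a minimal userspace sketch. It is not the kernel code: every `demo_*` name, the `DEMO_F_FABRICS` flag and the reserved-tag value are hypothetical stand-ins. It assumes, based only on the trace above (RAX is 0 and the faulting address is 0x10), that the tag set allocator reads a flags field through the controller's ops pointer, and that the ops pointer is installed by the controller init step. The NULL check in the allocator exists only so that the demo can report the wrong order instead of crashing.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a "this is a fabrics controller" flag bit. */
#define DEMO_F_FABRICS (1u << 0)

/*
 * Hypothetical ops layout: two pointers followed by a flags word, so on a
 * typical 64-bit ABI the flags field sits at offset 0x10.
 */
struct demo_ctrl_ops {
	const char *name;
	void *module;
	unsigned int flags;
};

struct demo_ctrl {
	const struct demo_ctrl_ops *ops;	/* NULL until demo_init_ctrl() runs */
	unsigned int reserved_tags;
};

static const struct demo_ctrl_ops demo_fc_ops = {
	.name = "demo-fc",
	.module = NULL,
	.flags = DEMO_F_FABRICS,
};

/* Models the controller init step: this is where ctrl->ops gets installed. */
static int demo_init_ctrl(struct demo_ctrl *ctrl, const struct demo_ctrl_ops *ops)
{
	ctrl->ops = ops;
	return 0;
}

/*
 * Models the admin tag set allocation: it reads ctrl->ops->flags. Without
 * the guard below, a NULL ops pointer would be dereferenced here.
 */
static int demo_alloc_admin_tag_set(struct demo_ctrl *ctrl)
{
	if (!ctrl->ops)
		return -EINVAL;	/* demo-only guard: report instead of crashing */

	/* 2 is an arbitrary demo value for "reserve some tags on fabrics". */
	ctrl->reserved_tags = (ctrl->ops->flags & DEMO_F_FABRICS) ? 2 : 0;
	return 0;
}

/* Order before the patch: allocate the tag set first, then init the ctrl. */
int demo_broken_order(void)
{
	struct demo_ctrl ctrl = { 0 };
	int ret;

	ret = demo_alloc_admin_tag_set(&ctrl);
	if (ret)
		return ret;
	return demo_init_ctrl(&ctrl, &demo_fc_ops);
}

/* Order after the patch: init the ctrl first, then allocate the tag set. */
int demo_fixed_order(void)
{
	struct demo_ctrl ctrl = { 0 };
	int ret;

	ret = demo_init_ctrl(&ctrl, &demo_fc_ops);
	if (ret)
		return ret;
	return demo_alloc_admin_tag_set(&ctrl);
}
```

In this sketch `demo_broken_order()` returns `-EINVAL` because the allocator runs while `ops` is still NULL, and `demo_fixed_order()` returns 0. The real functions take more arguments and have a longer error-unwind path, which the sketch deliberately leaves out.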