Message ID | 20190111174047.22994-1-hmadhani@marvell.com (mailing list archive) |
---|---|
State | Mainlined |
Commit | f0783d43dde4bb349fcd667df0afabbdbab8b477 |
Headers | show |
Series | qla2xxx: Use correct number of vectors for online CPUs | expand |
On Fri, 2019-01-11 at 09:40 -0800, Himanshu Madhani wrote: > From: Ming Lei <ming.lei@redhat.com> > > When SCSI-MQ is enabled, in some case system would present > nr_possible_cpus() which is greater than requested vectors > by the driver. This results into driver being able to get > larger number of MSI-X vectors than actual online CPUs. > Driver then uses pci_alloc_irq_vectors_affinity() to assign > 1:1 mapping and affinity for each MSI-x vector to CPUs. When > the command is submitted using MSI-x vector, assigned to > offline CPU, it results in an ABTS and system hang. This hang > is result of a driver not being able to process interrupt on a > vector assigned to an Off-line CPUs > > This patch fixes this issue by setting irq_offset value for the > blk_mq_pci_map_queues() to use only those CPUs which has CPU mask > affinity assigned and are online. By using the irq_offset value, > driver will allow online cpumask to decide which vectors are used > in blk_mq_pci_map_queues(). > > Fixes: 5601236b6f794 ("scsi: qla2xxx: Add Block Multi Queue functionality.") > Cc: <stable@vger.kernel.org> #4.19 > Signed-off-by: Ming Lei <ming.lei@redhat.com> > Reviewed-by: Himanshu Madhani <hmadhani@marvell.com> > Tested-by: Himanshu Madhani <hmadhani@marvell.com> > Signed-off-by: Himanshu Madhani <hmadhani@marvell.com> > --- > Hi Martin, > > With SCSI-MQ set as default starting 4.19, we found regression where systems with lower > online CPUs but higer possible CPUs in a system resulted in the driver assigning MSIx > vectors to offline CPU causing system hang. > > Please apply this patch to 5.0/fixes branch at your earliest convenience for inclusion > in 5.0-rc3. > > Thanks, > Himanshu > --- > drivers/scsi/qla2xxx/qla_def.h | 2 ++ > drivers/scsi/qla2xxx/qla_isr.c | 1 + > drivers/scsi/qla2xxx/qla_os.c | 2 +- > 3 files changed, 4 insertions(+), 1 deletion(-) > > diff --git a/drivers/scsi/qla2xxx/qla_def.h b/drivers/scsi/qla2xxx/qla_def.h > index 26b93c563f92..d1fc4958222a 100644 > --- a/drivers/scsi/qla2xxx/qla_def.h > +++ b/drivers/scsi/qla2xxx/qla_def.h > @@ -4394,6 +4394,8 @@ typedef struct scsi_qla_host { > uint16_t n2n_id; > struct list_head gpnid_list; > struct fab_scan scan; > + > + unsigned int irq_offset; > } scsi_qla_host_t; > > struct qla27xx_image_status { > diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c > index 30d3090842f8..8507c43b918c 100644 > --- a/drivers/scsi/qla2xxx/qla_isr.c > +++ b/drivers/scsi/qla2xxx/qla_isr.c > @@ -3446,6 +3446,7 @@ qla24xx_enable_msix(struct qla_hw_data *ha, struct rsp_que *rsp) > "Adjusted Max no of queues pairs: %d.\n", ha->max_qpairs); > } > } > + vha->irq_offset = desc.pre_vectors; > ha->msix_entries = kcalloc(ha->msix_count, > sizeof(struct qla_msix_entry), > GFP_KERNEL); > diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c > index ea69dafc9774..c6ef83d0d99b 100644 > --- a/drivers/scsi/qla2xxx/qla_os.c > +++ b/drivers/scsi/qla2xxx/qla_os.c > @@ -6939,7 +6939,7 @@ static int qla2xxx_map_queues(struct Scsi_Host *shost) > if (USER_CTRL_IRQ(vha->hw)) > rc = blk_mq_map_queues(qmap); > else > - rc = blk_mq_pci_map_queues(qmap, vha->hw->pdev, 0); > + rc = blk_mq_pci_map_queues(qmap, vha->hw->pdev, vha->irq_offset); > return rc; > } > I have such a machine, SCSI-MQ doesn't work with its 8Gb adapter without this. Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Himanshu, > This patch fixes this issue by setting irq_offset value for the > blk_mq_pci_map_queues() to use only those CPUs which has CPU mask > affinity assigned and are online. By using the irq_offset value, > driver will allow online cpumask to decide which vectors are used in > blk_mq_pci_map_queues(). Applied to 5.0/scsi-fixes, thanks!
diff --git a/drivers/scsi/qla2xxx/qla_def.h b/drivers/scsi/qla2xxx/qla_def.h index 26b93c563f92..d1fc4958222a 100644 --- a/drivers/scsi/qla2xxx/qla_def.h +++ b/drivers/scsi/qla2xxx/qla_def.h @@ -4394,6 +4394,8 @@ typedef struct scsi_qla_host { uint16_t n2n_id; struct list_head gpnid_list; struct fab_scan scan; + + unsigned int irq_offset; } scsi_qla_host_t; struct qla27xx_image_status { diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c index 30d3090842f8..8507c43b918c 100644 --- a/drivers/scsi/qla2xxx/qla_isr.c +++ b/drivers/scsi/qla2xxx/qla_isr.c @@ -3446,6 +3446,7 @@ qla24xx_enable_msix(struct qla_hw_data *ha, struct rsp_que *rsp) "Adjusted Max no of queues pairs: %d.\n", ha->max_qpairs); } } + vha->irq_offset = desc.pre_vectors; ha->msix_entries = kcalloc(ha->msix_count, sizeof(struct qla_msix_entry), GFP_KERNEL); diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c index ea69dafc9774..c6ef83d0d99b 100644 --- a/drivers/scsi/qla2xxx/qla_os.c +++ b/drivers/scsi/qla2xxx/qla_os.c @@ -6939,7 +6939,7 @@ static int qla2xxx_map_queues(struct Scsi_Host *shost) if (USER_CTRL_IRQ(vha->hw)) rc = blk_mq_map_queues(qmap); else - rc = blk_mq_pci_map_queues(qmap, vha->hw->pdev, 0); + rc = blk_mq_pci_map_queues(qmap, vha->hw->pdev, vha->irq_offset); return rc; }