Message ID | 1491256275-27836-1-git-send-email-mauricfo@linux.vnet.ibm.com (mailing list archive) |
---|---|
State | Deferred, archived |
Headers | show |
diff --git a/drivers/scsi/lpfc/lpfc_sli.c b/drivers/scsi/lpfc/lpfc_sli.c index 1c9fa45df7eb..8befe841adaa 100644 --- a/drivers/scsi/lpfc/lpfc_sli.c +++ b/drivers/scsi/lpfc/lpfc_sli.c @@ -13758,7 +13758,14 @@ void lpfc_sli4_els_xri_abort_event_proc(struct lpfc_hba *phba) lpfc_free_rq_buffer(queue->phba, queue); kfree(queue->rqbp); } - kfree(queue->pring); + + /* + * The WQs/CQs' pring is bound (same pointer). + * So free it only once, and not again on WQ. + */ + if (queue->type != LPFC_WQ) + kfree(queue->pring); + kfree(queue); return; }
commit 895427bd012c ("scsi: lpfc: NVME Initiator: Base modifications") binds the CQs and WQs ring pointer (sets it to same address on both). lpfc_create_wq_cq(): ... rc = lpfc_cq_create(phba, cq, eq, <...>) ... rc = lpfc_wq_create(phba, wq, cq, qtype); ... /* Bind this CQ/WQ to the NVME ring */ pring = wq->pring; ... cq->pring = pring; ... The commit frees both CQ & WQ for FCP/NVME on lpfc_sli4_queue_destroy(), which causes a double free (potential corruption or panic) on freeing the ring pointer of the second entity (CQ is first, WQ is second): lpfc_pci_remove_one() # that is, .remove / .shutdown -> lpfc_pci_remove_one_s4() -> lpfc_sli4_hba_unset() -> lpfc_sli4_queue_destroy() -> lpfc_sli4_release_queues() # Release FCP/NVME cqs -> __lpfc_sli4_release_queue() -> lpfc_sli4_queue_free() -> kfree(queue->pring) # first free -> lpfc_sli4_release_queues() # Release FCP/NVME wqs -> __lpfc_sli4_release_queue() -> lpfc_sli4_queue_free() -> kfree(queue->pring) # second free So, check for WQs in lpfc_sli4_queue_free() and do not free the pring, as it is freed before in the bound CQ. [the WQs are created only via lpfc_wq_create(), which sets struct lpfc_queue::type == LPFC_WQ. And that happens in 2 sites (lpfc_create_wq_cq() & lpfc_fof_queue_setup()), both of which bind the CQs & WQs. Thus, checking for the LPFC_WQ type correlates to whether the WQ is bound to a CQ, which is freed first.] Additional details: For reference, that binding also occurs on one other function: lpfc_fof_queue_setup(): ... rc = lpfc_cq_create(phba, phba->sli4_hba.oas_cq, <...>) ... rc = lpfc_wq_create(phba, phba->sli4_hba.oas_wq, <...>) ... /* Bind this CQ/WQ to the NVME ring */ pring = phba->sli4_hba.oas_wq->pring; ... phba->sli4_hba.oas_cq->pring = pring; And used to occur similarly on lpfc_sli4_queue_setup(), but was changed by that commit; although the problem is more related to the new freeing pattern introduced in lpfc_sli4_queue_destroy() plus the bound CQs/WQs. - /* Bind this WQ to the next FCP ring */ - pring = &psli->ring[MAX_SLI3_CONFIGURED_RINGS + fcp_wqidx]; ... - phba->sli4_hba.fcp_cq[fcp_wqidx]->pring = pring; commit 85e8a23936ab ("scsi: lpfc: Add shutdown method for kexec") made this more likely as lpfc_pci_remove_one() is called on driver shutdown (e.g., modprobe -r / rmmod). (this patch is partially based on a different patch suggested by Johannes, thus adding a Suggested-by tag for due credit.) Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> Reported-by: Junichi Nomura <j-nomura@ce.jp.nec.com> Suggested-by: Johannes Thumshirn <jthumshirn@suse.de> --- drivers/scsi/lpfc/lpfc_sli.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)