From patchwork Wed Nov 11 12:00:31 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sreekanth Reddy X-Patchwork-Id: 7595461 Return-Path: X-Original-To: patchwork-linux-scsi@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork1.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.136]) by patchwork1.web.kernel.org (Postfix) with ESMTP id 30A2C9F2E9 for ; Wed, 11 Nov 2015 12:06:31 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 0EA112073F for ; Wed, 11 Nov 2015 12:06:30 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id B5F212075F for ; Wed, 11 Nov 2015 12:06:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752935AbbKKMC3 (ORCPT ); Wed, 11 Nov 2015 07:02:29 -0500 Received: from mail-pa0-f46.google.com ([209.85.220.46]:33559 "EHLO mail-pa0-f46.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752928AbbKKMC0 (ORCPT ); Wed, 11 Nov 2015 07:02:26 -0500 Received: by pabfh17 with SMTP id fh17so29737240pab.0 for ; Wed, 11 Nov 2015 04:02:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=avagotech.com; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=N/4JzINSAjNHeQ8iL/N5Jf7fxKB+/1pbIOt/9dvtfmw=; b=DayR9AnBTgSxrlv0Bbt1N1lOeANQWDeDt8ibdbQhxRqMedcP1tZGLarALUkzglLEVp 3wUE9UouoLnWu8SVsleD+X0LHlbbeScO/yN5LvQt7pxvYJek3xqD7bgYWD52fhAiVY9O LsxIzjLbx3cuvBYFf07c0t8yL4Qtl/XCbD4Yk= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=N/4JzINSAjNHeQ8iL/N5Jf7fxKB+/1pbIOt/9dvtfmw=; b=ch/TdhhYbglnMWRtzKSnAP/sTDdfucu8rt+ugV9FbKIhMlqlFh7omqc+J9JkKxD4j5 7Th28fBcNgLJ+qasPDNcDE0PUnthHFE7Dd55kBB+rNKTlTPE9yJn96v5E10jMqueLQMf fAn9fvJ542aoQapH6rzxIb1AgdyoXnSFy7DeFaCJ9vE6w8e5nUvbhRWQi3oGxxmppl1v IFqQQfXe0P2n85PPkI41JQ3rlGkgrBHTpZkettw/ca35KQQEkEqKYsB7IgPg3ldVxDq+ lGlc28sYmHJ6Uv80wmcZz4o8lE3Z6Ye89bwvofrHCHirXSEFH/UOvbaUGbOqWDF6DnCX UgAQ== X-Gm-Message-State: ALoCoQm1JeAofnDkskoa45e/yVE5nxVIYyiJ7HGdI5fXo3RuRKtWkfE31LVmTcD7hJ3GSl7msYHy X-Received: by 10.66.235.40 with SMTP id uj8mr13350151pac.151.1447243345103; Wed, 11 Nov 2015 04:02:25 -0800 (PST) Received: from host1.lsi.com ([192.19.239.250]) by smtp.gmail.com with ESMTPSA id zi1sm9180851pbc.10.2015.11.11.04.02.20 (version=TLS1_2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Wed, 11 Nov 2015 04:02:24 -0800 (PST) From: Sreekanth Reddy X-Google-Original-From: Sreekanth Reddy To: jejb@kernel.org Cc: martin.petersen@oracle.com, linux-scsi@vger.kernel.org, JBottomley@Parallels.com, Sathya.Prakash@avagotech.com, kashyap.desai@avagotech.com, linux-kernel@vger.kernel.org, hch@infradead.org, chaitra.basappa@avagotech.com, suganath-prabu.subramani@avagotech.com, Sreekanth Reddy , Sreekanth Reddy Subject: [PATCH RESEND 15/25] mpt3sas: Refcount fw_events and fix unsafe list usage Date: Wed, 11 Nov 2015 17:30:31 +0530 Message-Id: <1447243241-10912-16-git-send-email-Sreekanth.Reddy@avagotech.com> X-Mailer: git-send-email 2.0.2 In-Reply-To: <1447243241-10912-1-git-send-email-Sreekanth.Reddy@avagotech.com> References: <1447243241-10912-1-git-send-email-Sreekanth.Reddy@avagotech.com> Sender: linux-scsi-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org X-Spam-Status: No, score=-7.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, RCVD_IN_DNSWL_HI,RP_MATCHES_RCVD,T_DKIM_INVALID,UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Sreekanth Reddy The fw_event_work struct is concurrently referenced at shutdown, so add a refcount to protect it, and refactor the code to use it. Additionally, refactor _scsih_fw_event_cleanup_queue() such that it no longer iterates over the list without holding the lock, since _firmware_event_work() concurrently deletes items from the list. This patch is ported from below mpt2sas driver patch, these changes also required for mpt3sas driver and we tested this on mpt3sas driver also, 'commit 008549f6e8a1dc4aeea4a8d64184909786b27713 ("mpt2sas: Refcount fw_events and fix unsafe list usage")' Signed-off-by: Sreekanth Reddy --- drivers/scsi/mpt3sas/mpt3sas_scsih.c | 116 ++++++++++++++++++++++++++++------- 1 file changed, 94 insertions(+), 22 deletions(-) diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c index 5dbf214..436e65e 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c @@ -200,9 +200,37 @@ struct fw_event_work { u8 VP_ID; u8 ignore; u16 event; + struct kref refcount; char event_data[0] __aligned(4); }; +static void fw_event_work_free(struct kref *r) +{ + kfree(container_of(r, struct fw_event_work, refcount)); +} + +static void fw_event_work_get(struct fw_event_work *fw_work) +{ + kref_get(&fw_work->refcount); +} + +static void fw_event_work_put(struct fw_event_work *fw_work) +{ + kref_put(&fw_work->refcount, fw_event_work_free); +} + +static struct fw_event_work *alloc_fw_event_work(int len) +{ + struct fw_event_work *fw_event; + + fw_event = kzalloc(sizeof(*fw_event) + len, GFP_ATOMIC); + if (!fw_event) + return NULL; + + kref_init(&fw_event->refcount); + return fw_event; +} + /** * struct _scsi_io_transfer - scsi io transfer * @handle: sas device handle (assigned by firmware) @@ -2598,32 +2626,36 @@ _scsih_fw_event_add(struct MPT3SAS_ADAPTER *ioc, struct fw_event_work *fw_event) return; spin_lock_irqsave(&ioc->fw_event_lock, flags); + fw_event_work_get(fw_event); INIT_LIST_HEAD(&fw_event->list); list_add_tail(&fw_event->list, &ioc->fw_event_list); INIT_WORK(&fw_event->work, _firmware_event_work); + fw_event_work_get(fw_event); queue_work(ioc->firmware_event_thread, &fw_event->work); spin_unlock_irqrestore(&ioc->fw_event_lock, flags); } /** - * _scsih_fw_event_free - delete fw_event + * _scsih_fw_event_del_from_list - delete fw_event from the list * @ioc: per adapter object * @fw_event: object describing the event * Context: This function will acquire ioc->fw_event_lock. * - * This removes firmware event object from link list, frees associated memory. + * If the fw_event is on the fw_event_list, remove it and do a put. * * Return nothing. */ static void -_scsih_fw_event_free(struct MPT3SAS_ADAPTER *ioc, struct fw_event_work +_scsih_fw_event_del_from_list(struct MPT3SAS_ADAPTER *ioc, struct fw_event_work *fw_event) { unsigned long flags; spin_lock_irqsave(&ioc->fw_event_lock, flags); - list_del(&fw_event->list); - kfree(fw_event); + if (!list_empty(&fw_event->list)) { + list_del_init(&fw_event->list); + fw_event_work_put(fw_event); + } spin_unlock_irqrestore(&ioc->fw_event_lock, flags); } @@ -2640,17 +2672,19 @@ mpt3sas_send_trigger_data_event(struct MPT3SAS_ADAPTER *ioc, struct SL_WH_TRIGGERS_EVENT_DATA_T *event_data) { struct fw_event_work *fw_event; + u16 sz; if (ioc->is_driver_loading) return; - fw_event = kzalloc(sizeof(*fw_event) + sizeof(*event_data), - GFP_ATOMIC); + sz = sizeof(*event_data); + fw_event = alloc_fw_event_work(sz); if (!fw_event) return; fw_event->event = MPT3SAS_PROCESS_TRIGGER_DIAG; fw_event->ioc = ioc; memcpy(fw_event->event_data, event_data, sizeof(*event_data)); _scsih_fw_event_add(ioc, fw_event); + fw_event_work_put(fw_event); } /** @@ -2666,12 +2700,13 @@ _scsih_error_recovery_delete_devices(struct MPT3SAS_ADAPTER *ioc) if (ioc->is_driver_loading) return; - fw_event = kzalloc(sizeof(struct fw_event_work), GFP_ATOMIC); + fw_event = alloc_fw_event_work(0); if (!fw_event) return; fw_event->event = MPT3SAS_REMOVE_UNRESPONDING_DEVICES; fw_event->ioc = ioc; _scsih_fw_event_add(ioc, fw_event); + fw_event_work_put(fw_event); } /** @@ -2685,12 +2720,29 @@ mpt3sas_port_enable_complete(struct MPT3SAS_ADAPTER *ioc) { struct fw_event_work *fw_event; - fw_event = kzalloc(sizeof(struct fw_event_work), GFP_ATOMIC); + fw_event = alloc_fw_event_work(0); if (!fw_event) return; fw_event->event = MPT3SAS_PORT_ENABLE_COMPLETE; fw_event->ioc = ioc; _scsih_fw_event_add(ioc, fw_event); + fw_event_work_put(fw_event); +} + +static struct fw_event_work *dequeue_next_fw_event(struct MPT3SAS_ADAPTER *ioc) +{ + unsigned long flags; + struct fw_event_work *fw_event = NULL; + + spin_lock_irqsave(&ioc->fw_event_lock, flags); + if (!list_empty(&ioc->fw_event_list)) { + fw_event = list_first_entry(&ioc->fw_event_list, + struct fw_event_work, list); + list_del_init(&fw_event->list); + } + spin_unlock_irqrestore(&ioc->fw_event_lock, flags); + + return fw_event; } /** @@ -2705,17 +2757,25 @@ mpt3sas_port_enable_complete(struct MPT3SAS_ADAPTER *ioc) static void _scsih_fw_event_cleanup_queue(struct MPT3SAS_ADAPTER *ioc) { - struct fw_event_work *fw_event, *next; + struct fw_event_work *fw_event; if (list_empty(&ioc->fw_event_list) || !ioc->firmware_event_thread || in_interrupt()) return; - list_for_each_entry_safe(fw_event, next, &ioc->fw_event_list, list) { - if (cancel_delayed_work_sync(&fw_event->delayed_work)) { - _scsih_fw_event_free(ioc, fw_event); - continue; - } + while ((fw_event = dequeue_next_fw_event(ioc))) { + /* + * Wait on the fw_event to complete. If this returns 1, then + * the event was never executed, and we need a put for the + * reference the delayed_work had on the fw_event. + * + * If it did execute, we wait for it to finish, and the put will + * happen from _firmware_event_work() + */ + if (cancel_delayed_work_sync(&fw_event->delayed_work)) + fw_event_work_put(fw_event); + + fw_event_work_put(fw_event); } } @@ -4242,13 +4302,14 @@ _scsih_send_event_to_turn_on_pfa_led(struct MPT3SAS_ADAPTER *ioc, u16 handle) { struct fw_event_work *fw_event; - fw_event = kzalloc(sizeof(struct fw_event_work), GFP_ATOMIC); + fw_event = alloc_fw_event_work(0); if (!fw_event) return; fw_event->event = MPT3SAS_TURN_ON_PFA_LED; fw_event->device_handle = handle; fw_event->ioc = ioc; _scsih_fw_event_add(ioc, fw_event); + fw_event_work_put(fw_event); } /** @@ -7498,10 +7559,11 @@ mpt3sas_scsih_reset_handler(struct MPT3SAS_ADAPTER *ioc, int reset_phase) static void _mpt3sas_fw_work(struct MPT3SAS_ADAPTER *ioc, struct fw_event_work *fw_event) { + _scsih_fw_event_del_from_list(ioc, fw_event); + /* the queue is being flushed so ignore this event */ - if (ioc->remove_host || - ioc->pci_error_recovery) { - _scsih_fw_event_free(ioc, fw_event); + if (ioc->remove_host || ioc->pci_error_recovery) { + fw_event_work_put(fw_event); return; } @@ -7512,8 +7574,16 @@ _mpt3sas_fw_work(struct MPT3SAS_ADAPTER *ioc, struct fw_event_work *fw_event) fw_event->event_data); break; case MPT3SAS_REMOVE_UNRESPONDING_DEVICES: - while (scsi_host_in_recovery(ioc->shost) || ioc->shost_recovery) + while (scsi_host_in_recovery(ioc->shost) || + ioc->shost_recovery) { + /* + * If we're unloading, bail. Otherwise, this can become + * an infinite loop. + */ + if (ioc->remove_host) + goto out; ssleep(1); + } _scsih_remove_unresponding_sas_devices(ioc); _scsih_scan_for_devices_after_reset(ioc); break; @@ -7558,7 +7628,8 @@ _mpt3sas_fw_work(struct MPT3SAS_ADAPTER *ioc, struct fw_event_work *fw_event) _scsih_sas_ir_operation_status_event(ioc, fw_event); break; } - _scsih_fw_event_free(ioc, fw_event); +out: + fw_event_work_put(fw_event); } /** @@ -7720,7 +7791,7 @@ mpt3sas_scsih_event_callback(struct MPT3SAS_ADAPTER *ioc, u8 msix_index, } sz = le16_to_cpu(mpi_reply->EventDataLength) * 4; - fw_event = kzalloc(sizeof(*fw_event) + sz, GFP_ATOMIC); + fw_event = alloc_fw_event_work(sz); if (!fw_event) { pr_err(MPT3SAS_FMT "failure at %s:%d/%s()!\n", ioc->name, __FILE__, __LINE__, __func__); @@ -7733,6 +7804,7 @@ mpt3sas_scsih_event_callback(struct MPT3SAS_ADAPTER *ioc, u8 msix_index, fw_event->VP_ID = mpi_reply->VP_ID; fw_event->event = event; _scsih_fw_event_add(ioc, fw_event); + fw_event_work_put(fw_event); return 1; }