From patchwork Fri Oct 21 12:14:43 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Prarit Bhargava
X-Patchwork-Id: 9388781
X-Patchwork-Delegate: bhelgaas@google.com
From: Prarit Bhargava
To: linux-pci@vger.kernel.org
Cc: Prarit Bhargava, Alex Williamson, David Arcari, Myron Stowe, Bjorn Helgaas
Subject: [RFE PATCH] pci: Do not enable intx on MSI-capable devices on shutdown
Date: Fri, 21 Oct 2016 08:14:43 -0400
Message-Id: <1477052083-13815-1-git-send-email-prarit@redhat.com>
Sender: linux-pci-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-pci@vger.kernel.org

The following unhandled IRQ warning is seen during shutdown:

irq 16: nobody cared (try booting with the "irqpoll" option)
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.8.2-1.el7_UNSUPPORTED.x86_64 #1
Hardware name: Hewlett-Packard HP Z820 Workstation/158B, BIOS J63 v03.90 06/01/2016
 0000000000000000 ffff88041f803e70 ffffffff81333bd5 ffff88041cb78200
 ffff88041cb7829c ffff88041f803e98 ffffffff810d9465 ffff88041cb78200
 0000000000000000 0000000000000028 ffff88041f803ed0 ffffffff810d97bf
Call Trace:
 [] dump_stack+0x63/0x8e
 [] __report_bad_irq+0x35/0xd0
 [] note_interrupt+0x20f/0x260
 [] handle_irq_event_percpu+0x45/0x60
 [] handle_irq_event+0x2c/0x50
 [] handle_fasteoi_irq+0x8a/0x150
 [] handle_irq+0xab/0x130
 [] ? _local_bh_enable+0x21/0x50
 [] do_IRQ+0x4d/0xd0
 [] common_interrupt+0x82/0x82
 [] ? cpuidle_enter_state+0xc1/0x280
 [] ? cpuidle_enter_state+0xb4/0x280
 [] cpuidle_enter+0x17/0x20
 [] cpu_startup_entry+0x220/0x3a0
 [] rest_init+0x77/0x80
 [] start_kernel+0x495/0x4a2
 [] ? set_init_arg+0x55/0x55
 [] ? early_idt_handler_array+0x120/0x120
 [] x86_64_start_reservations+0x2a/0x2c
 [] x86_64_start_kernel+0x13d/0x14c

pci_device_shutdown() is called on each PCI device, and does

	if (drv && drv->shutdown)
		drv->shutdown(pci_dev);
	pci_msi_shutdown(pci_dev);
	pci_msix_shutdown(pci_dev);

The pci_msi_shutdown() and pci_msix_shutdown() functions both call
pci_intx_for_msi(), which enables the intx interrupt independently of
the driver. The driver still thinks it is using MSI/X, and the result
is the above stack trace.

We have seen this at Red Hat on various drivers: nouveau, ahci, and
pcieport (so far). A Google search for "unhandled irq 16" yields many
results reporting similar behavior during shutdown, indicating that this
problem is widespread.

I can cause this to happen on a "stable" system by adding a 3 second
delay in pci_device_shutdown(), which causes the number of spurious
interrupts to exceed the 100000 limit and display the warning above.
Also note that with the 3 second delay added, NVIDIA devices with
device ID 0x0FF* hit this problem 100% of the time.

darcari noticed that removing the pci_intx_for_msi() call resulted in a
stable system. After further discussions with Myron and Alex, Alex came
up with the idea of keeping intx disabled during shutdown, which is
implemented below.

----8<----

The following unhandled IRQ warning is seen during shutdown:

irq 16: nobody cared (try booting with the "irqpoll" option)
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.8.2-1.el7_UNSUPPORTED.x86_64 #1
Hardware name: Hewlett-Packard HP Z820 Workstation/158B, BIOS J63 v03.90 06/01/2016
 0000000000000000 ffff88041f803e70 ffffffff81333bd5 ffff88041cb78200
 ffff88041cb7829c ffff88041f803e98 ffffffff810d9465 ffff88041cb78200
 0000000000000000 0000000000000028 ffff88041f803ed0 ffffffff810d97bf
Call Trace:
 [] dump_stack+0x63/0x8e
 [] __report_bad_irq+0x35/0xd0
 [] note_interrupt+0x20f/0x260
 [] handle_irq_event_percpu+0x45/0x60
 [] handle_irq_event+0x2c/0x50
 [] handle_fasteoi_irq+0x8a/0x150
 [] handle_irq+0xab/0x130
 [] ? _local_bh_enable+0x21/0x50
 [] do_IRQ+0x4d/0xd0
 [] common_interrupt+0x82/0x82
 [] ? cpuidle_enter_state+0xc1/0x280
 [] ? cpuidle_enter_state+0xb4/0x280
 [] cpuidle_enter+0x17/0x20
 [] cpu_startup_entry+0x220/0x3a0
 [] rest_init+0x77/0x80
 [] start_kernel+0x495/0x4a2
 [] ? set_init_arg+0x55/0x55
 [] ? early_idt_handler_array+0x120/0x120
 [] x86_64_start_reservations+0x2a/0x2c
 [] x86_64_start_kernel+0x13d/0x14c

This occurs because the pci_msi_shutdown() and pci_msix_shutdown()
functions enable the legacy intx interrupt even though the device and
driver were not configured for legacy intx.

This patch blocks the enabling of intx during system shutdown or reboot.

Signed-off-by: Prarit Bhargava
Cc: Alex Williamson
Cc: David Arcari
Cc: Myron Stowe
Cc: Bjorn Helgaas
---
 drivers/pci/msi.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index bfdd0744b686..915cc29797f9 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -910,7 +910,8 @@ void pci_msi_shutdown(struct pci_dev *dev)
 	desc = first_pci_msi_entry(dev);
 
 	pci_msi_set_enable(dev, 0);
-	pci_intx_for_msi(dev, 1);
+	if (system_state == SYSTEM_RUNNING || system_state == SYSTEM_BOOTING)
+		pci_intx_for_msi(dev, 1);
 	dev->msi_enabled = 0;
 
 	/* Return the device with MSI unmasked as initial states */
@@ -1024,7 +1025,8 @@ void pci_msix_shutdown(struct pci_dev *dev)
 	}
 
 	pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_ENABLE, 0);
-	pci_intx_for_msi(dev, 1);
+	if (system_state == SYSTEM_RUNNING || system_state == SYSTEM_BOOTING)
+		pci_intx_for_msi(dev, 1);
 	dev->msix_enabled = 0;
 	pcibios_alloc_irq(dev);
 }
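
Illustration only, not part of the patch: a minimal user-space sketch of
the gating that the two hunks add, for readers who want to see the
intended behaviour in isolation. Everything prefixed fake_ or FAKE_ is a
hypothetical stand-in invented for this sketch and is not a kernel API.
The only thing taken from the patch is the shape of the condition: intx
is re-enabled only while the system is booting or running.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's system_state values. */
enum fake_system_state {
	FAKE_SYSTEM_BOOTING,
	FAKE_SYSTEM_RUNNING,
	FAKE_SYSTEM_HALT,
	FAKE_SYSTEM_POWER_OFF,
	FAKE_SYSTEM_RESTART,
};

/* Hypothetical device model: only the two bits this sketch cares about. */
struct fake_dev {
	bool msi_enabled;
	bool intx_enabled;
};

/* Same shape as the condition added in both hunks of the patch. */
bool fake_may_enable_intx(enum fake_system_state state)
{
	return state == FAKE_SYSTEM_RUNNING || state == FAKE_SYSTEM_BOOTING;
}

/*
 * Models the MSI teardown order: MSI is always switched off, but intx
 * is only switched back on when the system is not going down.
 */
void fake_msi_shutdown(struct fake_dev *dev, enum fake_system_state state)
{
	assert(dev);

	if (!dev->msi_enabled)
		return;

	dev->msi_enabled = false;
	if (fake_may_enable_intx(state))
		dev->intx_enabled = true;
}
```

Under these assumptions, calling fake_msi_shutdown() with
FAKE_SYSTEM_POWER_OFF leaves intx_enabled false, while calling it with
FAKE_SYSTEM_RUNNING (for example, a driver unbind on a live system)
turns intx back on as before.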