From patchwork Fri Apr 14 08:22:49 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Lukas Wunner
X-Patchwork-Id: 9680795
Date: Fri, 14 Apr 2017 10:22:49 +0200
From: Lukas Wunner
To: "Rafael J. Wysocki"
Cc: Geert Uytterhoeven, Bjorn Helgaas, Yinghai Lu, Mika Westerberg,
 Laurent Pinchart, Simon Horman, linux-pci, Linux PM list, Linux-Renesas,
 "linux-kernel@vger.kernel.org"
Subject: Re: PCI / PM: Crashes in PME scan during system suspend
Message-ID: <20170414082249.GA5417@wunner.de>
References: <2661070.8D7d40DjM3@aspire.rjw.lan>
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To: <2661070.8D7d40DjM3@aspire.rjw.lan>
User-Agent: Mutt/1.6.1 (2016-04-27)
Sender: linux-pm-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-pm@vger.kernel.org

On Tue, Feb 14, 2017 at 12:26:01PM +0100, Rafael J. Wysocki wrote:
> On Tuesday, February 14, 2017 10:31:38 AM Geert Uytterhoeven wrote:
> > Laurent Pinchart reported that r8a7790/Lager crashes during suspend tests.
> >
> > I managed to reproduce the issue on r8a7791/koelsch:
> > - It only happens during suspend tests, after writing either "platform"
> >   or "processors" to /sys/power/pm_test,
> > - It does not (or is less likely) to happen during full system suspend
> >   ("core" or "none").
> >
> > More investigation shows this happens when the PME scan runs, once per
> > second. During PME scan, the PCI host bridge (rcar-pci) registers are
> > accessed while the host bridge's module clock has already been disabled,
> > leading to a crash.
>
> OK, so clearly PME scans should be suspended before the host bridge
> registers become inaccessible.
>
> Another question, though, is whether or not PME scans are actually necessary
> on the affected platforms at all.

I'm not seeing a fix for this in linux-next, am I missing something?
Has anyone looked into it or is the issue still open?

Below is a tentative patch which moves PME polling to a freezable
workqueue, so it is frozen before the host bridge is suspended.

Geert, Laurent, could you test this?

The patch may be problematic in that pci_pme_list_scan() acquires
pci_pme_list_mutex, which is also acquired by pci_pme_active(), which
gets called when devices are suspended -- *after* the worker has been
frozen. I'm not really familiar with the freezer, can it happen that
the worker is frozen while holding the mutex? If so this would deadlock.
Rafael?

Alternative approaches would be to (a) skip devices in pci_pme_list_scan()
if their is_prepared or is_suspended flags are set, or (b) disable PME
polling via a PM notifier. The latter seems preferable performance-wise.
(To avoid checking these flags once per second.)

Best regards,

Lukas

-- >8 --
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 7904d02..d35c016 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1782,8 +1782,8 @@ static void pci_pme_list_scan(struct work_struct *work)
 		}
 	}
 	if (!list_empty(&pci_pme_list))
-		schedule_delayed_work(&pci_pme_work,
-				      msecs_to_jiffies(PME_TIMEOUT));
+		queue_delayed_work(system_freezable_wq, &pci_pme_work,
+				   msecs_to_jiffies(PME_TIMEOUT));
 	mutex_unlock(&pci_pme_list_mutex);
 }
 
@@ -1848,8 +1848,9 @@ void pci_pme_active(struct pci_dev *dev, bool enable)
 			mutex_lock(&pci_pme_list_mutex);
 			list_add(&pme_dev->list, &pci_pme_list);
 			if (list_is_singular(&pci_pme_list))
-				schedule_delayed_work(&pci_pme_work,
-						      msecs_to_jiffies(PME_TIMEOUT));
+				queue_delayed_work(system_freezable_wq,
+						   &pci_pme_work,
+						   msecs_to_jiffies(PME_TIMEOUT));
 			mutex_unlock(&pci_pme_list_mutex);
 		} else {
 			mutex_lock(&pci_pme_list_mutex);
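
For illustration only, alternative (a) above can be modelled in user space
roughly as follows. This is a sketch under stated assumptions, not kernel
code and not part of the patch: struct model_dev and model_pme_scan() are
hypothetical names, and the two booleans merely stand in for the
is_prepared and is_suspended flags mentioned above. It shows only the
per-device skip that the scan would perform on every pass.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical user-space stand-in for a device on the PME poll list.
 * The two flags model the is_prepared / is_suspended flags named in
 * the mail; nothing here is a real kernel structure.
 */
struct model_dev {
	bool is_prepared;	/* device has entered the "prepare" phase */
	bool is_suspended;	/* device has been suspended */
	unsigned int polls;	/* how often this device was actually polled */
};

/*
 * Model one pass of the periodic scan under alternative (a): devices
 * that are prepared or suspended are skipped, so their (possibly
 * inaccessible) registers are never touched. Returns the number of
 * devices that were polled in this pass.
 */
static size_t model_pme_scan(struct model_dev *devs, size_t n)
{
	size_t polled = 0;

	assert(devs != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		/* This is the flag check that would run once per second. */
		if (devs[i].is_prepared || devs[i].is_suspended)
			continue;

		devs[i].polls++;	/* stands in for the PME status check */
		polled++;
	}

	return polled;
}
```

The flag test sits inside the loop, so it is paid for every listed device
on every pass; that is the cost alternative (b) would avoid by stopping
the polling altogether around suspend.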