From patchwork Fri Jan 12 13:10:38 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Rafael J. Wysocki" X-Patchwork-Id: 10160811 X-Patchwork-Delegate: rjw@sisk.pl Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id D4AA66029B for ; Fri, 12 Jan 2018 13:14:05 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C4A61289E6 for ; Fri, 12 Jan 2018 13:14:05 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id B9902289F1; Fri, 12 Jan 2018 13:14:05 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 30295289E6 for ; Fri, 12 Jan 2018 13:14:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S933529AbeALNNj (ORCPT ); Fri, 12 Jan 2018 08:13:39 -0500 Received: from cloudserver094114.home.pl ([79.96.170.134]:49880 "EHLO cloudserver094114.home.pl" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S933149AbeALNNi (ORCPT ); Fri, 12 Jan 2018 08:13:38 -0500 Received: from 79.184.254.76.ipv4.supernova.orange.pl (79.184.254.76) (HELO aspire.rjw.lan) by serwer1319399.home.pl (79.96.170.134) with SMTP (IdeaSmtpServer 0.82) id dcbb724cfccfb59e; Fri, 12 Jan 2018 14:13:36 +0100 From: "Rafael J. Wysocki" To: Linux PM Cc: Geert Uytterhoeven , Ulf Hansson , LKML Subject: [PATCH 1/2] PM / genpd: Stop/start devices without pm_runtime_force_suspend/resume() Date: Fri, 12 Jan 2018 14:10:38 +0100 Message-ID: <1709653.TNnOthRhLp@aspire.rjw.lan> In-Reply-To: <4928977.5hd79RbQlp@aspire.rjw.lan> References: <4928977.5hd79RbQlp@aspire.rjw.lan> MIME-Version: 1.0 Sender: linux-pm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Rafael J. Wysocki There are problems with calling pm_runtime_force_suspend/resume() to "stop" and "start" devices in genpd_finish_suspend() and genpd_resume_noirq() (and in analogous hibernation-specific genpd callbacks) after commit 122a22377a3d (PM / Domains: Stop/start devices during system PM suspend/resume in genpd) as those routines do much more than just "stopping" and "starting" devices (which was the stated purpose of that commit) unnecessarily and may not play well with system-wide PM driver callbacks. First, consider the pm_runtime_force_suspend() in genpd_finish_suspend(). If the current runtime PM status of the device is "suspended", that function most likely does the right thing by ignoring the device, because it should have been "stopped" already and whatever needed to be done to deactivate it shoud have been done. In turn, if the runtime PM status of the device is "active", genpd_runtime_suspend() is called for it (indirectly) and (1) runs the ->runtime_suspend callback provided by the device's driver (assuming no bus type with ->runtime_suspend of its own), (2) "stops" the device and (3) checks if the domain can be powered down, and then (4) the device's runtime PM status is changed to "suspended". Out of the four actions above (1) is not necessary and it may be outright harmful, (3) is pointless and (4) is questionable. The only operation that needs to be carried out here is (2). The reason why (1) is not necessary is because the system-wide PM callbacks provided by the device driver for the transition in question have been run and they should have taken care of the driver's part of device suspend already. Moreover, it may be harmful, because the ->runtime_suspend callback may want to access the device which is partially suspended at that point and may not be responsive. Also, system-wide PM callbacks may have been run already (in the previous phases of the system transition under way) for the device's parent or for its supplier devices (if any) and the device may not be accessible because of that. There also is no reason to do (3), because genpd_finish_suspend() will repeat it anyway, and (4) potentially causes confusion to ensue during the subsequent system transition to the working state. Consider pm_runtime_force_resume() in genpd_resume_noirq() now. It runs genpd_runtime_resume() for all devices with runtime PM status set to "suspended", which includes all of the devices whose runtime PM status was changed by pm_runtime_force_suspend() before and may include some devices already suspended when the pm_runtime_force_suspend() was running, which may be confusing. The genpd_runtime_resume() first tries to power up the domain, which (again) is pointless, because genpd_resume_noirq() has done that already. Then, it "starts" the device and runs the ->runtime_resume callback (from the driver, say) for it. If all is well, the device is left with the runtime PM status set to "active". Unfortunately, running the driver's ->runtime_resume callback before its system-wide PM callbacks and possibly before some system-wide PM callbacks of the parent device's driver (let alone supplier drivers) is asking for trouble, especially if the device had been suspended before pm_runtime_force_suspend() ran previously or if the callbacks in question expect to be run back-to-back with their suspend-side counterparts. It also should not be necessary, because the system-wide PM driver callbacks that will be invoked for the device subsequently should take care of resuming it just fine. [Running the driver's ->runtime_resume callback in the "noirq" phase of the transition to the working state may be problematic even for devices whose drivers do use pm_runtime_force_resume() in (or as) their system-wide PM callbacks if they have suppliers other than their parents, because it may cause the supplier to be resumed after the consumer in some cases.] Because of the above, modify genpd as follows: 1. Change genpd_finish_suspend() to only "stop" devices with runtime PM status set to "active" (without invoking runtime PM callbacks for them, changing their runtime PM status and so on). That doesn't change the handling of devices whose drivers use pm_runtime_force_suspend/resume() in (or as) their system-wide PM callbacks and addresses the issues described above for the other devices. 2. Change genpd_resume_noirq() to only "start" devices with runtime PM status set to "active" (without invoking runtime PM callbacks for them, changing their runtime PM status and so on). Again, that doesn't change the handling of devices whose drivers use pm_runtime_force_suspend/resume() in (or as) their system-wide PM callbacks and addresses the described issues for the other devices. Devices with runtime PM status set to "suspended" are not started with the assumption that they will be resumed later, either by pm_runtime_force_resume() or via runtime PM. 3. Change genpd_restore_noirq() to follow genpd_resume_noirq(). That causes devices already suspended before hibernation to be left alone (which also is the case without the change) and avoids running the ->runtime_resume driver callback too early for the other devices. 4. Change genpd_freeze_noirq() and genpd_thaw_noirq() in accordance with the above modifications. Fixes: 122a22377a3d (PM / Domains: Stop/start devices during system PM suspend/resume in genpd) Signed-off-by: Rafael J. Wysocki Acked-by: Ulf Hansson --- drivers/base/power/domain.c | 25 +++++++++++++++---------- 1 file changed, 15 insertions(+), 10 deletions(-) Index: linux-pm/drivers/base/power/domain.c =================================================================== --- linux-pm.orig/drivers/base/power/domain.c +++ linux-pm/drivers/base/power/domain.c @@ -1048,8 +1048,9 @@ static int genpd_finish_suspend(struct d if (dev->power.wakeup_path && genpd_is_active_wakeup(genpd)) return 0; - if (genpd->dev_ops.stop && genpd->dev_ops.start) { - ret = pm_runtime_force_suspend(dev); + if (genpd->dev_ops.stop && genpd->dev_ops.start && + !pm_runtime_status_suspended(dev)) { + ret = genpd_stop_dev(genpd, dev); if (ret) { if (poweroff) pm_generic_restore_noirq(dev); @@ -1106,8 +1107,9 @@ static int genpd_resume_noirq(struct dev genpd->suspended_count--; genpd_unlock(genpd); - if (genpd->dev_ops.stop && genpd->dev_ops.start) { - ret = pm_runtime_force_resume(dev); + if (genpd->dev_ops.stop && genpd->dev_ops.start && + !pm_runtime_status_suspended(dev)) { + ret = genpd_start_dev(genpd, dev); if (ret) return ret; } @@ -1139,8 +1141,9 @@ static int genpd_freeze_noirq(struct dev if (ret) return ret; - if (genpd->dev_ops.stop && genpd->dev_ops.start) - ret = pm_runtime_force_suspend(dev); + if (genpd->dev_ops.stop && genpd->dev_ops.start && + !pm_runtime_status_suspended(dev)) + ret = genpd_stop_dev(genpd, dev); return ret; } @@ -1163,8 +1166,9 @@ static int genpd_thaw_noirq(struct devic if (IS_ERR(genpd)) return -EINVAL; - if (genpd->dev_ops.stop && genpd->dev_ops.start) { - ret = pm_runtime_force_resume(dev); + if (genpd->dev_ops.stop && genpd->dev_ops.start && + !pm_runtime_status_suspended(dev)) { + ret = genpd_start_dev(genpd, dev); if (ret) return ret; } @@ -1221,8 +1225,9 @@ static int genpd_restore_noirq(struct de genpd_sync_power_on(genpd, true, 0); genpd_unlock(genpd); - if (genpd->dev_ops.stop && genpd->dev_ops.start) { - ret = pm_runtime_force_resume(dev); + if (genpd->dev_ops.stop && genpd->dev_ops.start && + !pm_runtime_status_suspended(dev)) { + ret = genpd_start_dev(genpd, dev); if (ret) return ret; }