Message ID | b8a7f4af2b73f6b506ad8ddee59d747cbf834606.1695025365.git.lukas@wunner.de (mailing list archive) |
---|---|
State | Accepted |
Delegated to: | Bjorn Helgaas |
Headers | show |
Series | PCI/sysfs: Protect driver's D3cold preference from user space | expand |
Hi Lukas, On Mon, Sep 18, 2023 at 02:48:01PM +0200, Lukas Wunner wrote: > struct pci_dev contains two flags which govern whether the device may > suspend to D3cold: > > * no_d3cold provides an opt-out for drivers (e.g. if a device is known > to not wake from D3cold) > > * d3cold_allowed provides an opt-out for user space (default is true, > user space may set to false) > > Since commit 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend"), > the user space setting overwrites the driver setting. Essentially user > space is trusted to know better than the driver whether D3cold is > working. > > That feels unsafe and wrong. Assume that the change was introduced > inadvertently and do not overwrite no_d3cold when d3cold_allowed is > modified. Instead, consider d3cold_allowed in addition to no_d3cold > when choosing a suspend state for the device. > > That way, user space may opt out of D3cold if the driver hasn't, but it > may no longer force an opt in if the driver has opted out. Makes sense. I just wonder should the sysfs write fail from userspace perspective if the driver has opted out and userspace tries to force it? Or it does that already?
On 9/18/2023 08:07, Mika Westerberg wrote: > Hi Lukas, > > On Mon, Sep 18, 2023 at 02:48:01PM +0200, Lukas Wunner wrote: >> struct pci_dev contains two flags which govern whether the device may >> suspend to D3cold: >> >> * no_d3cold provides an opt-out for drivers (e.g. if a device is known >> to not wake from D3cold) >> >> * d3cold_allowed provides an opt-out for user space (default is true, >> user space may set to false) >> >> Since commit 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend"), >> the user space setting overwrites the driver setting. Essentially user >> space is trusted to know better than the driver whether D3cold is >> working. >> >> That feels unsafe and wrong. Assume that the change was introduced >> inadvertently and do not overwrite no_d3cold when d3cold_allowed is >> modified. Instead, consider d3cold_allowed in addition to no_d3cold >> when choosing a suspend state for the device. >> >> That way, user space may opt out of D3cold if the driver hasn't, but it >> may no longer force an opt in if the driver has opted out. > > Makes sense. I just wonder should the sysfs write fail from userspace > perspective if the driver has opted out and userspace tries to force it? > Or it does that already? What's the history behind why userspace is allowed to opt a device out of D3cold in the first place? It feels like it should have been a debugging only thing to me.
On Mon, Sep 18, 2023 at 08:14:21AM -0500, Mario Limonciello wrote: > On 9/18/2023 08:07, Mika Westerberg wrote: > > On Mon, Sep 18, 2023 at 02:48:01PM +0200, Lukas Wunner wrote: > > > struct pci_dev contains two flags which govern whether the device may > > > suspend to D3cold: > > > > > > * no_d3cold provides an opt-out for drivers (e.g. if a device is known > > > to not wake from D3cold) > > > > > > * d3cold_allowed provides an opt-out for user space (default is true, > > > user space may set to false) > > > > > > Since commit 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend"), > > > the user space setting overwrites the driver setting. Essentially user > > > space is trusted to know better than the driver whether D3cold is > > > working. > > > > > > That feels unsafe and wrong. Assume that the change was introduced > > > inadvertently and do not overwrite no_d3cold when d3cold_allowed is > > > modified. Instead, consider d3cold_allowed in addition to no_d3cold > > > when choosing a suspend state for the device. > > > > > > That way, user space may opt out of D3cold if the driver hasn't, but it > > > may no longer force an opt in if the driver has opted out. > > > > Makes sense. I just wonder should the sysfs write fail from userspace > > perspective if the driver has opted out and userspace tries to force it? > > Or it does that already? > > What's the history behind why userspace is allowed to opt a device out of > D3cold in the first place? > > It feels like it should have been a debugging only thing to me. That's a fair question. Apparently the default for d3cold_allowed was originally "false" and user space could opt in to D3cold. Then commit 4f9c1397e2e8 ("PCI/PM: Enable D3/D3cold by default for most devices") changed the default to "true". That was 11 years ago. I agree that today this should all work automatically and a user space option to disable D3cold on a per-device basis only really makes sense as a debugging aid, hence belongs in debugfs. Thanks, Lukas
On 9/18/2023 08:24, Lukas Wunner wrote: > On Mon, Sep 18, 2023 at 08:14:21AM -0500, Mario Limonciello wrote: >> On 9/18/2023 08:07, Mika Westerberg wrote: >>> On Mon, Sep 18, 2023 at 02:48:01PM +0200, Lukas Wunner wrote: >>>> struct pci_dev contains two flags which govern whether the device may >>>> suspend to D3cold: >>>> >>>> * no_d3cold provides an opt-out for drivers (e.g. if a device is known >>>> to not wake from D3cold) >>>> >>>> * d3cold_allowed provides an opt-out for user space (default is true, >>>> user space may set to false) >>>> >>>> Since commit 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend"), >>>> the user space setting overwrites the driver setting. Essentially user >>>> space is trusted to know better than the driver whether D3cold is >>>> working. >>>> >>>> That feels unsafe and wrong. Assume that the change was introduced >>>> inadvertently and do not overwrite no_d3cold when d3cold_allowed is >>>> modified. Instead, consider d3cold_allowed in addition to no_d3cold >>>> when choosing a suspend state for the device. >>>> >>>> That way, user space may opt out of D3cold if the driver hasn't, but it >>>> may no longer force an opt in if the driver has opted out. >>> >>> Makes sense. I just wonder should the sysfs write fail from userspace >>> perspective if the driver has opted out and userspace tries to force it? >>> Or it does that already? >> >> What's the history behind why userspace is allowed to opt a device out of >> D3cold in the first place? >> >> It feels like it should have been a debugging only thing to me. > > That's a fair question. > > Apparently the default for d3cold_allowed was originally "false" > and user space could opt in to D3cold. Then commit 4f9c1397e2e8 > ("PCI/PM: Enable D3/D3cold by default for most devices") changed > the default to "true". That was 11 years ago. > > I agree that today this should all work automatically and a > user space option to disable D3cold on a per-device basis only > really makes sense as a debugging aid, hence belongs in debugfs. > Thanks. Then perhaps as part of moving it to debugfs it makes sense to simplify the logic. IE also drop the d3cold_allowed member from struct pci_dev and instead make the debugfs item reflect the "no_d3cold" member.
On Mon, Sep 18, 2023 at 08:28:51AM -0500, Mario Limonciello wrote: > On 9/18/2023 08:24, Lukas Wunner wrote: > > On Mon, Sep 18, 2023 at 08:14:21AM -0500, Mario Limonciello wrote: > > > What's the history behind why userspace is allowed to opt a device out of > > > D3cold in the first place? > > > > > > It feels like it should have been a debugging only thing to me. > > > > That's a fair question. > > > > Apparently the default for d3cold_allowed was originally "false" > > and user space could opt in to D3cold. Then commit 4f9c1397e2e8 > > ("PCI/PM: Enable D3/D3cold by default for most devices") changed > > the default to "true". That was 11 years ago. > > > > I agree that today this should all work automatically and a > > user space option to disable D3cold on a per-device basis only > > really makes sense as a debugging aid, hence belongs in debugfs. > > > > Thanks. Then perhaps as part of moving it to debugfs it makes sense to > simplify the logic. d3cold_allowed is documented in Documentation/ABI/testing/sysfs-bus-pci, so it's user space ABI which we're not allowed to break. We'd have to declare it deprecated, emit a warning when it's used and slowly phase it out over the years. Until then, the $SUBJECT_PATCH probably still makes sense to reinstate the behavior we had until 2016... Thanks, Lukas
On 9/18/2023 09:26, Lukas Wunner wrote: > On Mon, Sep 18, 2023 at 08:28:51AM -0500, Mario Limonciello wrote: >> On 9/18/2023 08:24, Lukas Wunner wrote: >>> On Mon, Sep 18, 2023 at 08:14:21AM -0500, Mario Limonciello wrote: >>>> What's the history behind why userspace is allowed to opt a device out of >>>> D3cold in the first place? >>>> >>>> It feels like it should have been a debugging only thing to me. >>> >>> That's a fair question. >>> >>> Apparently the default for d3cold_allowed was originally "false" >>> and user space could opt in to D3cold. Then commit 4f9c1397e2e8 >>> ("PCI/PM: Enable D3/D3cold by default for most devices") changed >>> the default to "true". That was 11 years ago. >>> >>> I agree that today this should all work automatically and a >>> user space option to disable D3cold on a per-device basis only >>> really makes sense as a debugging aid, hence belongs in debugfs. >>> >> >> Thanks. Then perhaps as part of moving it to debugfs it makes sense to >> simplify the logic. > > d3cold_allowed is documented in Documentation/ABI/testing/sysfs-bus-pci, > so it's user space ABI which we're not allowed to break. We'd have to > declare it deprecated, emit a warning when it's used and slowly phase > it out over the years. What about as part of deprecating it to make it always return -EINVAL no matter the input? The ABI isn't broken, but it allows for the optimization. > > Until then, the $SUBJECT_PATCH probably still makes sense to reinstate > the behavior we had until 2016... > > Thanks, > > Lukas
On Mon, Sep 18, 2023 at 02:48:01PM +0200, Lukas Wunner wrote: > struct pci_dev contains two flags which govern whether the device may > suspend to D3cold: > > * no_d3cold provides an opt-out for drivers (e.g. if a device is known > to not wake from D3cold) > > * d3cold_allowed provides an opt-out for user space (default is true, > user space may set to false) > > Since commit 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend"), > the user space setting overwrites the driver setting. Essentially user > space is trusted to know better than the driver whether D3cold is > working. > > That feels unsafe and wrong. Assume that the change was introduced > inadvertently and do not overwrite no_d3cold when d3cold_allowed is > modified. Instead, consider d3cold_allowed in addition to no_d3cold > when choosing a suspend state for the device. > > That way, user space may opt out of D3cold if the driver hasn't, but it > may no longer force an opt in if the driver has opted out. > > Fixes: 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend") > Signed-off-by: Lukas Wunner <lukas@wunner.de> > Cc: stable@vger.kernel.org # v4.8+ > Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Mika and Mario, you both commented on this, but I *think* you were both OK with it as-is for now? If so, can you give a Reviewed-by? I don't want to go ahead if you have any concerns. Since we're touching this area, is there anything we can do to clarify the comments? unsigned int no_d3cold:1; /* D3cold is forbidden */ unsigned int bridge_d3:1; /* Allow D3 for bridge */ unsigned int d3cold_allowed:1; /* D3cold is allowed by user */ These all have to do with D3, but: - For no_d3cold, "D3cold is forbidden" doesn't capture the notion that "driver knows the device doesn't wake from D3cold" - It's not clear whether "bridge_d3" applies to D3hot, D3cold, or both. - The computation of "bridge_d3" is ... complicated. It depends on all the device/platform behavior assumptions in pci_bridge_d3_possible(). And *also* (I think) the user-space "d3cold_allowed" knob, via pci_dev_check_d3cold() in pci_bridge_d3_update(). So it's kind of a mix of hardware characteristics and administrative controls. Any comment updates could be a separate patch so as not to complicate this one. > @@ -911,7 +911,7 @@ pci_power_t acpi_pci_choose_state(struct pci_dev *pdev) > { > int acpi_state, d_max; > > - if (pdev->no_d3cold) > + if (pdev->no_d3cold || !pdev->d3cold_allowed) Haha, looks good. Too bad the senses are opposite ("no_d3cold" and "!d3cold_allowed"), but that's for another day. > d_max = ACPI_STATE_D3_HOT; > else > d_max = ACPI_STATE_D3_COLD; > diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c > index ba38fc4..e18ccd2 100644 > --- a/drivers/pci/pci-sysfs.c > +++ b/drivers/pci/pci-sysfs.c > @@ -530,10 +530,7 @@ static ssize_t d3cold_allowed_store(struct device *dev, > return -EINVAL; > > pdev->d3cold_allowed = !!val; > - if (pdev->d3cold_allowed) > - pci_d3cold_enable(pdev); > - else > - pci_d3cold_disable(pdev); > + pci_bridge_d3_update(pdev); This is great. D3 is still a tangle, but this is a significant improvement. > pm_runtime_resume(dev); Bjorn
On Thu, Sep 28, 2023 at 05:36:30PM -0500, Bjorn Helgaas wrote: > On Mon, Sep 18, 2023 at 02:48:01PM +0200, Lukas Wunner wrote: > > struct pci_dev contains two flags which govern whether the device may > > suspend to D3cold: > > > > * no_d3cold provides an opt-out for drivers (e.g. if a device is known > > to not wake from D3cold) > > > > * d3cold_allowed provides an opt-out for user space (default is true, > > user space may set to false) > > > > Since commit 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend"), > > the user space setting overwrites the driver setting. Essentially user > > space is trusted to know better than the driver whether D3cold is > > working. > > > > That feels unsafe and wrong. Assume that the change was introduced > > inadvertently and do not overwrite no_d3cold when d3cold_allowed is > > modified. Instead, consider d3cold_allowed in addition to no_d3cold > > when choosing a suspend state for the device. > > > > That way, user space may opt out of D3cold if the driver hasn't, but it > > may no longer force an opt in if the driver has opted out. > > > > Fixes: 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend") > > Signed-off-by: Lukas Wunner <lukas@wunner.de> > > Cc: stable@vger.kernel.org # v4.8+ > > Cc: Mika Westerberg <mika.westerberg@linux.intel.com> > > Mika and Mario, you both commented on this, but I *think* you were > both OK with it as-is for now? If so, can you give a Reviewed-by? > I don't want to go ahead if you have any concerns. No concerns from me, Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
[Public] > On Thu, Sep 28, 2023 at 05:36:30PM -0500, Bjorn Helgaas wrote: > > On Mon, Sep 18, 2023 at 02:48:01PM +0200, Lukas Wunner wrote: > > > struct pci_dev contains two flags which govern whether the device may > > > suspend to D3cold: > > > > > > * no_d3cold provides an opt-out for drivers (e.g. if a device is known > > > to not wake from D3cold) > > > > > > * d3cold_allowed provides an opt-out for user space (default is true, > > > user space may set to false) > > > > > > Since commit 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during > suspend"), > > > the user space setting overwrites the driver setting. Essentially user > > > space is trusted to know better than the driver whether D3cold is > > > working. > > > > > > That feels unsafe and wrong. Assume that the change was introduced > > > inadvertently and do not overwrite no_d3cold when d3cold_allowed is > > > modified. Instead, consider d3cold_allowed in addition to no_d3cold > > > when choosing a suspend state for the device. > > > > > > That way, user space may opt out of D3cold if the driver hasn't, but it > > > may no longer force an opt in if the driver has opted out. > > > > > > Fixes: 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend") > > > Signed-off-by: Lukas Wunner <lukas@wunner.de> > > > Cc: stable@vger.kernel.org # v4.8+ > > > Cc: Mika Westerberg <mika.westerberg@linux.intel.com> > > > > Mika and Mario, you both commented on this, but I *think* you were > > both OK with it as-is for now? If so, can you give a Reviewed-by? > > I don't want to go ahead if you have any concerns. > > No concerns from me, > > Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> It is a step in the right direction. I do think a later enhancement to fail the write from sysfs (to be ABI compatible but "deprecate it") and internal optimization as a result is a good idea though. Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
On Mon, Sep 18, 2023 at 02:48:01PM +0200, Lukas Wunner wrote: > struct pci_dev contains two flags which govern whether the device may > suspend to D3cold: > > * no_d3cold provides an opt-out for drivers (e.g. if a device is known > to not wake from D3cold) > > * d3cold_allowed provides an opt-out for user space (default is true, > user space may set to false) > > Since commit 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend"), > the user space setting overwrites the driver setting. Essentially user > space is trusted to know better than the driver whether D3cold is > working. > > That feels unsafe and wrong. Assume that the change was introduced > inadvertently and do not overwrite no_d3cold when d3cold_allowed is > modified. Instead, consider d3cold_allowed in addition to no_d3cold > when choosing a suspend state for the device. > > That way, user space may opt out of D3cold if the driver hasn't, but it > may no longer force an opt in if the driver has opted out. > > Fixes: 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend") > Signed-off-by: Lukas Wunner <lukas@wunner.de> > Cc: stable@vger.kernel.org # v4.8+ > Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Applied with Reviewed-by from Mika and Mario to pci/pm for v6.7, thanks! > --- > drivers/pci/pci-acpi.c | 2 +- > drivers/pci/pci-sysfs.c | 5 +---- > 2 files changed, 2 insertions(+), 5 deletions(-) > > diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c > index 7aa1c20..2f5eddf 100644 > --- a/drivers/pci/pci-acpi.c > +++ b/drivers/pci/pci-acpi.c > @@ -911,7 +911,7 @@ pci_power_t acpi_pci_choose_state(struct pci_dev *pdev) > { > int acpi_state, d_max; > > - if (pdev->no_d3cold) > + if (pdev->no_d3cold || !pdev->d3cold_allowed) > d_max = ACPI_STATE_D3_HOT; > else > d_max = ACPI_STATE_D3_COLD; > diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c > index ba38fc4..e18ccd2 100644 > --- a/drivers/pci/pci-sysfs.c > +++ b/drivers/pci/pci-sysfs.c > @@ -530,10 +530,7 @@ static ssize_t d3cold_allowed_store(struct device *dev, > return -EINVAL; > > pdev->d3cold_allowed = !!val; > - if (pdev->d3cold_allowed) > - pci_d3cold_enable(pdev); > - else > - pci_d3cold_disable(pdev); > + pci_bridge_d3_update(pdev); > > pm_runtime_resume(dev); > > -- > 2.39.2 >
diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c index 7aa1c20..2f5eddf 100644 --- a/drivers/pci/pci-acpi.c +++ b/drivers/pci/pci-acpi.c @@ -911,7 +911,7 @@ pci_power_t acpi_pci_choose_state(struct pci_dev *pdev) { int acpi_state, d_max; - if (pdev->no_d3cold) + if (pdev->no_d3cold || !pdev->d3cold_allowed) d_max = ACPI_STATE_D3_HOT; else d_max = ACPI_STATE_D3_COLD; diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c index ba38fc4..e18ccd2 100644 --- a/drivers/pci/pci-sysfs.c +++ b/drivers/pci/pci-sysfs.c @@ -530,10 +530,7 @@ static ssize_t d3cold_allowed_store(struct device *dev, return -EINVAL; pdev->d3cold_allowed = !!val; - if (pdev->d3cold_allowed) - pci_d3cold_enable(pdev); - else - pci_d3cold_disable(pdev); + pci_bridge_d3_update(pdev); pm_runtime_resume(dev);
struct pci_dev contains two flags which govern whether the device may suspend to D3cold: * no_d3cold provides an opt-out for drivers (e.g. if a device is known to not wake from D3cold) * d3cold_allowed provides an opt-out for user space (default is true, user space may set to false) Since commit 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend"), the user space setting overwrites the driver setting. Essentially user space is trusted to know better than the driver whether D3cold is working. That feels unsafe and wrong. Assume that the change was introduced inadvertently and do not overwrite no_d3cold when d3cold_allowed is modified. Instead, consider d3cold_allowed in addition to no_d3cold when choosing a suspend state for the device. That way, user space may opt out of D3cold if the driver hasn't, but it may no longer force an opt in if the driver has opted out. Fixes: 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend") Signed-off-by: Lukas Wunner <lukas@wunner.de> Cc: stable@vger.kernel.org # v4.8+ Cc: Mika Westerberg <mika.westerberg@linux.intel.com> --- drivers/pci/pci-acpi.c | 2 +- drivers/pci/pci-sysfs.c | 5 +---- 2 files changed, 2 insertions(+), 5 deletions(-)