diff mbox series

[PATCHv2,1/2] pci: provide bus reset attribute

Message ID 20241025222755.3756162-1-kbusch@meta.com (mailing list archive)
State Accepted
Commit 2fa046449a82a7d0f6d9721dd83e348816038444
Delegated to: Bjorn Helgaas
Headers show
Series [PATCHv2,1/2] pci: provide bus reset attribute | expand

Commit Message

Keith Busch Oct. 25, 2024, 10:27 p.m. UTC
From: Keith Busch <kbusch@kernel.org>

Resetting a bus from an end device only works if it's the only function
on or below that bus.

Provide an attribute on the pci_dev bridge device that can perform the
secondary bus reset. This makes it possible for a user to safely reset
multiple devices in a single command using the secondary bus reset
action.

Signed-off-by: Keith Busch <kbusch@kernel.org>
---
v1->v2:

  Moved the attribute from the pci_bus to the bridge's pci_dev

  And renamed it to "reset_subordinate" to distinguish from other
  existing device "reset" attributes.

  Added documentation.

  Follow up patch to warn if the action was potentially harmful.

 Documentation/ABI/testing/sysfs-bus-pci | 11 +++++++++++
 drivers/pci/pci-sysfs.c                 | 23 +++++++++++++++++++++++
 drivers/pci/pci.c                       |  2 +-
 drivers/pci/pci.h                       |  1 +
 4 files changed, 36 insertions(+), 1 deletion(-)

Comments

Alex Williamson Oct. 28, 2024, 7:35 p.m. UTC | #1
On Fri, 25 Oct 2024 15:27:54 -0700
Keith Busch <kbusch@meta.com> wrote:

> From: Keith Busch <kbusch@kernel.org>
> 
> Resetting a bus from an end device only works if it's the only function
> on or below that bus.
> 
> Provide an attribute on the pci_dev bridge device that can perform the
> secondary bus reset. This makes it possible for a user to safely reset
> multiple devices in a single command using the secondary bus reset
> action.
> 
> Signed-off-by: Keith Busch <kbusch@kernel.org>
> ---
> v1->v2:
> 
>   Moved the attribute from the pci_bus to the bridge's pci_dev
> 
>   And renamed it to "reset_subordinate" to distinguish from other
>   existing device "reset" attributes.
> 
>   Added documentation.
> 
>   Follow up patch to warn if the action was potentially harmful.
> 
>  Documentation/ABI/testing/sysfs-bus-pci | 11 +++++++++++
>  drivers/pci/pci-sysfs.c                 | 23 +++++++++++++++++++++++
>  drivers/pci/pci.c                       |  2 +-
>  drivers/pci/pci.h                       |  1 +
>  4 files changed, 36 insertions(+), 1 deletion(-)

Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
ameynarkhede03 Oct. 29, 2024, 3:05 p.m. UTC | #2
On 24/10/25 03:27pm, Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
>
> Resetting a bus from an end device only works if it's the only function
> on or below that bus.
>
> Provide an attribute on the pci_dev bridge device that can perform the
> secondary bus reset. This makes it possible for a user to safely reset
> multiple devices in a single command using the secondary bus reset
> action.
>
> Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Amey Narkhede <ameynarkhede03@gmail.com>
Keith Busch Nov. 4, 2024, 9:28 p.m. UTC | #3
On Fri, Oct 25, 2024 at 03:27:54PM -0700, Keith Busch wrote:
> Resetting a bus from an end device only works if it's the only function
> on or below that bus.
> 
> Provide an attribute on the pci_dev bridge device that can perform the
> secondary bus reset. This makes it possible for a user to safely reset
> multiple devices in a single command using the secondary bus reset
> action.

Hi Bjorn, just want to check in. Do you have concerns remaining for this
feature?
Krzysztof Wilczyński Nov. 4, 2024, 9:53 p.m. UTC | #4
Hello,

> > Resetting a bus from an end device only works if it's the only function
> > on or below that bus.
> > 
> > Provide an attribute on the pci_dev bridge device that can perform the
> > secondary bus reset. This makes it possible for a user to safely reset
> > multiple devices in a single command using the secondary bus reset
> > action.
> 
> Hi Bjorn, just want to check in. Do you have concerns remaining for this
> feature? 

Would you have anything against if we put this new bus reset sysfs object
access behind the following test?

  if (!capable(CAP_SYS_ADMIN))
  	return -EPERM;

This is irregardless of what the permissions on the sysfs objects from the
DAC point of view are set to.

Checking CAP_SYS_ADMIN capability, to improve our default security stance,
on a number of important sysfs objects (e.g., reset, remove, etc.) we have
was something I discussed in the past with Bjorn, but never got around to
sending a patch to add this check.

Thoughts?

	Krzysztof
Keith Busch Nov. 4, 2024, 9:58 p.m. UTC | #5
On Tue, Nov 05, 2024 at 06:53:32AM +0900, Krzysztof Wilczy´nski wrote:
> Would you have anything against if we put this new bus reset sysfs object
> access behind the following test?
> 
>   if (!capable(CAP_SYS_ADMIN))
>   	return -EPERM;
> 
> This is irregardless of what the permissions on the sysfs objects from the
> DAC point of view are set to.
> 
> Checking CAP_SYS_ADMIN capability, to improve our default security stance,
> on a number of important sysfs objects (e.g., reset, remove, etc.) we have
> was something I discussed in the past with Bjorn, but never got around to
> sending a patch to add this check.
> 
> Thoughts?

Sure, I'm okay that. We are using DEVICE_ATTR_WO file attribute which
says should make it writable only by an admin, but totally fine with
adding this explicit check here too.
Krzysztof Wilczyński Nov. 4, 2024, 10:42 p.m. UTC | #6
On 24-11-04 14:58:38, Keith Busch wrote:
> On Tue, Nov 05, 2024 at 06:53:32AM +0900, Krzysztof Wilczy´nski wrote:
> > Would you have anything against if we put this new bus reset sysfs object
> > access behind the following test?
> > 
> >   if (!capable(CAP_SYS_ADMIN))
> >   	return -EPERM;
> > 
> > This is irregardless of what the permissions on the sysfs objects from the
> > DAC point of view are set to.
> > 
> > Checking CAP_SYS_ADMIN capability, to improve our default security stance,
> > on a number of important sysfs objects (e.g., reset, remove, etc.) we have
> > was something I discussed in the past with Bjorn, but never got around to
> > sending a patch to add this check.
> > 
> > Thoughts?
> 
> Sure, I'm okay that. We are using DEVICE_ATTR_WO file attribute which
> says should make it writable only by an admin, but totally fine with
> adding this explicit check here too.

Thank you!

Depending on whether Bjorn will have any feedback to might prompt a new
version of the patches to be sent, if there won't be one, then I will add
this extra check directly on the branch.

	Krzysztof
Keith Busch Nov. 12, 2024, 7:12 p.m. UTC | #7
On Fri, Oct 25, 2024 at 03:27:54PM -0700, Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
> 
> Resetting a bus from an end device only works if it's the only function
> on or below that bus.
> 
> Provide an attribute on the pci_dev bridge device that can perform the
> secondary bus reset. This makes it possible for a user to safely reset
> multiple devices in a single command using the secondary bus reset
> action.

Hi Bjorn, are we okay with this one? I am trying to get some tooling and
processes in place that rely on this reset capability, but I don't want
to get ahead of myself if the kernel side needs to go a different
direction.
Bjorn Helgaas Nov. 12, 2024, 11:16 p.m. UTC | #8
On Fri, Oct 25, 2024 at 03:27:54PM -0700, Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
> 
> Resetting a bus from an end device only works if it's the only function
> on or below that bus.

Can we clarify this wording?  "Resetting a bus from an end device"?

I guess this has something to do with pci_parent_bus_reset(dev), which
declines to call pci_bridge_secondary_bus_reset() if there are any
other devices on the same bus as "dev"? 

pci_parent_bus_reset() is only called from pci_reset_bus_function(),
which is used by the "bus" and "cxl_bus" reset methods.

So I guess the point is something like:

  The "bus" and "cxl_bus" reset methods reset a device by asserting
  Secondary Bus Reset on the bridge leading to the device.
  pci_parent_bus_reset() only allows that if the device is the only
  device below the bridge.

  Add a sysfs attribute on bridges that can assert Secondary Bus Reset
  regardless of how many devices are below the bridge.  This makes it
  possible for users to reset multiple devices in a single command.

I omitted "safely" because this doesn't do any checking to ensure
safety, so I don't know in what sense it is safe.

It seems like this is partly just a convenience, to reset several
devices at once.

But I think it is *also* a new way to reset devices that we might not
be able to reset otherwise, e.g., if there's more than one device on
the bus, and they don't support ACPI, FLR, or PM reset, there
previously was no reset method that worked at all.  Right?

> Provide an attribute on the pci_dev bridge device that can perform the
> secondary bus reset. This makes it possible for a user to safely reset
> multiple devices in a single command using the secondary bus reset
> action.
> 
> Signed-off-by: Keith Busch <kbusch@kernel.org>
> ---
> v1->v2:
> 
>   Moved the attribute from the pci_bus to the bridge's pci_dev
> 
>   And renamed it to "reset_subordinate" to distinguish from other
>   existing device "reset" attributes.
> 
>   Added documentation.
> 
>   Follow up patch to warn if the action was potentially harmful.
> 
>  Documentation/ABI/testing/sysfs-bus-pci | 11 +++++++++++
>  drivers/pci/pci-sysfs.c                 | 23 +++++++++++++++++++++++
>  drivers/pci/pci.c                       |  2 +-
>  drivers/pci/pci.h                       |  1 +
>  4 files changed, 36 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
> index 7f63c7e977735..5da6a14dc326b 100644
> --- a/Documentation/ABI/testing/sysfs-bus-pci
> +++ b/Documentation/ABI/testing/sysfs-bus-pci
> @@ -163,6 +163,17 @@ Description:
>  		will be present in sysfs.  Writing 1 to this file
>  		will perform reset.
>  
> +What:		/sys/bus/pci/devices/.../reset_subordinate
> +Date:		October 2024
> +Contact:	linux-pci@vger.kernel.org
> +Description:
> +		This is visible only for bridge devices. If you want to reset
> +		all devices attached through the subordinate bus of a specific
> +		bridge device, writing 1 to this will try to do it.  This will
> +		affect all devices attached to the system through this bridge
> +		similiar to writing 1 to their individual "reset" file, so use
> +		with caution.
> +
>  What:		/sys/bus/pci/devices/.../vpd
>  Date:		February 2008
>  Contact:	Ben Hutchings <bwh@kernel.org>
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index 5d0f4db1cab78..480a99e50612b 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -521,6 +521,28 @@ static ssize_t bus_rescan_store(struct device *dev,
>  static struct device_attribute dev_attr_bus_rescan = __ATTR(rescan, 0200, NULL,
>  							    bus_rescan_store);
>  
> +static ssize_t reset_subordinate_store(struct device *dev,
> +				struct device_attribute *attr,
> +				const char *buf, size_t count)
> +{
> +	struct pci_dev *pdev = to_pci_dev(dev);
> +	struct pci_bus *bus = pdev->subordinate;
> +	unsigned long val;
> +
> +	if (kstrtoul(buf, 0, &val) < 0)
> +		return -EINVAL;
> +
> +	if (val) {
> +		int ret = __pci_reset_bus(bus);
> +
> +		if (ret)
> +			return ret;
> +	}
> +
> +	return count;
> +}
> +static DEVICE_ATTR_WO(reset_subordinate);
> +
>  #if defined(CONFIG_PM) && defined(CONFIG_ACPI)
>  static ssize_t d3cold_allowed_store(struct device *dev,
>  				    struct device_attribute *attr,
> @@ -625,6 +647,7 @@ static struct attribute *pci_dev_attrs[] = {
>  static struct attribute *pci_bridge_attrs[] = {
>  	&dev_attr_subordinate_bus_number.attr,
>  	&dev_attr_secondary_bus_number.attr,
> +	&dev_attr_reset_subordinate.attr,
>  	NULL,
>  };
>  
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 7d85c04fbba2a..338dfcd983f1e 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -5880,7 +5880,7 @@ EXPORT_SYMBOL_GPL(pci_probe_reset_bus);
>   *
>   * Same as above except return -EAGAIN if the bus cannot be locked
>   */
> -static int __pci_reset_bus(struct pci_bus *bus)
> +int __pci_reset_bus(struct pci_bus *bus)
>  {
>  	int rc;
>  
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index 14d00ce45bfa9..1cdc2c9547a7e 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -104,6 +104,7 @@ bool pci_reset_supported(struct pci_dev *dev);
>  void pci_init_reset_methods(struct pci_dev *dev);
>  int pci_bridge_secondary_bus_reset(struct pci_dev *dev);
>  int pci_bus_error_reset(struct pci_dev *dev);
> +int __pci_reset_bus(struct pci_bus *bus);
>  
>  struct pci_cap_saved_data {
>  	u16		cap_nr;
> -- 
> 2.43.5
>
Keith Busch Nov. 13, 2024, 5:38 p.m. UTC | #9
On Tue, Nov 12, 2024 at 05:16:23PM -0600, Bjorn Helgaas wrote:
> On Fri, Oct 25, 2024 at 03:27:54PM -0700, Keith Busch wrote:
> > From: Keith Busch <kbusch@kernel.org>
> > 
> > Resetting a bus from an end device only works if it's the only function
> > on or below that bus.
> 
> Can we clarify this wording?  "Resetting a bus from an end device"?

What I mean is, if you find a device you want to reset in the sysfs
hierarchy, and find its reset method:

  /sys/bus/pci/devices/<D:B:D.f>/reset_method

If you request "bus", it only works if it is the only function on that bus.

I agree my commit message there is a bit unclear, though. I was a bit to
deep in that code when I wrote it.
 
> I guess this has something to do with pci_parent_bus_reset(dev), which
> declines to call pci_bridge_secondary_bus_reset() if there are any
> other devices on the same bus as "dev"? 

Yep, exactly.

> pci_parent_bus_reset() is only called from pci_reset_bus_function(),
> which is used by the "bus" and "cxl_bus" reset methods.
> 
> So I guess the point is something like:
> 
>   The "bus" and "cxl_bus" reset methods reset a device by asserting
>   Secondary Bus Reset on the bridge leading to the device.
>   pci_parent_bus_reset() only allows that if the device is the only
>   device below the bridge.
> 
>   Add a sysfs attribute on bridges that can assert Secondary Bus Reset
>   regardless of how many devices are below the bridge.  This makes it
>   possible for users to reset multiple devices in a single command.
> 
> I omitted "safely" because this doesn't do any checking to ensure
> safety, so I don't know in what sense it is safe.

By "safe", what I mean is the device is locked by the kernel, config
space is saved and restored on either side of the reset, and the
attached driver (if any) is notified this action is about to happen and
when it completes so it do whatever quiescent and restoring actions it
needs for a bus reset.

I can't say this is universally "safe" since it's a bit optomistic to
assume everything affected by this action is going to work as the pci
driver expects, but the alternatives (setpci) don't coordinate anything
with the kernel, so it's "safe" relative to that :)

> It seems like this is partly just a convenience, to reset several
> devices at once.
>
> But I think it is *also* a new way to reset devices that we might not
> be able to reset otherwise, e.g., if there's more than one device on
> the bus, and they don't support ACPI, FLR, or PM reset, there
> previously was no reset method that worked at all.  Right?

Exactly! I have multi-function devices in a switch hierarchy where this
unfortunately is really the only way to do it. They don't support
resetting individual functions, so SBR is the only way we have to
reliably reset the device.
diff mbox series

Patch

diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
index 7f63c7e977735..5da6a14dc326b 100644
--- a/Documentation/ABI/testing/sysfs-bus-pci
+++ b/Documentation/ABI/testing/sysfs-bus-pci
@@ -163,6 +163,17 @@  Description:
 		will be present in sysfs.  Writing 1 to this file
 		will perform reset.
 
+What:		/sys/bus/pci/devices/.../reset_subordinate
+Date:		October 2024
+Contact:	linux-pci@vger.kernel.org
+Description:
+		This is visible only for bridge devices. If you want to reset
+		all devices attached through the subordinate bus of a specific
+		bridge device, writing 1 to this will try to do it.  This will
+		affect all devices attached to the system through this bridge
+		similiar to writing 1 to their individual "reset" file, so use
+		with caution.
+
 What:		/sys/bus/pci/devices/.../vpd
 Date:		February 2008
 Contact:	Ben Hutchings <bwh@kernel.org>
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 5d0f4db1cab78..480a99e50612b 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -521,6 +521,28 @@  static ssize_t bus_rescan_store(struct device *dev,
 static struct device_attribute dev_attr_bus_rescan = __ATTR(rescan, 0200, NULL,
 							    bus_rescan_store);
 
+static ssize_t reset_subordinate_store(struct device *dev,
+				struct device_attribute *attr,
+				const char *buf, size_t count)
+{
+	struct pci_dev *pdev = to_pci_dev(dev);
+	struct pci_bus *bus = pdev->subordinate;
+	unsigned long val;
+
+	if (kstrtoul(buf, 0, &val) < 0)
+		return -EINVAL;
+
+	if (val) {
+		int ret = __pci_reset_bus(bus);
+
+		if (ret)
+			return ret;
+	}
+
+	return count;
+}
+static DEVICE_ATTR_WO(reset_subordinate);
+
 #if defined(CONFIG_PM) && defined(CONFIG_ACPI)
 static ssize_t d3cold_allowed_store(struct device *dev,
 				    struct device_attribute *attr,
@@ -625,6 +647,7 @@  static struct attribute *pci_dev_attrs[] = {
 static struct attribute *pci_bridge_attrs[] = {
 	&dev_attr_subordinate_bus_number.attr,
 	&dev_attr_secondary_bus_number.attr,
+	&dev_attr_reset_subordinate.attr,
 	NULL,
 };
 
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 7d85c04fbba2a..338dfcd983f1e 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5880,7 +5880,7 @@  EXPORT_SYMBOL_GPL(pci_probe_reset_bus);
  *
  * Same as above except return -EAGAIN if the bus cannot be locked
  */
-static int __pci_reset_bus(struct pci_bus *bus)
+int __pci_reset_bus(struct pci_bus *bus)
 {
 	int rc;
 
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 14d00ce45bfa9..1cdc2c9547a7e 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -104,6 +104,7 @@  bool pci_reset_supported(struct pci_dev *dev);
 void pci_init_reset_methods(struct pci_dev *dev);
 int pci_bridge_secondary_bus_reset(struct pci_dev *dev);
 int pci_bus_error_reset(struct pci_dev *dev);
+int __pci_reset_bus(struct pci_bus *bus);
 
 struct pci_cap_saved_data {
 	u16		cap_nr;