Message ID | 20211115182552.3830849-1-iwona.winiarska@intel.com (mailing list archive) |
---|---|
Series | Introduce PECI subsystem |
On Mon, Nov 15, 2021 at 07:25:45PM +0100, Iwona Winiarska wrote:
> +void peci_device_destroy(struct peci_device *device)
> +{
> +	bool killed;
> +
> +	device_lock(&device->dev);
> +	killed = kill_device(&device->dev);

Eeek, why call this?

> +	device_unlock(&device->dev);
> +
> +	if (!killed)
> +		return;

What happened if something changed after you unlocked it?

Why is kill_device() required at all?  That's a very rare function to
call, and one that only one "bus" calls today because it is very
special (i.e. crazy and broken...)

thanks,

greg k-h
On Mon, 2021-11-15 at 19:49 +0100, Greg Kroah-Hartman wrote:
> On Mon, Nov 15, 2021 at 07:25:45PM +0100, Iwona Winiarska wrote:
> > +void peci_device_destroy(struct peci_device *device)
> > +{
> > +	bool killed;
> > +
> > +	device_lock(&device->dev);
> > +	killed = kill_device(&device->dev);
>
> Eeek, why call this?
>
> > +	device_unlock(&device->dev);
> > +
> > +	if (!killed)
> > +		return;
>
> What happened if something changed after you unlocked it?

We either killed it, or the other caller killed it.

>
> Why is kill_device() required at all?  That's a very rare function to
> call, and one that only one "bus" calls today because it is very
> special (i.e. crazy and broken...)

It's used to avoid double-delete in case of races between peci_controller
unregister and "manually" removing the device using sysfs (pointed out by Dan in
v2). We're calling peci_device_destroy() in both callsites.
Another way to solve it would be to just have a peci-specific lock, but
kill_device seemed to be well suited for the problem at hand.
Do you suggest removing it and just going with the lock?

Thanks
-Iwona

>
> thanks,
>
> greg k-h
On Mon, Nov 15, 2021 at 10:35:23PM +0000, Winiarska, Iwona wrote:
> On Mon, 2021-11-15 at 19:49 +0100, Greg Kroah-Hartman wrote:
> > On Mon, Nov 15, 2021 at 07:25:45PM +0100, Iwona Winiarska wrote:
> > > +void peci_device_destroy(struct peci_device *device)
> > > +{
> > > +	bool killed;
> > > +
> > > +	device_lock(&device->dev);
> > > +	killed = kill_device(&device->dev);
> >
> > Eeek, why call this?
> >
> > > +	device_unlock(&device->dev);
> > > +
> > > +	if (!killed)
> > > +		return;
> >
> > What happened if something changed after you unlocked it?
>
> We either killed it, or the other caller killed it.
>
> >
> > Why is kill_device() required at all?  That's a very rare function to
> > call, and one that only one "bus" calls today because it is very
> > special (i.e. crazy and broken...)
>
> It's used to avoid double-delete in case of races between peci_controller
> unregister and "manually" removing the device using sysfs (pointed out by Dan in
> v2). We're calling peci_device_destroy() in both callsites.
> Another way to solve it would be to just have a peci-specific lock, but
> kill_device seemed to be well suited for the problem at hand.
> Do you suggest removing it and just going with the lock?

Yes please, remove it and use the lock.

Also, why are you required to have a sysfs file that can remove the
device?  Who wants that?

thanks,

greg k-h
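The lock-based alternative discussed in this exchange can be sketched roughly as follows. This is a userspace illustration under stated assumptions, not code from the series: `fake_peci_device`, `fake_del_lock` and `fake_peci_device_destroy()` are hypothetical names, a pthread mutex stands in for a PECI-specific kernel lock, and a counter stands in for the real device teardown. The point is only that both callers (controller unregister and the sysfs "remove" path) go through one function that checks a flag under one lock, so the teardown runs at most once.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct peci_device; not the kernel structure. */
struct fake_peci_device {
	bool deleted;        /* set once the device has been torn down */
	int teardown_count;  /* how many times the teardown actually ran */
};

/* Hypothetical subsystem-wide lock serializing all destroy calls. */
static pthread_mutex_t fake_del_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Tear the device down at most once.
 * Returns true if this call did the teardown, false if an earlier
 * caller had already done it.
 */
static bool fake_peci_device_destroy(struct fake_peci_device *dev)
{
	bool did_teardown = false;

	assert(dev != NULL);

	pthread_mutex_lock(&fake_del_lock);
	if (!dev->deleted) {
		/* A real driver would unregister the device here. */
		dev->teardown_count++;
		dev->deleted = true;
		did_teardown = true;
	}
	pthread_mutex_unlock(&fake_del_lock);

	return did_teardown;
}
```

Because the flag is tested and set while the lock is held, and the teardown also happens under that lock, nothing can change between the check and the action. The trade-off in this sketch is that every destroy call is serialized on a single lock.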
On Mon, Nov 15, 2021 at 10:25:39AM PST, Iwona Winiarska wrote:
>Hi,
>
>This is a third round of patches introducing PECI subsystem.
>Sorry for the delay between v2 and v3.
>

Hi Iwona,

I've done some testing of these patches on my AST2500/E-2778G OpenBMC
platform -- I had to do a small bit of hacking to add support for
INTEL_FAM6_KABYLAKE, but with that in place the newly-added code for the
8.8 format seems to work as it should.  Thanks!

In poking at it a bit further I encountered some sub-optimal behavior
w.r.t. host power state transitions and timeouts though --
essentially, if I ever hit a timeout in aspeed_peci_xfer() (for example
on a read of a hwmon tempX_input file after an unexpected host
shutdown), it seems to get stuck in a state where even if the host comes
back online, all attempted PECI transfers continue just timing out.
(Rebooting the BMC seems to resolve the problem.)  This also happens if
I remove the peci client device via the 'remove' sysfs file, shut down
the host, and then do a rescan via sysfs while the host is off (i.e.
another operation that times out).

Let me know if there's any other info that would be helpful for
debugging.


Thanks,
Zev
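For readers unfamiliar with the "8.8 format" mentioned in this report, here is a minimal sketch of one way to convert such a reading. It assumes (this is not stated in the thread) that "8.8" means a signed 16-bit fixed-point value with 8 integer bits and 8 fractional bits, and that the result is wanted in millidegrees as hwmon temperature files conventionally report. The function name `eight_dot_eight_to_millidegree()` is hypothetical and this is not presented as the code from the series.

```c
#include <stdint.h>

#define MILLIDEGREE_PER_DEGREE 1000

/*
 * Convert a raw 16-bit value in assumed signed 8.8 fixed-point format
 * (8 integer bits, 8 fractional bits) to millidegrees Celsius.
 */
static int32_t eight_dot_eight_to_millidegree(uint16_t raw)
{
	int32_t val = raw;

	/* Sign-extend from 16 bits without relying on narrowing casts. */
	if (val & 0x8000)
		val -= 0x10000;

	/* One unit of the raw value is 1/256 of a degree. */
	return val * MILLIDEGREE_PER_DEGREE / 256;
}
```

Under these assumptions a raw value of 0x1980 corresponds to 25.5 degrees, i.e. 25500 millidegrees.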
On Tue, 2021-11-16 at 07:26 +0100, gregkh@linuxfoundation.org wrote:
> On Mon, Nov 15, 2021 at 10:35:23PM +0000, Winiarska, Iwona wrote:
> > On Mon, 2021-11-15 at 19:49 +0100, Greg Kroah-Hartman wrote:
> > > On Mon, Nov 15, 2021 at 07:25:45PM +0100, Iwona Winiarska wrote:
> > > > +void peci_device_destroy(struct peci_device *device)
> > > > +{
> > > > +	bool killed;
> > > > +
> > > > +	device_lock(&device->dev);
> > > > +	killed = kill_device(&device->dev);
> > >
> > > Eeek, why call this?
> > >
> > > > +	device_unlock(&device->dev);
> > > > +
> > > > +	if (!killed)
> > > > +		return;
> > >
> > > What happened if something changed after you unlocked it?
> >
> > We either killed it, or the other caller killed it.
> >
> > >
> > > Why is kill_device() required at all?  That's a very rare function to
> > > call, and one that only one "bus" calls today because it is very
> > > special (i.e. crazy and broken...)
> >
> > It's used to avoid double-delete in case of races between peci_controller
> > unregister and "manually" removing the device using sysfs (pointed out by Dan
> > in v2). We're calling peci_device_destroy() in both callsites.
> > Another way to solve it would be to just have a peci-specific lock, but
> > kill_device seemed to be well suited for the problem at hand.
> > Do you suggest removing it and just going with the lock?
>
> Yes please, remove it and use the lock.

Ack.

>
> Also, why are you required to have a sysfs file that can remove the
> device?  Who wants that?

From the following patch:
"PECI devices may not be discoverable at the time when PECI controller is
being added (e.g. BMC can boot up when the Host system is still in S5).
Since we currently don't have the capabilities to figure out the Host system
state inside the PECI subsystem itself, we have to rely on userspace to do
it for us."

That's about rescan, but userspace might also want to remove the devices
e.g. when Host goes into S5.
It's also useful for development and debug purposes (and also allows us to have
a nice bit of symmetry with rescan).

Thanks
-Iwona

>
> thanks,
>
> greg k-h
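As an illustration of how userspace might drive rescan and remove, here is a small C helper that writes a value to a sysfs-style attribute file. It is a sketch only: `write_attr()` is a hypothetical helper, and the paths mentioned in the usage note are assumptions based on the sysfs-bus-peci ABI file listed in the series, not something shown in this thread.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Write a short string to an existing attribute file.
 * Returns 0 on success or a negative errno value on failure.
 */
static int write_attr(const char *path, const char *value)
{
	size_t len = strlen(value);
	ssize_t n;
	int ret = 0;
	int fd;

	/* Attribute files already exist, so no O_CREAT here. */
	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;

	n = write(fd, value, len);
	if (n < 0)
		ret = -errno;
	else if ((size_t)n != len)
		ret = -EIO;	/* treat a short write as an I/O error */

	close(fd);
	return ret;
}
```

A daemon that tracks host power state could, for example, call `write_attr("/sys/bus/peci/rescan", "1")` once the host leaves S5, and write "1" to a device's `remove` attribute when the host powers down. Both paths are assumed here and should be checked against the ABI documentation of the kernel in use.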
On Wed, 2021-11-17 at 03:56 +0000, Zev Weiss wrote:
> On Mon, Nov 15, 2021 at 10:25:39AM PST, Iwona Winiarska wrote:
> > Hi,
> >
> > This is a third round of patches introducing PECI subsystem.
> > Sorry for the delay between v2 and v3.
> >
>
> Hi Iwona,
>
> I've done some testing of these patches on my AST2500/E-2778G OpenBMC
> platform -- I had to do a small bit of hacking to add support for
> INTEL_FAM6_KABYLAKE, but with that in place the newly-added code for the
> 8.8 format seems to work as it should.  Thanks!

Thanks for the report and testing :)

>
> In poking at it a bit further I encountered some sub-optimal behavior
> w.r.t. host power state transitions and timeouts though --
> essentially, if I ever hit a timeout in aspeed_peci_xfer() (for example
> on a read of a hwmon tempX_input file after an unexpected host
> shutdown), it seems to get stuck in a state where even if the host comes
> back online, all attempted PECI transfers continue just timing out.
> (Rebooting the BMC seems to resolve the problem.)  This also happens if
> I remove the peci client device via the 'remove' sysfs file, shut down
> the host, and then do a rescan via sysfs while the host is off (i.e.
> another operation that times out).
>
> Let me know if there's any other info that would be helpful for
> debugging.

That's unexpected. I do have an idea what might have caused that.
Let me fix it in v4.

Thanks
-Iwona

>
>
> Thanks,
> Zev
On Thu, 2021-11-18 at 14:19 +0200, Tomer Maimon wrote:
> Hi Iwona,
>
> My name is Tomer, I'm working as a SW engineer in the Nuvoton BMC project.
>
> First, thanks for upstreaming the PECI driver!
>
> Nuvoton (NPCM) PECI driver was in the PECI patchset that has been handled by
> Jae.
> https://patchwork.kernel.org/project/linux-arm-kernel/patch/20191211194624.2872-10-jae.hyun.yoo@linux.intel.com/
>
> Could you add Nuvoton (NPCM) PECI driver to your patch set next time you will
> send upstream patches to Linux vanilla?

Some (relatively small) changes are going to be needed to adapt that driver to
changes that happened in PECI core.
I want to keep this series as small as possible, but once it gets merged,
Nuvoton driver can be added in a separate series.

> If you agree, we will check your patchset in Nuvoton systems in a few days and
> send you NPCM PECI driver and documentation.
>

I don't have any hardware to test it on so help will definitely be welcome :)

Thanks
-Iwona

> Thanks,
>
> Tomer
>
> On Mon, 15 Nov 2021 at 20:28, Iwona Winiarska <iwona.winiarska@intel.com> wrote:
> > Hi,
> >
> > This is a third round of patches introducing PECI subsystem.
> > Sorry for the delay between v2 and v3.
> >
> > The Platform Environment Control Interface (PECI) is a communication
> > interface between Intel processors and management controllers (e.g.
> > Baseboard Management Controller, BMC).
> >
> > This series adds a PECI subsystem and introduces drivers which run in
> > the Linux instance on the management controller (not the main Intel
> > processor) and are intended to be used by OpenBMC [1], a Linux
> > distribution for BMC devices.
> > The information exposed over PECI (like processor and DIMM
> > temperature) refers to the Intel processor and can be consumed by
> > daemons running on the BMC to, for example, display the processor
> > temperature in its web interface.
> >
> > The PECI bus is a collection of code that provides interface support
> > between PECI devices (that actually represent processors) and PECI
> > controllers (such as the "peci-aspeed" controller) that allow access
> > to the physical PECI interface. PECI devices are bound to PECI
> > drivers that provide access to PECI services. This series introduces
> > a generic "peci-cpu" driver that exposes hardware monitoring "cputemp"
> > and "dimmtemp" using the auxiliary bus.
> >
> > Exposing "raw" PECI to userspace, either to write userspace drivers or
> > for debug/testing purpose was left out of this series to encourage
> > writing kernel drivers instead, but may be pursued in the future.
> >
> > Introducing PECI to upstream Linux was already attempted before [2].
> > Since it's been over a year since last revision, and the series
> > changed quite a bit in the meantime, I've decided to start from v1.
> >
> > I would also like to give credit to everyone who helped me with
> > different aspects of preliminary review:
> > - Pierre-Louis Bossart,
> > - Tony Luck,
> > - Andy Shevchenko,
> > - Dave Hansen.
> >
> > [1] https://github.com/openbmc/openbmc
> > [2]
> > https://lore.kernel.org/openbmc/20191211194624.2872-1-jae.hyun.yoo@linux.intel.com/
> >
> > Changes v2 -> v3:
> >
> > * Dropped x86/cpu patches (Boris)
> > * Dropped pr_fmt() for PECI module (Dan)
> > * Fixed releasing peci controller device flow (Dan)
> > * Improved peci-aspeed commit-msg and Kconfig help (Dan)
> > * Fixed aspeed_peci_xfer() to use the proper spin_lock function (Dan)
> > * Wrapped print_hex_dump_bytes() in CONFIG_DYNAMIC_DEBUG (Dan)
> > * Removed debug status logs from aspeed_peci_irq_handler() (Dan)
> > * Renamed functions using devres to start with "devm" (Dan)
> > * Changed request to be allocated on stack in peci_detect (Dan)
> > * Removed redundant WARN_ON on invalid PECI addr (Dan)
> > * Changed peci_device_create() to use device_initialize() + device_add()
> >   pattern (Dan)
> > * Fixed peci_device_destroy() to use kill_device() avoiding double-free (Dan)
> > * Renamed functions that perform xfer using "peci_xfer_*" prefix (Dan)
> > * Renamed peci_request_data_dib(temp) -> peci_request_dib(temp)_read (Dan)
> > * Fixed thermal margin readings for older Intel processors (Zev)
> > * Misc hwmon simplifications (Guenter)
> > * Used BITS_PER_TYPE to verify macro value constraints (Guenter)
> > * Improved WARN_ON message to print chan_rank_max and idx_dimm_max (Guenter)
> > * Improved dimmtemp to not reattempt probe if no dimms are populated
> >
> > Changes v1 -> v2:
> >
> > Biggest changes when it comes to diffstat are locking in HWMON
> > (I decided to clean things up a bit while adding it), switching to
> > devres usage in more places and exposing sysfs interface in separate patch.
> >
> > * Moved extending X86 ARCHITECTURE MAINTAINERS earlier in series (Dan)
> > * Removed "default n" for GENERIC_LIB_X86 (Dan)
> > * Added vendor prefix for peci-aspeed specific properties (Rob)
> > * Refactored PECI to use devres consistently (Dan)
> > * Added missing sysfs documentation and split out adding peci-sysfs into a
> >   separate patch (Dan)
> > * Used module_init() instead of subsys_init() for peci module initialization
> >   (Dan)
> > * Removed redundant struct peci_device member (Dan)
> > * Improved PECI Kconfig help (Randy/Dan)
> > * Fixed/removed log messages (Dan, Guenter)
> > * Refactored peci-cputemp and peci-dimmtemp and added missing locks (Guenter)
> > * Removed unused dev_set_drvdata() in peci-cputemp and peci-dimmtemp (Guenter)
> > * Fixed used types, names, fixed broken and added additional comments
> >   to peci-hwmon (Guenter, Zev)
> > * Refactored peci-dimmtemp to not return -ETIMEDOUT (Guenter)
> > * Added sanity check for min_peci_revision in peci-hwmon drivers (Zev)
> > * Added assert for DIMM_NUMS_MAX and additional warning in peci-dimmtemp (Zev)
> > * Fixed macro names in peci-aspeed (Zev)
> > * Refactored peci-aspeed sanitizing properties to a single helper function
> >   (Zev)
> > * Fixed peci_cpu_device_ids definition for Broadwell Xeon D (David)
> > * Refactor peci_request to use a single allocation (Zev)
> > * Used min_t() to improve code readability (Zev)
> > * Added macro for PECI_RDENDPTCFG_MMIO_WR_LEN_BASE and changed adev type
> >   array name to a more descriptive one (Zev)
> > * Fixed peci-hwmon commit-msg and documentation (Zev)
> >
> > Thanks
> > -Iwona
> >
> > Iwona Winiarska (11):
> >   dt-bindings: Add generic bindings for PECI
> >   dt-bindings: Add bindings for peci-aspeed
> >   ARM: dts: aspeed: Add PECI controller nodes
> >   peci: Add core infrastructure
> >   peci: Add device detection
> >   peci: Add sysfs interface for PECI bus
> >   peci: Add support for PECI device drivers
> >   peci: Add peci-cpu driver
> >   hwmon: peci: Add cputemp driver
> >   hwmon: peci: Add dimmtemp driver
> >   docs: Add PECI documentation
> >
> > Jae Hyun Yoo (2):
> >   peci: Add peci-aspeed controller driver
> >   docs: hwmon: Document PECI drivers
> >
> >  Documentation/ABI/testing/sysfs-bus-peci      |  16 +
> >  .../devicetree/bindings/peci/peci-aspeed.yaml | 109 +++
> >  .../bindings/peci/peci-controller.yaml        |  33 +
> >  Documentation/hwmon/index.rst                 |   2 +
> >  Documentation/hwmon/peci-cputemp.rst          |  90 +++
> >  Documentation/hwmon/peci-dimmtemp.rst         |  57 ++
> >  Documentation/index.rst                       |   1 +
> >  Documentation/peci/index.rst                  |  16 +
> >  Documentation/peci/peci.rst                   |  51 ++
> >  MAINTAINERS                                   |  29 +
> >  arch/arm/boot/dts/aspeed-g4.dtsi              |  14 +
> >  arch/arm/boot/dts/aspeed-g5.dtsi              |  14 +
> >  arch/arm/boot/dts/aspeed-g6.dtsi              |  14 +
> >  drivers/Kconfig                               |   3 +
> >  drivers/Makefile                              |   1 +
> >  drivers/hwmon/Kconfig                         |   2 +
> >  drivers/hwmon/Makefile                        |   1 +
> >  drivers/hwmon/peci/Kconfig                    |  31 +
> >  drivers/hwmon/peci/Makefile                   |   7 +
> >  drivers/hwmon/peci/common.h                   |  58 ++
> >  drivers/hwmon/peci/cputemp.c                  | 592 ++++++++++++++++
> >  drivers/hwmon/peci/dimmtemp.c                 | 630 ++++++++++++++++++
> >  drivers/peci/Kconfig                          |  36 +
> >  drivers/peci/Makefile                         |  10 +
> >  drivers/peci/controller/Kconfig               |  17 +
> >  drivers/peci/controller/Makefile              |   3 +
> >  drivers/peci/controller/peci-aspeed.c         | 429 ++++++++++++
> >  drivers/peci/core.c                           | 236 +++++++
> >  drivers/peci/cpu.c                            | 343 ++++++++++
> >  drivers/peci/device.c                         | 249 +++++++
> >  drivers/peci/internal.h                       | 136 ++++
> >  drivers/peci/request.c                        | 482 ++++++++++++++
> >  drivers/peci/sysfs.c                          |  82 +++
> >  include/linux/peci-cpu.h                      |  40 ++
> >  include/linux/peci.h                          | 110 +++
> >  35 files changed, 3944 insertions(+)
> >  create mode 100644 Documentation/ABI/testing/sysfs-bus-peci
> >  create mode 100644 Documentation/devicetree/bindings/peci/peci-aspeed.yaml
> >  create mode 100644 Documentation/devicetree/bindings/peci/peci-controller.yaml
> >  create mode 100644 Documentation/hwmon/peci-cputemp.rst
> >  create mode 100644 Documentation/hwmon/peci-dimmtemp.rst
> >  create mode 100644 Documentation/peci/index.rst
> >  create mode 100644 Documentation/peci/peci.rst
> >  create mode 100644 drivers/hwmon/peci/Kconfig
> >  create mode 100644 drivers/hwmon/peci/Makefile
> >  create mode 100644 drivers/hwmon/peci/common.h
> >  create mode 100644 drivers/hwmon/peci/cputemp.c
> >  create mode 100644 drivers/hwmon/peci/dimmtemp.c
> >  create mode 100644 drivers/peci/Kconfig
> >  create mode 100644 drivers/peci/Makefile
> >  create mode 100644 drivers/peci/controller/Kconfig
> >  create mode 100644 drivers/peci/controller/Makefile
> >  create mode 100644 drivers/peci/controller/peci-aspeed.c
> >  create mode 100644 drivers/peci/core.c
> >  create mode 100644 drivers/peci/cpu.c
> >  create mode 100644 drivers/peci/device.c
> >  create mode 100644 drivers/peci/internal.h
> >  create mode 100644 drivers/peci/request.c
> >  create mode 100644 drivers/peci/sysfs.c
> >  create mode 100644 include/linux/peci-cpu.h
> >  create mode 100644 include/linux/peci.h
> >