[RFC,0/1] A regression in cxl_endpoint_gather_bandwidth()

Message ID VI1PR10MB2016FF2DAF7BFB4E5907F419CE232@VI1PR10MB2016.EURPRD10.PROD.OUTLOOK.COM
Li Ming Nov. 22, 2024, 7:20 a.m. UTC
My cxl-test environment consistently hits a KASAN BUG when I run the cxl
tests on top of the cxl-for-6.13 tag.
base-commit: a83383e2ae7c499ff7b318945d9b2fe4e3006c2c

The call trace is:
 BUG: KASAN: slab-out-of-bounds in pcie_capability_read_word+0x1df/0x220

 Call Trace:
  <TASK>
  dump_stack_lvl+0x82/0xd0
  print_report+0xcb/0x5d0
  kasan_report+0xbd/0xf0
  pcie_capability_read_word+0x1df/0x220
  pcie_link_speed_mbps+0x6a/0x130
  cxl_pci_get_bandwidth+0x68/0x1c0 [cxl_core]
  cxl_endpoint_gather_bandwidth.constprop.0+0x352/0x780 [cxl_core]
  cxl_region_shared_upstream_bandwidth_update+0x257/0x1640 [cxl_core]
  cxl_region_attach+0x1025/0x1e80 [cxl_core]
  cxl_add_to_region+0x121/0x14c0 [cxl_core]
  discover_region+0xa5/0x150 [cxl_port]

The root cause is that cxl_endpoint_gather_bandwidth() assumes every
device is a PCI device, but cxl-test creates its mock devices as platform
devices.
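
To illustrate why treating a platform device as a PCI device can produce an
out-of-bounds read, here is a minimal user-space sketch. It assumes the PCI
path converts the struct device pointer with a to_pci_dev()-style
container_of(); that detail is not stated in this report. The mock_* structs
and the helper functions are hypothetical stand-ins, not the kernel's
definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in; the real struct device is far larger. */
struct device {
	const char *init_name;
};

/* Hypothetical PCI-like container: 'dev' sits deep inside the struct. */
struct mock_pci_dev {
	uint16_t pcie_cap;        /* stand-in for the PCIe capability offset */
	uint8_t other_state[64];  /* placeholder for fields that precede 'dev' */
	struct device dev;
};

/* Hypothetical platform-like container: 'dev' sits near the start. */
struct mock_platform_device {
	const char *name;
	struct device dev;
};

/* Bytes a to_pci_dev()-style container_of() subtracts from the pointer. */
static size_t pci_dev_offset(void)
{
	return offsetof(struct mock_pci_dev, dev);
}

/* Bytes that would be correct for the platform container. */
static size_t platform_dev_offset(void)
{
	return offsetof(struct mock_platform_device, dev);
}

/* How far before the real object the mis-converted pointer would land. */
static size_t bytes_before_object(void)
{
	return pci_dev_offset() - platform_dev_offset();
}

static_assert(offsetof(struct mock_pci_dev, dev) >
	      offsetof(struct mock_platform_device, dev),
	      "mock layout must place 'dev' deeper in the PCI-like struct");
```

With this mock layout, the converted pointer starts before the allocation
that holds the platform device, so reading any field through it touches
memory outside that object. That would be consistent with the
slab-out-of-bounds report above, under the stated assumption.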

Initially, I tried to implement a __wrap_cxl_pci_get_bandwidth() function
on the cxl_mock side so that it could fake a bandwidth for platform
devices, but the build failed with the following error:

depmod: ERROR: Cycle detected: cxl_mock -> cxl_core -> cxl_mock
depmod: ERROR: Found 2 modules in dependency cycles!

My understanding is that the error occurs because
cxl_pci_get_bandwidth() is both implemented in and called from the
cxl_core module, so a wrapper in cxl_mock would make cxl_core depend on
cxl_mock while cxl_mock already depends on cxl_core.

So the current fix uses dev_is_platform() inside cxl_pci_get_bandwidth()
to check whether the device is a platform device and, if so, fakes a
maximum bandwidth for it.
Another option is to make cxl_endpoint_gather_bandwidth() work only for
PCI devices by adding a dev_is_pci() check on the device type inside
cxl_endpoint_gather_bandwidth().
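
The two options can be compared in a small user-space sketch. It is not the
patch itself: struct device and struct bus_type are reduced to the single
field needed here, dev_is_pci() and dev_is_platform() are re-implemented as
the same bus-pointer comparison the kernel macros perform, and the mock_*
functions, the bandwidth constants, and the -ENODEV return value are all
assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for kernel types; only the field this sketch needs. */
struct bus_type {
	const char *name;
};

struct device {
	const struct bus_type *bus;
};

static const struct bus_type pci_bus_type = { .name = "pci" };
static const struct bus_type platform_bus_type = { .name = "platform" };

static bool dev_is_pci(const struct device *dev)
{
	return dev->bus == &pci_bus_type;
}

static bool dev_is_platform(const struct device *dev)
{
	return dev->bus == &platform_bus_type;
}

/* Hypothetical values, chosen only so the two paths are distinguishable. */
#define MOCK_FAKE_BW_MBPS     64000u
#define MOCK_PCI_LINK_BW_MBPS 32000u

/* Option 1: fake a bandwidth for platform devices in the getter. */
static int mock_get_bandwidth(const struct device *dev, unsigned int *bw)
{
	assert(dev != NULL && bw != NULL);

	if (dev_is_platform(dev)) {
		*bw = MOCK_FAKE_BW_MBPS;
		return 0;
	}
	if (!dev_is_pci(dev))
		return -ENODEV;

	/* Real code would derive this from PCIe link speed and width. */
	*bw = MOCK_PCI_LINK_BW_MBPS;
	return 0;
}

/* Option 2: the caller refuses non-PCI devices before any PCI access. */
static int mock_gather_bandwidth(const struct device *dev, unsigned int *bw)
{
	assert(dev != NULL && bw != NULL);

	if (!dev_is_pci(dev))
		return -ENODEV; /* assumed error code; *bw is left untouched */

	*bw = MOCK_PCI_LINK_BW_MBPS;
	return 0;
}
```

In this sketch, option 1 lets platform-backed mock devices still contribute
a bandwidth value, while option 2 keeps the getter PCI-only and leaves the
caller to decide how a non-PCI device is handled.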

Li Ming (1):
  cxl/pci: Fake bandwidth for platform devices

 drivers/cxl/core/cdat.c |  5 ++---
 drivers/cxl/core/core.h |  2 +-
 drivers/cxl/core/pci.c  | 24 +++++++++++++++++-------
 3 files changed, 20 insertions(+), 11 deletions(-)