[0/1] pyverbs: fix speed_to_str(), to handle disabled links

Message ID 20191221013256.100409-1-jhubbard@nvidia.com (mailing list archive)
State Not Applicable

Commit Message

John Hubbard Dec. 21, 2019, 1:32 a.m. UTC
Hi,

This came up when I was running rdma-core tests on a two-machine setup,
where each card had two ports, but there was only one cable. So only
one port on each end was connected.

The main thing I expect to be up for debate is, what string to return
for speed, when a port is disabled or down? I initially thought about
returning '(Disabled/down)', but it seems more accurate to just report
'0.0 Gbps', so that's what I settled on.

Background: here's what I wrote when discussing this over on linux-mm
with Leon [1]:

It looks like this test suite assumes that every link is connected!
(Probably in most test systems, they are.) But in my setup, the ConnectX
cards each have two slots, and I only have (and only need) one cable. So
one link is up, and the other is disabled.

This leads to the other problem, which is that if a link is disabled,
the test suite finds a "0" token for attr.active_speed. That token is
not in the approved list, and so d.speed_to_str() asserts.
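
To illustrate the failure mode described above, here is a minimal sketch of a
table-based lookup. It is not the pyverbs implementation: the name
speed_to_str_sketch and both tables are hypothetical. Only the 32 -> '25.0 Gbps'
mapping and the 'Invalid speed' string come from the diagnostics in this thread,
and the 0 -> '0.0 Gbps' entry is the proposal from this cover letter. The other
entries are illustrative.

```python
# Hypothetical stand-in for a table-driven speed_to_str(); an assumption,
# not the actual pyverbs code.
SPEEDS_BEFORE = {1: '2.5 Gbps', 2: '5.0 Gbps', 32: '25.0 Gbps'}

# Same table plus the proposed entry for a disabled or down port.
SPEEDS_AFTER = dict(SPEEDS_BEFORE)
SPEEDS_AFTER[0] = '0.0 Gbps'


def speed_to_str_sketch(speed, table):
    """Return 'NAME (token)' for a known token, else 'Invalid speed'."""
    try:
        return '{s} ({n})'.format(s=table[speed], n=speed)
    except KeyError:
        # Unknown token: this is the string the test asserts against.
        return 'Invalid speed'


if __name__ == '__main__':
    for token in (32, 0):
        print(token,
              '| before:', speed_to_str_sketch(token, SPEEDS_BEFORE),
              '| after:', speed_to_str_sketch(token, SPEEDS_AFTER))
```

With the first table, token 0 falls through to 'Invalid speed', which is what
trips the "'Invalid' not in ..." assertion in verify_port_attr. With the extra
entry, the same token maps to a normal speed string and the check passes.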

With some diagnostics added, I can see it checking each link: one
passes, and the other asserts:


...and the test run from that is:

# ./build/bin/run_tests.py --verbose tests.test_device.DeviceTest
test_dev_list (tests.test_device.DeviceTest) ... ok
test_open_dev (tests.test_device.DeviceTest) ... ok
test_query_device (tests.test_device.DeviceTest) ... ok
test_query_device_ex (tests.test_device.DeviceTest) ... ok
test_query_gid (tests.test_device.DeviceTest) ... ok
test_query_port (tests.test_device.DeviceTest) ...
Diagnostics ===========================================
phys_state:     Link up (5)
active_width):  4X (2)
active_speed:   25.0 Gbps (32)
END of Diagnostics ====================================

Diagnostics ===========================================
phys_state:     Disabled (3)
active_width):  4X (2)
active_speed:   Invalid speed
END of Diagnostics ====================================
FAIL
test_query_port_bad_flow (tests.test_device.DeviceTest) ... ok

======================================================================
FAIL: test_query_port (tests.test_device.DeviceTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/kernel_work/rdma-core/tests/test_device.py", line 135, in test_query_port
    self.verify_port_attr(port_attr)
  File "/kernel_work/rdma-core/tests/test_device.py", line 119, in verify_port_attr
    assert 'Invalid' not in d.speed_to_str(attr.active_speed)
AssertionError

----------------------------------------------------------------------
Ran 7 tests in 0.055s

FAILED (failures=1)

[1] https://lore.kernel.org/r/b70ac328-2dc0-efe3-05c2-3e040b662256@nvidia.com


John Hubbard (1):
  pyverbs: fix speed_to_str(), to handle disabled links

 pyverbs/device.pyx | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Leon Romanovsky Dec. 21, 2019, 10:03 a.m. UTC | #1
On Fri, Dec 20, 2019 at 05:32:55PM -0800, John Hubbard wrote:
> Hi,
>
> This came up when I was running rdma-core tests on a two-machine setup,
> where each card had two ports, but there was only one cable. So only
> one port on each end was connected.
>
> The main thing I expect to be up for debate is, what string to return
> for speed, when a port is disabled or down? I initially thought about
> returning  '(Disabled/down)', but it seems more accurate to just report
> '0.0 Gbps', so that's what I settled on.
>
> Background: here's what I wrote when discussing this over on linux-mm
> with Leon [1]:
>
> It looks like this test suite assumes that every link is connected!
> (Probably in most test systems, they are.)

I don't remember whether the expectation of a connection is by design or an
outcome of my and Jason's setups, where our cards are connected in loopback mode
(port 0 to port 1 of the same card).

The loopback mode simplifies our kernel testing and development.

Thanks
Patch

diff --git a/tests/test_device.py b/tests/test_device.py
index 524e0e89..7b33d7db 100644
--- a/tests/test_device.py
+++ b/tests/test_device.py
@@ -110,6 +110,12 @@  class DeviceTest(unittest.TestCase):
         assert 'Invalid' not in d.translate_mtu(attr.max_mtu)
         assert 'Invalid' not in d.translate_mtu(attr.active_mtu)
         assert 'Invalid' not in d.width_to_str(attr.active_width)
+        print("")
+        print('Diagnostics ===========================================')
+        print('phys_state:    ', d.phys_state_to_str(attr.phys_state))
+        print('active_width): ', d.width_to_str(attr.active_width))
+        print('active_speed:  ', d.speed_to_str(attr.active_speed))
+        print('END of Diagnostics ====================================')
         assert 'Invalid' not in d.speed_to_str(attr.active_speed)
         assert 'Invalid' not in d.translate_link_layer(attr.link_layer)
         assert attr.max_msg_sz > 0x1000