
[rdma-core] kernel-boot: modules: Load ib_umad for RoCE hardware

Message ID 792b6447-1efb-a977-5d9b-22b4351c5bcb@suse.de (mailing list archive)
State Changes Requested
Delegated to: Leon Romanovsky

Commit Message

Nicolas Morey-Chaisemartin April 4, 2018, 8:26 a.m. UTC
ib_umad is required to get ibstat working.
Auto-load it for RoCE hardware so that ibstat works out of the box.

$ ibstat
ibwarn: [4638] umad_init: can't read ABI version from /sys/class/infiniband_mad/abi_version (No such file or directory): is ib_umad module loaded?
ibpanic: [4638] main: can't init UMAD library: No such file or directory
$ modprobe ib_umad
$ ibstat
CA 'bnxt_re0'
	CA type: Broadcom NetXtreme-C/E RoCE Driver HCA
	Number of ports: 1
	Firmware version: 20.8.29.0
	Hardware version: 0x14e4
	Node GUID: 0x9edc71fffeb69930
	System image GUID: 0x9edc71fffeb69930
	Port 1:
		State: Down
		Physical state: Disabled
		Rate: 100
		Base lid: 0
		LMC: 0
		SM lid: 0
		Capability mask: 0x041d0000
		Port GUID: 0x9edc71fffeb69930
		Link layer: Ethernet

Signed-off-by: Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com>
---
 kernel-boot/modules/roce.conf | 3 +++
 1 file changed, 3 insertions(+)
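
Whether a host is affected can be checked by looking for the sysfs file named in the ibwarn message above. The following is a minimal sketch, not part of the patch; the function name umad_abi_present and its optional root-directory argument are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of rdma-core): succeed if the umad ABI file
# named in the ibwarn message is readable. The optional first argument is a
# sysfs root to inspect; it defaults to the real /sys.
umad_abi_present() {
    sysfs_root="${1:-/sys}"
    [ -r "$sysfs_root/class/infiniband_mad/abi_version" ]
}

if umad_abi_present; then
    echo "abi_version found: ib_umad appears to be loaded"
else
    echo "abi_version missing: ib_umad is probably not loaded"
fi
```

The root-directory argument exists only so that the check can be tried against a scratch directory instead of the live system.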

Comments

Jason Gunthorpe April 4, 2018, 10:21 p.m. UTC | #1
On Wed, Apr 04, 2018 at 10:26:47AM +0200, Nicolas Morey-Chaisemartin wrote:
> ib_umad is required to get ibstat working.
> Auto-load it for RoCE hardware so it works out of the box

Is this just a userspace bug in ibstat or do roce ports actually implement umad?

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Nicolas Morey-Chaisemartin April 5, 2018, 6:35 a.m. UTC | #2
On 04/05/2018 12:21 AM, Jason Gunthorpe wrote:
> On Wed, Apr 04, 2018 at 10:26:47AM +0200, Nicolas Morey-Chaisemartin wrote:
> Is this just a userspace bug in ibstat or do roce ports actually implement umad?
>
> Jason

It seems like they don't.
It's just ibstat doing a umad_init() before making a few calls to umad_get_cas_names() and a few others that seem to work through simple reads in sysfs (not umad related).

Not sure what the clean way to fix this is. Removing the call to umad_init feels dirty but it's simple enough.
Any ideas?

It might be worth updating the man pages to flag which umad functions do not actually require the umad module and are safe to call without calling umad_init() first.

Nicolas
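
The "simple reads in sysfs" mentioned above can be illustrated without libibumad at all. This is a rough sketch that assumes device names appear as entries under /sys/class/infiniband; the function name list_cas and the optional root argument are hypothetical, and the snippet does not claim to mirror the library's exact implementation.

```shell
#!/bin/sh
# Hypothetical helper: print the name of each RDMA device found in sysfs.
# No umad_init() equivalent is involved; only directory entries are read.
# The optional first argument is a sysfs root (defaults to /sys).
list_cas() {
    sysfs_root="${1:-/sys}"
    for dev in "$sysfs_root"/class/infiniband/*; do
        # An unmatched glob stays literal and fails the -d test.
        [ -d "$dev" ] && basename "$dev"
    done
    return 0
}

list_cas
```

On the host from the commit message this would be expected to print the bnxt_re0 device whether or not ib_umad is loaded, provided the device driver itself is loaded.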
Jason Gunthorpe April 5, 2018, 3:20 p.m. UTC | #3
On Thu, Apr 05, 2018 at 08:35:26AM +0200, Nicolas Morey-Chaisemartin wrote:
> 
> 
> On 04/05/2018 12:21 AM, Jason Gunthorpe wrote:
> > On Wed, Apr 04, 2018 at 10:26:47AM +0200, Nicolas Morey-Chaisemartin wrote:
> > Is this just a userspace bug in ibstat or do roce ports actually implement umad?
> >
> > Jason
> 
> It seems like they don't.  It's just ibstat doing a umad_init()
> before making a few calls to umad_get_cas_names() and a few others that
> seem to work through simple reads in sysfs (not umad related).
> 
> Not sure what the clean way to fix this is. Removing the call to
> umad_init feels dirty but it's simple enough.  Any ideas?

Maybe we should make umad_init not open the umad device if the link
layer is ethernet, iwarp, etc?

Hal?

Jason
Nicolas Morey-Chaisemartin April 5, 2018, 3:40 p.m. UTC | #4
On 04/05/2018 05:20 PM, Jason Gunthorpe wrote:
> On Thu, Apr 05, 2018 at 08:35:26AM +0200, Nicolas Morey-Chaisemartin wrote:
>>
>> On 04/05/2018 12:21 AM, Jason Gunthorpe wrote:
>>>
>>> Is this just a userspace bug in ibstat or do roce ports actually implement umad?
>>>
>>> Jason
>> It seems like they don't.  It's just ibstat doing a umad_init()
>> before making a few calls to umad_get_cas_names() and a few others that
>> seem to work through simple reads in sysfs (not umad related).
>>
>> Not sure what the clean way to fix this is. Removing the call to
>> umad_init feels dirty but it's simple enough.  Any ideas?
> Maybe we should make umad_init not open the umad device if the link
> layer is ethernet, iwarp, etc?
>
> Hal?
>
> Jason
> --

As there may be multiple device types in the same host, I don't think it'd be that easy.

Right now the only thing umad_init() is doing is checking the version against the kernel ABI (or something like that).
This is not required for a lot of the umad API, as those functions only list stuff in sysfs.

Nicolas
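
The mixed-host case raised above is easy to see from sysfs, because the link layer is reported per port rather than per host. This is a small sketch that assumes each port exposes a link_layer file under /sys/class/infiniband/<device>/ports/<port>/; the function name port_link_layers and the optional root argument are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print "<device> port <n>: <link layer>" for every
# port that exposes a link_layer file. The optional first argument is a
# sysfs root (defaults to /sys).
port_link_layers() {
    sysfs_root="${1:-/sys}"
    for f in "$sysfs_root"/class/infiniband/*/ports/*/link_layer; do
        [ -r "$f" ] || continue
        port_dir=$(dirname "$f")                          # .../<dev>/ports/<n>
        dev_dir=$(dirname "$(dirname "$port_dir")")       # .../<dev>
        printf '%s port %s: %s\n' \
            "$(basename "$dev_dir")" "$(basename "$port_dir")" "$(cat "$f")"
    done
}

port_link_layers
```

A host with both an InfiniBand HCA and a RoCE NIC would show both values in the output, which is why a single host-wide decision inside umad_init() does not fit well.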
Jason Gunthorpe April 5, 2018, 4:05 p.m. UTC | #5
On Thu, Apr 05, 2018 at 05:40:46PM +0200, Nicolas Morey-Chaisemartin wrote:
> 
> 
> On 04/05/2018 05:20 PM, Jason Gunthorpe wrote:
> > On Thu, Apr 05, 2018 at 08:35:26AM +0200, Nicolas Morey-Chaisemartin wrote:
> >>
> >> On 04/05/2018 12:21 AM, Jason Gunthorpe wrote:
> >>>
> >>> Is this just a userspace bug in ibstat or do roce ports actually implement umad?
> >>>
> >>> Jason
> >> It seems like they don't.  It's just ibstat doing a umad_init()
> >> before making a few calls to umad_get_cas_names() and a few others that
> >> seem to work through simple reads in sysfs (not umad related).
> >>
> >> Not sure what the clean way to fix this is. Removing the call to
> >> umad_init feels dirty but it's simple enough.  Any ideas?
> > Maybe we should make umad_init not open the umad device if the link
> > layer is ethernet, iwarp, etc?
> >
> > Hal?
> >
> > Jason
> 
> As there may be multiple device types in the same host, I don't think it'd be that easy.
> 
> Right now the only thing umad_init() is doing is checking the version against the kernel ABI (or something like that).
> This is not required for a lot of the umad API as they only list stuff in sysfs.

It is not umad_init we need to remove, but umad_open_port should not
be called if the link layer is ethernet.

Jason
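
The rule suggested above, skipping the umad port when the link layer is Ethernet, can be written as a per-port predicate. This sketch only illustrates the rule; it does not describe how ibstat or libibumad currently behave, and the function name port_needs_umad and its arguments are hypothetical.

```shell
#!/bin/sh
# Hypothetical predicate: succeed only if the port reports a link layer
# other than Ethernet, i.e. a port for which opening umad would make sense
# under the rule proposed in the thread.
#   $1 = device name, $2 = port number, $3 = optional sysfs root (/sys)
port_needs_umad() {
    link_layer_file="${3:-/sys}/class/infiniband/$1/ports/$2/link_layer"
    # Treat an unreadable or missing file as "do not open umad".
    [ -r "$link_layer_file" ] || return 1
    [ "$(cat "$link_layer_file")" != "Ethernet" ]
}

if port_needs_umad bnxt_re0 1; then
    echo "bnxt_re0 port 1: would open the umad port"
else
    echo "bnxt_re0 port 1: would skip the umad port"
fi
```

Treating a missing link_layer file as "skip" is a design choice of this sketch, not something stated in the thread.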
Nicolas Morey-Chaisemartin April 5, 2018, 4:34 p.m. UTC | #6
On 04/05/2018 06:05 PM, Jason Gunthorpe wrote:
> On Thu, Apr 05, 2018 at 05:40:46PM +0200, Nicolas Morey-Chaisemartin wrote:
>>
>> On 04/05/2018 05:20 PM, Jason Gunthorpe wrote:
>>> On Thu, Apr 05, 2018 at 08:35:26AM +0200, Nicolas Morey-Chaisemartin wrote:
>>>> On 04/05/2018 12:21 AM, Jason Gunthorpe wrote:
>>>>> Is this just a userspace bug in ibstat or do roce ports actually implement umad?
>>>>>
>>>>> Jason
>>>> It seems like they don't.  It's just ibstat doing a umad_init()
>>>> before making a few calls to umad_get_cas_names() and a few others that
>>>> seem to work through simple reads in sysfs (not umad related).
>>>>
>>>> Not sure what the clean way to fix this is. Removing the call to
>>>> umad_init feels dirty but it's simple enough.  Any ideas?
>>> Maybe we should make umad_init not open the umad device if the link
>>> layer is ethernet, iwarp, etc?
>>>
>>> Hal?
>>>
>>> Jason
>> As there may be multiple device types in the same host, I don't think it'd be that easy.
>>
>> Right now the only thing umad_init() is doing is checking the version against the kernel ABI (or something like that).
>> This is not required for a lot of the umad API as they only list stuff in sysfs.
> It is not umad_init we need to remove, but umad_open_port should not
> be called if the link layer is ethernet.
>
> Jason

For ibstat, I don't think it is calling umad_open_port.
Its only calls into umad are umad_init, umad_get_cas_names, umad_get_ca and umad_get_ca_portguid, and it doesn't seem like any of these call umad_open_port.
The failure from ibstat is due to umad_init, which fails because there is no /sys/class/infiniband_mad/abi_version: we only have RoCE hardware, and the ib_umad module is not loaded automatically.

Nicolas

Patch

diff --git a/kernel-boot/modules/roce.conf b/kernel-boot/modules/roce.conf
index 8e4927ce26f0..982b929f429a 100644
--- a/kernel-boot/modules/roce.conf
+++ b/kernel-boot/modules/roce.conf
@@ -1,2 +1,5 @@ 
 # These modules are loaded by the system if any RDMA over Converged Ethernet
 # device is installed
+
+# Access to fabric management SMPs and GMPs from userspace.
+ib_umad
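
With the hunk above applied, roce.conf names exactly one module. The sketch below extracts the module names from such a file, assuming only the format visible in the hunk (one module name per line, '#' starting a comment); the function name list_modules is hypothetical, and the snippet is not how the file is consumed at boot.

```shell
#!/bin/sh
# Hypothetical helper: print the module names listed in a modules file,
# dropping '#' comments and blank lines.
#   $1 = path to the file, e.g. kernel-boot/modules/roce.conf
list_modules() {
    sed -e 's/#.*//' -e '/^[[:space:]]*$/d' "$1"
}

# Example run against a scratch copy of the patched file contents.
conf=$(mktemp)
cat > "$conf" <<'EOF'
# These modules are loaded by the system if any RDMA over Converged Ethernet
# device is installed

# Access to fabric management SMPs and GMPs from userspace.
ib_umad
EOF
list_modules "$conf"
rm -f "$conf"
```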