diff mbox series

RDMA/rxe: prevent rxe creation on top of vlan interface

Message ID 20200811150415.3693-1-goody698@gmail.com (mailing list archive)
State Accepted
Delegated to: Jason Gunthorpe
Headers show
Series RDMA/rxe: prevent rxe creation on top of vlan interface | expand

Commit Message

mohammad Aug. 11, 2020, 3:04 p.m. UTC
Creating rxe device on top of vlan interface will create a non-functional device
that has an empty gids table and can't be used for rdma cm communication.

This is caused by the logic in enum_all_gids_of_dev_cb()/is_eth_port_of_netdev(),
which only considers networks connected to "upper devices" of the configured
network device, resulting in an empty set of gids for a vlan interface,
and attempts to connect via this rdma device fail in cm_init_av_for_response
because no gids can be resolved.

apparently, this behavior was implemented to fit the HW-RoCE devices that create
RoCE device per port, therefore RXE must behave the same like HW-RoCE devices
and create rxe device per real device only.

In order to communicate via a vlan interface, the user must use the gid index of
the vlan address instead of creating rxe over vlan.

Signed-off-by: Mohammad Heib <goody698@gmail.com>
---
 drivers/infiniband/sw/rxe/rxe.c       | 6 ++++++
 drivers/infiniband/sw/rxe/rxe_sysfs.c | 6 ++++++
 2 files changed, 12 insertions(+)

Comments

Zhu Yanjun Aug. 13, 2020, 3:16 a.m. UTC | #1
On 8/11/2020 11:04 PM, Mohammad Heib wrote:
> Creating rxe device on top of vlan interface will create a non-functional device
> that has an empty gids table and can't be used for rdma cm communication.

Can we fix this empty gids table?

So this gids table can be used for rdma cm communication.

Zhu Yanjun

>
> This is caused by the logic in enum_all_gids_of_dev_cb()/is_eth_port_of_netdev(),
> which only considers networks connected to "upper devices" of the configured
> network device, resulting in an empty set of gids for a vlan interface,
> and attempts to connect via this rdma device fail in cm_init_av_for_response
> because no gids can be resolved.
>
> apparently, this behavior was implemented to fit the HW-RoCE devices that create
> RoCE device per port, therefore RXE must behave the same like HW-RoCE devices
> and create rxe device per real device only.
>
> In order to communicate via a vlan interface, the user must use the gid index of
> the vlan address instead of creating rxe over vlan.
>
> Signed-off-by: Mohammad Heib <goody698@gmail.com>
> ---
>   drivers/infiniband/sw/rxe/rxe.c       | 6 ++++++
>   drivers/infiniband/sw/rxe/rxe_sysfs.c | 6 ++++++
>   2 files changed, 12 insertions(+)
>
> diff --git a/drivers/infiniband/sw/rxe/rxe.c b/drivers/infiniband/sw/rxe/rxe.c
> index 5642eefb4ba1..d2076aa7a732 100644
> --- a/drivers/infiniband/sw/rxe/rxe.c
> +++ b/drivers/infiniband/sw/rxe/rxe.c
> @@ -310,6 +310,12 @@ static int rxe_newlink(const char *ibdev_name, struct net_device *ndev)
>   	struct rxe_dev *exists;
>   	int err = 0;
>   
> +	if (is_vlan_dev(ndev)) {
> +		pr_err("rxe creation allowed on top of a real device only\n");
> +		err = -EPERM;
> +		goto err;
> +	}
> +
>   	exists = rxe_get_dev_from_net(ndev);
>   	if (exists) {
>   		ib_device_put(&exists->ib_dev);
> diff --git a/drivers/infiniband/sw/rxe/rxe_sysfs.c b/drivers/infiniband/sw/rxe/rxe_sysfs.c
> index ccda5f5a3bc0..0a083c3d900a 100644
> --- a/drivers/infiniband/sw/rxe/rxe_sysfs.c
> +++ b/drivers/infiniband/sw/rxe/rxe_sysfs.c
> @@ -73,6 +73,12 @@ static int rxe_param_set_add(const char *val, const struct kernel_param *kp)
>   		return -EINVAL;
>   	}
>   
> +	if (is_vlan_dev(ndev)) {
> +		pr_err("rxe creation allowed on top of a real device only\n");
> +		err = -EPERM;
> +		goto err;
> +	}
> +
>   	exists = rxe_get_dev_from_net(ndev);
>   	if (exists) {
>   		ib_device_put(&exists->ib_dev);
mohammad Aug. 13, 2020, 9:52 a.m. UTC | #2
On 8/13/20 6:16 AM, Zhu Yanjun wrote:
> On 8/11/2020 11:04 PM, Mohammad Heib wrote:
>> Creating rxe device on top of vlan interface will create a 
>> non-functional device
>> that has an empty gids table and can't be used for rdma cm 
>> communication.
>
> Can we fix this empty gids table?

I'm not sure if that can be done in the rxe only, since the IP's of the 
netdev enrolls into

the gids table by the ibcore in the device registration 
stage(gid_table_setup_one) and that requires making changes in the ibcore.


we can enroll those IP's into the device gids table after the rxe device 
registration but still

we have to expose some ibcore functions (add_netdev_ips) and track the 
address change events of the

vlan interface and update the gids table accordingly to those IP's changes.

>
> So this gids table can be used for rdma cm communication.
>
> Zhu Yanjun
>
>>
>> This is caused by the logic in 
>> enum_all_gids_of_dev_cb()/is_eth_port_of_netdev(),
>> which only considers networks connected to "upper devices" of the 
>> configured
>> network device, resulting in an empty set of gids for a vlan interface,
>> and attempts to connect via this rdma device fail in 
>> cm_init_av_for_response
>> because no gids can be resolved.
>>
>> apparently, this behavior was implemented to fit the HW-RoCE devices 
>> that create
>> RoCE device per port, therefore RXE must behave the same like HW-RoCE 
>> devices
>> and create rxe device per real device only.
>>
>> In order to communicate via a vlan interface, the user must use the 
>> gid index of
>> the vlan address instead of creating rxe over vlan.
>>
>> Signed-off-by: Mohammad Heib <goody698@gmail.com>
>> ---
>>   drivers/infiniband/sw/rxe/rxe.c       | 6 ++++++
>>   drivers/infiniband/sw/rxe/rxe_sysfs.c | 6 ++++++
>>   2 files changed, 12 insertions(+)
>>
>> diff --git a/drivers/infiniband/sw/rxe/rxe.c 
>> b/drivers/infiniband/sw/rxe/rxe.c
>> index 5642eefb4ba1..d2076aa7a732 100644
>> --- a/drivers/infiniband/sw/rxe/rxe.c
>> +++ b/drivers/infiniband/sw/rxe/rxe.c
>> @@ -310,6 +310,12 @@ static int rxe_newlink(const char *ibdev_name, 
>> struct net_device *ndev)
>>       struct rxe_dev *exists;
>>       int err = 0;
>>   +    if (is_vlan_dev(ndev)) {
>> +        pr_err("rxe creation allowed on top of a real device only\n");
>> +        err = -EPERM;
>> +        goto err;
>> +    }
>> +
>>       exists = rxe_get_dev_from_net(ndev);
>>       if (exists) {
>>           ib_device_put(&exists->ib_dev);
>> diff --git a/drivers/infiniband/sw/rxe/rxe_sysfs.c 
>> b/drivers/infiniband/sw/rxe/rxe_sysfs.c
>> index ccda5f5a3bc0..0a083c3d900a 100644
>> --- a/drivers/infiniband/sw/rxe/rxe_sysfs.c
>> +++ b/drivers/infiniband/sw/rxe/rxe_sysfs.c
>> @@ -73,6 +73,12 @@ static int rxe_param_set_add(const char *val, 
>> const struct kernel_param *kp)
>>           return -EINVAL;
>>       }
>>   +    if (is_vlan_dev(ndev)) {
>> +        pr_err("rxe creation allowed on top of a real device only\n");
>> +        err = -EPERM;
>> +        goto err;
>> +    }
>> +
>>       exists = rxe_get_dev_from_net(ndev);
>>       if (exists) {
>>           ib_device_put(&exists->ib_dev);
>
>
Zhu Yanjun Aug. 13, 2020, 4:13 p.m. UTC | #3
On 8/13/2020 5:52 PM, mohammad wrote:
>
> On 8/13/20 6:16 AM, Zhu Yanjun wrote:
>> On 8/11/2020 11:04 PM, Mohammad Heib wrote:
>>> Creating rxe device on top of vlan interface will create a 
>>> non-functional device
>>> that has an empty gids table and can't be used for rdma cm 
>>> communication.
>>
>> Can we fix this empty gids table?
>
> I'm not sure if that can be done in the rxe only, since the IP's of 
> the netdev enrolls into
>
> the gids table by the ibcore in the device registration 
> stage(gid_table_setup_one) and that requires making changes in the 
> ibcore.
>
>
> we can enroll those IP's into the device gids table after the rxe 
> device registration but still
>
> we have to expose some ibcore functions (add_netdev_ips) and track the 
> address change events of the
>
> vlan interface and update the gids table accordingly to those IP's 
> changes.

Thanks a lot for your explanations.

Not sure if a patch containing the above is better than just preventing 
rxe creation on top of vlan interface.

Zhu Yanjun

>
>>
>> So this gids table can be used for rdma cm communication.
>>
>> Zhu Yanjun
>>
>>>
>>> This is caused by the logic in 
>>> enum_all_gids_of_dev_cb()/is_eth_port_of_netdev(),
>>> which only considers networks connected to "upper devices" of the 
>>> configured
>>> network device, resulting in an empty set of gids for a vlan interface,
>>> and attempts to connect via this rdma device fail in 
>>> cm_init_av_for_response
>>> because no gids can be resolved.
>>>
>>> apparently, this behavior was implemented to fit the HW-RoCE devices 
>>> that create
>>> RoCE device per port, therefore RXE must behave the same like 
>>> HW-RoCE devices
>>> and create rxe device per real device only.
>>>
>>> In order to communicate via a vlan interface, the user must use the 
>>> gid index of
>>> the vlan address instead of creating rxe over vlan.
>>>
>>> Signed-off-by: Mohammad Heib <goody698@gmail.com>
>>> ---
>>>   drivers/infiniband/sw/rxe/rxe.c       | 6 ++++++
>>>   drivers/infiniband/sw/rxe/rxe_sysfs.c | 6 ++++++
>>>   2 files changed, 12 insertions(+)
>>>
>>> diff --git a/drivers/infiniband/sw/rxe/rxe.c 
>>> b/drivers/infiniband/sw/rxe/rxe.c
>>> index 5642eefb4ba1..d2076aa7a732 100644
>>> --- a/drivers/infiniband/sw/rxe/rxe.c
>>> +++ b/drivers/infiniband/sw/rxe/rxe.c
>>> @@ -310,6 +310,12 @@ static int rxe_newlink(const char *ibdev_name, 
>>> struct net_device *ndev)
>>>       struct rxe_dev *exists;
>>>       int err = 0;
>>>   +    if (is_vlan_dev(ndev)) {
>>> +        pr_err("rxe creation allowed on top of a real device only\n");
>>> +        err = -EPERM;
>>> +        goto err;
>>> +    }
>>> +
>>>       exists = rxe_get_dev_from_net(ndev);
>>>       if (exists) {
>>>           ib_device_put(&exists->ib_dev);
>>> diff --git a/drivers/infiniband/sw/rxe/rxe_sysfs.c 
>>> b/drivers/infiniband/sw/rxe/rxe_sysfs.c
>>> index ccda5f5a3bc0..0a083c3d900a 100644
>>> --- a/drivers/infiniband/sw/rxe/rxe_sysfs.c
>>> +++ b/drivers/infiniband/sw/rxe/rxe_sysfs.c
>>> @@ -73,6 +73,12 @@ static int rxe_param_set_add(const char *val, 
>>> const struct kernel_param *kp)
>>>           return -EINVAL;
>>>       }
>>>   +    if (is_vlan_dev(ndev)) {
>>> +        pr_err("rxe creation allowed on top of a real device only\n");
>>> +        err = -EPERM;
>>> +        goto err;
>>> +    }
>>> +
>>>       exists = rxe_get_dev_from_net(ndev);
>>>       if (exists) {
>>>           ib_device_put(&exists->ib_dev);
>>
>>
Jason Gunthorpe Aug. 24, 2020, 5:14 p.m. UTC | #4
On Tue, Aug 11, 2020 at 06:04:15PM +0300, Mohammad Heib wrote:
> Creating rxe device on top of vlan interface will create a non-functional device
> that has an empty gids table and can't be used for rdma cm communication.
> 
> This is caused by the logic in enum_all_gids_of_dev_cb()/is_eth_port_of_netdev(),
> which only considers networks connected to "upper devices" of the configured
> network device, resulting in an empty set of gids for a vlan interface,
> and attempts to connect via this rdma device fail in cm_init_av_for_response
> because no gids can be resolved.
> 
> apparently, this behavior was implemented to fit the HW-RoCE devices that create
> RoCE device per port, therefore RXE must behave the same like HW-RoCE devices
> and create rxe device per real device only.
> 
> In order to communicate via a vlan interface, the user must use the gid index of
> the vlan address instead of creating rxe over vlan.
> 
> Signed-off-by: Mohammad Heib <goody698@gmail.com>
> ---
>  drivers/infiniband/sw/rxe/rxe.c       | 6 ++++++
>  drivers/infiniband/sw/rxe/rxe_sysfs.c | 6 ++++++
>  2 files changed, 12 insertions(+)

It would be better to somehow fix things so the gid table was properly
populated, but until that is done, blocking it is a reasonable thing
to do.

Applied to for-next, thanks

Jason
diff mbox series

Patch

diff --git a/drivers/infiniband/sw/rxe/rxe.c b/drivers/infiniband/sw/rxe/rxe.c
index 5642eefb4ba1..d2076aa7a732 100644
--- a/drivers/infiniband/sw/rxe/rxe.c
+++ b/drivers/infiniband/sw/rxe/rxe.c
@@ -310,6 +310,12 @@  static int rxe_newlink(const char *ibdev_name, struct net_device *ndev)
 	struct rxe_dev *exists;
 	int err = 0;
 
+	if (is_vlan_dev(ndev)) {
+		pr_err("rxe creation allowed on top of a real device only\n");
+		err = -EPERM;
+		goto err;
+	}
+
 	exists = rxe_get_dev_from_net(ndev);
 	if (exists) {
 		ib_device_put(&exists->ib_dev);
diff --git a/drivers/infiniband/sw/rxe/rxe_sysfs.c b/drivers/infiniband/sw/rxe/rxe_sysfs.c
index ccda5f5a3bc0..0a083c3d900a 100644
--- a/drivers/infiniband/sw/rxe/rxe_sysfs.c
+++ b/drivers/infiniband/sw/rxe/rxe_sysfs.c
@@ -73,6 +73,12 @@  static int rxe_param_set_add(const char *val, const struct kernel_param *kp)
 		return -EINVAL;
 	}
 
+	if (is_vlan_dev(ndev)) {
+		pr_err("rxe creation allowed on top of a real device only\n");
+		err = -EPERM;
+		goto err;
+	}
+
 	exists = rxe_get_dev_from_net(ndev);
 	if (exists) {
 		ib_device_put(&exists->ib_dev);