RDMA/siw,rxe: Make emulated devices virtual in the device tree

Message ID 0-v1-dcbfc68c4b4a+d6-virtual_dev_jgg@nvidia.com (mailing list archive)
State Accepted
Delegated to: Jason Gunthorpe

Commit Message

Jason Gunthorpe Nov. 6, 2020, 2 p.m. UTC
This moves siw and rxe to be virtual devices in the device tree:

lrwxrwxrwx 1 root root 0 Nov  6 13:55 /sys/class/infiniband/rxe0 -> ../../devices/virtual/infiniband/rxe0/

Previously they were trying to parent themselves to the physical device of
their attached netdev, which doesn't make a lot of sense.

My hope is this will solve some weird syzkaller hits related to sysfs, as
it could be possible that the parent of a netdev is another netdev, e.g.
under bonding or some other syzkaller-found netdev configuration.

Nesting an ib_device under anything but a physical device is going to cause
inconsistencies in sysfs during destruction.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/infiniband/sw/rxe/rxe_net.c   | 12 ------------
 drivers/infiniband/sw/rxe/rxe_verbs.c |  1 -
 drivers/infiniband/sw/siw/siw_main.c  | 19 +------------------
 3 files changed, 1 insertion(+), 31 deletions(-)

Comments

Bart Van Assche Nov. 7, 2020, 1:04 a.m. UTC | #1
On 11/6/20 6:00 AM, Jason Gunthorpe wrote:
> This moves siw and rxe to be virtual devices in the device tree:
> 
> lrwxrwxrwx 1 root root 0 Nov  6 13:55 /sys/class/infiniband/rxe0 -> ../../devices/virtual/infiniband/rxe0/
> 
> Previously they were trying to parent themselves to the physical device of
> their attached netdev, which doesn't make a lot of sense.
> 
> My hope is this will solve some weird syzkaller hits related to sysfs, as
> it could be possible that the parent of a netdev is another netdev, e.g.
> under bonding or some other syzkaller-found netdev configuration.
> 
> Nesting an ib_device under anything but a physical device is going to cause
> inconsistencies in sysfs during destruction.

Hi Jason,

I do not know enough about the code touched by this patch to comment on
the patch itself. But I expect that the blktests code will have to be
modified to compensate for this change. How can the name of a virtual
RDMA device be translated into a netdev name with this patch applied?

From the blktests project:

# Check whether or not an rdma_rxe instance has been associated with
# network interface $1.
has_rdma_rxe() {
	local f

	for f in /sys/class/infiniband/*/parent; do
		if [ -e "$f" ] && [ "$(<"$f")" = "$1" ]; then
			return 0
		fi
	done

	return 1
}

rdma_dev_to_net_dev() {
	local b d rdma_dev=$1

	b=/sys/class/infiniband/$rdma_dev/parent
	if [ -e "$b" ]; then
		echo "$(<"$b")"
	else
		echo "${rdma_dev%_siw}"
	fi
}

Bart.
Jason Gunthorpe Nov. 9, 2020, 8:54 p.m. UTC | #2
On Fri, Nov 06, 2020 at 05:04:24PM -0800, Bart Van Assche wrote:
> On 11/6/20 6:00 AM, Jason Gunthorpe wrote:
> > This moves siw and rxe to be virtual devices in the device tree:
> > 
> > lrwxrwxrwx 1 root root 0 Nov  6 13:55 /sys/class/infiniband/rxe0 -> ../../devices/virtual/infiniband/rxe0/
> > 
> > Previously they were trying to parent themselves to the physical device of
> > their attached netdev, which doesn't make a lot of sense.
> > 
> > My hope is this will solve some weird syzkaller hits related to sysfs, as
> > it could be possible that the parent of a netdev is another netdev, e.g.
> > under bonding or some other syzkaller-found netdev configuration.
> > 
> > Nesting an ib_device under anything but a physical device is going to cause
> > inconsistencies in sysfs during destruction.
> 
> Hi Jason,
> 
> I do not know enough about the code touched by this patch to comment on
> the patch itself. But I expect that the blktests code will have to be
> modified to compensate for this change. How can the name of a virtual
> RDMA device be translated into a netdev name with this patch applied?

$ rdma link
link rxe0/1 state ACTIVE physical_state LINK_UP netdev eth1 

is the correct way to do that.

Jason
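
For illustration, the `rdma link` lookup suggested above could replace the sysfs `parent` read in a test helper roughly as follows. This is only a sketch: the function name `rdma_link_to_net_dev` is hypothetical (it is not taken from blktests), and it assumes the output keeps the `link <dev>/<port> ... netdev <name>` shape shown in the reply. It reads the command output on stdin so it can be tried without an RDMA device present.

```shell
# Hypothetical helper (not from blktests): print the netdev that
# `rdma link` reports for RDMA device $1. Expects `rdma link` output
# on stdin, one "link <dev>/<port> ... netdev <name>" entry per line.
rdma_link_to_net_dev() {
	local rdma_dev kind port rest
	rdma_dev=$1

	while read -r kind port rest; do
		# Skip anything that is not a link entry for the requested device.
		[ "$kind" = "link" ] || continue
		[ "${port%%/*}" = "$rdma_dev" ] || continue

		# Walk the remaining words looking for the "netdev <name>" pair.
		# $rest is deliberately unquoted so it splits into words.
		set -- $rest
		while [ $# -ge 2 ]; do
			if [ "$1" = "netdev" ]; then
				echo "$2"
				return 0
			fi
			shift
		done
	done

	# No matching link entry, or the entry carried no netdev field.
	return 1
}
```

Usage would then be along the lines of `rdma link | rdma_link_to_net_dev rxe0`, which prints the netdev name and returns non-zero when the device has no netdev entry.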
Jason Gunthorpe Nov. 23, 2020, 8:15 p.m. UTC | #3
On Fri, Nov 06, 2020 at 10:00:49AM -0400, Jason Gunthorpe wrote:
> This moves siw and rxe to be virtual devices in the device tree:
> 
> lrwxrwxrwx 1 root root 0 Nov  6 13:55 /sys/class/infiniband/rxe0 -> ../../devices/virtual/infiniband/rxe0/
> 
> Previously they were trying to parent themselves to the physical device of
> their attached netdev, which doesn't make a lot of sense.
> 
> My hope is this will solve some weird syzkaller hits related to sysfs, as
> it could be possible that the parent of a netdev is another netdev, e.g.
> under bonding or some other syzkaller-found netdev configuration.
> 
> Nesting an ib_device under anything but a physical device is going to cause
> inconsistencies in sysfs during destruction.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/infiniband/sw/rxe/rxe_net.c   | 12 ------------
>  drivers/infiniband/sw/rxe/rxe_verbs.c |  1 -
>  drivers/infiniband/sw/siw/siw_main.c  | 19 +------------------
>  3 files changed, 1 insertion(+), 31 deletions(-)

Applied to for-next

Jason
Patch

diff --git a/drivers/infiniband/sw/rxe/rxe_net.c b/drivers/infiniband/sw/rxe/rxe_net.c
index 575e1a4ec82121..2b4238cdeab953 100644
--- a/drivers/infiniband/sw/rxe/rxe_net.c
+++ b/drivers/infiniband/sw/rxe/rxe_net.c
@@ -20,18 +20,6 @@ 
 
 static struct rxe_recv_sockets recv_sockets;
 
-struct device *rxe_dma_device(struct rxe_dev *rxe)
-{
-	struct net_device *ndev;
-
-	ndev = rxe->ndev;
-
-	if (is_vlan_dev(ndev))
-		ndev = vlan_dev_real_dev(ndev);
-
-	return ndev->dev.parent;
-}
-
 int rxe_mcast_add(struct rxe_dev *rxe, union ib_gid *mgid)
 {
 	int err;
diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c
index 209c7b3fab97a2..0cc4116d9a1fa6 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.c
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
@@ -1134,7 +1134,6 @@  int rxe_register_device(struct rxe_dev *rxe, const char *ibdev_name)
 	dev->node_type = RDMA_NODE_IB_CA;
 	dev->phys_port_cnt = 1;
 	dev->num_comp_vectors = num_possible_cpus();
-	dev->dev.parent = rxe_dma_device(rxe);
 	dev->local_dma_lkey = 0;
 	addrconf_addr_eui48((unsigned char *)&dev->node_guid,
 			    rxe->ndev->dev_addr);
diff --git a/drivers/infiniband/sw/siw/siw_main.c b/drivers/infiniband/sw/siw/siw_main.c
index 9cf596429dbf7d..97cf43bf0244cd 100644
--- a/drivers/infiniband/sw/siw/siw_main.c
+++ b/drivers/infiniband/sw/siw/siw_main.c
@@ -305,24 +305,8 @@  static struct siw_device *siw_device_create(struct net_device *netdev)
 {
 	struct siw_device *sdev = NULL;
 	struct ib_device *base_dev;
-	struct device *parent = netdev->dev.parent;
 	int rv;
 
-	if (!parent) {
-		/*
-		 * The loopback device has no parent device,
-		 * so it appears as a top-level device. To support
-		 * loopback device connectivity, take this device
-		 * as the parent device. Skip all other devices
-		 * w/o parent device.
-		 */
-		if (netdev->type != ARPHRD_LOOPBACK) {
-			pr_warn("siw: device %s error: no parent device\n",
-				netdev->name);
-			return NULL;
-		}
-		parent = &netdev->dev;
-	}
 	sdev = ib_alloc_device(siw_device, base_dev);
 	if (!sdev)
 		return NULL;
@@ -359,7 +343,6 @@  static struct siw_device *siw_device_create(struct net_device *netdev)
 	 * per physical port.
 	 */
 	base_dev->phys_port_cnt = 1;
-	base_dev->dev.parent = parent;
 	base_dev->dev.dma_parms = &sdev->dma_parms;
 	dma_set_max_seg_size(&base_dev->dev, UINT_MAX);
 	dma_set_coherent_mask(&base_dev->dev,
@@ -405,7 +388,7 @@  static struct siw_device *siw_device_create(struct net_device *netdev)
 	atomic_set(&sdev->num_mr, 0);
 	atomic_set(&sdev->num_pd, 0);
 
-	sdev->numa_node = dev_to_node(parent);
+	sdev->numa_node = dev_to_node(&netdev->dev);
 	spin_lock_init(&sdev->lock);
 
 	return sdev;