Message ID | 20201029212545.6616-1-rpearson@hpe.com (mailing list archive) |
---|---|
State | Superseded |
Series | [for-next] RDMA/rxe: fix regression caused by recent patch |
Hi Bob,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on rdma/for-next]
[also build test WARNING on v5.10-rc1 next-20201029]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Bob-Pearson/RDMA-rxe-fix-regression-caused-by-recent-patch/20201030-052848
base:   https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git for-next
config: powerpc-allyesconfig (attached as .config)
compiler: powerpc64-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/880fe509bd2bdc73c885fd887cb3935000855d49
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Bob-Pearson/RDMA-rxe-fix-regression-caused-by-recent-patch/20201030-052848
        git checkout 880fe509bd2bdc73c885fd887cb3935000855d49
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=powerpc

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   drivers/infiniband/sw/rxe/rxe_verbs.c: In function 'rxe_register_device':
>> drivers/infiniband/sw/rxe/rxe_verbs.c:1143:20: warning: assignment to 'u64 *' {aka 'long long unsigned int *'} from 'long long unsigned int' makes pointer from integer without a cast [-Wint-conversion]
    1143 |  dev->dev.dma_mask = DMA_BIT_MASK(64);
         |                    ^

vim +1143 drivers/infiniband/sw/rxe/rxe_verbs.c

  1125
  1126	int rxe_register_device(struct rxe_dev *rxe, const char *ibdev_name)
  1127	{
  1128		int err;
  1129		struct ib_device *dev = &rxe->ib_dev;
  1130		struct crypto_shash *tfm;
  1131
  1132		strlcpy(dev->node_desc, "rxe", sizeof(dev->node_desc));
  1133
  1134		dev->node_type = RDMA_NODE_IB_CA;
  1135		dev->phys_port_cnt = 1;
  1136		dev->num_comp_vectors = num_possible_cpus();
  1137
  1138		/* rdma_rxe never does real DMA but does rely on
  1139		 * pinning user memory in MRs to avoid page faults
  1140		 * in responder and completer tasklets
  1141		 */
  1142		dev->dev.parent = rxe_dma_device(rxe);
> 1143		dev->dev.dma_mask = DMA_BIT_MASK(64);

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
On Fri, Oct 30, 2020 at 5:27 AM Bob Pearson <rpearsonhpe@gmail.com> wrote:
>
> The commit referenced below performs additional checking on
> devices used for DMA. Specifically it checks that
>
> device->dma_mask != NULL
>
> Rdma_rxe uses this device when pinning MR memory but did not
> set the value of dma_mask. In fact rdma_rxe does not perform
> any DMA operations so the value is never used but is checked.
>
> This patch gives dma_mask a valid value. Without this patch
> rdma_rxe does not function at all.
>
> Fixes: f959dcd6ddfd2 ("dma-direct: Fix potential NULL pointer dereference")
> Signed-off-by: Bob Pearson <rpearson@hpe.com>

Thanks a lot.

Zhu Yanjun

> ---
>  drivers/infiniband/sw/rxe/rxe_verbs.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c
> index 7652d53af2c1..116a234e92db 100644
> --- a/drivers/infiniband/sw/rxe/rxe_verbs.c
> +++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
> @@ -1134,8 +1134,15 @@ int rxe_register_device(struct rxe_dev *rxe, const char *ibdev_name)
>  	dev->node_type = RDMA_NODE_IB_CA;
>  	dev->phys_port_cnt = 1;
>  	dev->num_comp_vectors = num_possible_cpus();
> +
> +	/* rdma_rxe never does real DMA but does rely on
> +	 * pinning user memory in MRs to avoid page faults
> +	 * in responder and completer tasklets
> +	 */
>  	dev->dev.parent = rxe_dma_device(rxe);
> +	dev->dev.dma_mask = DMA_BIT_MASK(64);
>  	dev->local_dma_lkey = 0;
> +
>  	addrconf_addr_eui48((unsigned char *)&dev->node_guid,
>  			    rxe->ndev->dev_addr);
>  	dev->dev.dma_parms = &rxe->dma_parms;
> --
> 2.27.0
>
On 10/29/20 4:25 PM, Bob Pearson wrote:
> The commit referenced below performs additional checking on
> devices used for DMA. Specifically it checks that
>
> device->dma_mask != NULL
>
> Rdma_rxe uses this device when pinning MR memory but did not
> set the value of dma_mask. In fact rdma_rxe does not perform
> any DMA operations so the value is never used but is checked.
>
> This patch gives dma_mask a valid value. Without this patch
> rdma_rxe does not function at all.
>
> Fixes: f959dcd6ddfd2 ("dma-direct: Fix potential NULL pointer dereference")
> Signed-off-by: Bob Pearson <rpearson@hpe.com>
> ---
>  drivers/infiniband/sw/rxe/rxe_verbs.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c
> index 7652d53af2c1..116a234e92db 100644
> --- a/drivers/infiniband/sw/rxe/rxe_verbs.c
> +++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
> @@ -1134,8 +1134,15 @@ int rxe_register_device(struct rxe_dev *rxe, const char *ibdev_name)
>  	dev->node_type = RDMA_NODE_IB_CA;
>  	dev->phys_port_cnt = 1;
>  	dev->num_comp_vectors = num_possible_cpus();
> +
> +	/* rdma_rxe never does real DMA but does rely on
> +	 * pinning user memory in MRs to avoid page faults
> +	 * in responder and completer tasklets
> +	 */
>  	dev->dev.parent = rxe_dma_device(rxe);
> +	dev->dev.dma_mask = DMA_BIT_MASK(64);
>  	dev->local_dma_lkey = 0;
> +
>  	addrconf_addr_eui48((unsigned char *)&dev->node_guid,
>  			    rxe->ndev->dev_addr);
>  	dev->dev.dma_parms = &rxe->dma_parms;
>

Ignore this patch. It turns out it works because any nonzero number in dma_mask
will stop the check that is failing and since rxe never uses DMA it won't affect
anything. But, it doesn't compile cleanly because the dma_mask is a pointer to
the actual dma_mask and not the mask. Somehow I missed the warning. I have a
newer version that uses the function dma_coerce_mask_and_coherent() and also works.
(Works means it gets to the next problem as mentioned in the previous note.)

Bob
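
To make the pointer-versus-value point above concrete, here is a minimal userspace sketch in C. It is not kernel code: struct mock_device, mock_dma_mask_is_set() and mock_coerce_mask_and_coherent() are hypothetical stand-ins invented for illustration, and the last one only approximates what dma_coerce_mask_and_coherent() is understood to do (point dma_mask at the device's own coherent mask, then store the value). Only DMA_BIT_MASK() is meant to match the kernel macro.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t u64;

/* Meant to match the kernel's DMA_BIT_MASK() macro. */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Hypothetical stand-in for the two DMA mask fields of struct device. */
struct mock_device {
	u64 *dma_mask;         /* a pointer to the mask, not the mask itself */
	u64 coherent_dma_mask; /* storage that dma_mask can point at */
};

/*
 * Mirrors the check quoted in the commit message
 * (device->dma_mask != NULL); returns 1 if it would pass.
 */
static int mock_dma_mask_is_set(const struct mock_device *dev)
{
	return dev->dma_mask != NULL;
}

/*
 * Assumed approximation of dma_coerce_mask_and_coherent():
 * give dma_mask valid storage first, then write the mask through it.
 * Writing "dev->dma_mask = DMA_BIT_MASK(64)" instead would store an
 * integer in a pointer, which is what -Wint-conversion warns about.
 */
static int mock_coerce_mask_and_coherent(struct mock_device *dev, u64 mask)
{
	dev->dma_mask = &dev->coherent_dma_mask;
	*dev->dma_mask = mask;
	dev->coherent_dma_mask = mask;
	return 0;
}
```

With this shape the NULL check passes because dma_mask points at real storage, rather than because an arbitrary nonzero integer happens to sit in the pointer field.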
diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c
index 7652d53af2c1..116a234e92db 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.c
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
@@ -1134,8 +1134,15 @@ int rxe_register_device(struct rxe_dev *rxe, const char *ibdev_name)
 	dev->node_type = RDMA_NODE_IB_CA;
 	dev->phys_port_cnt = 1;
 	dev->num_comp_vectors = num_possible_cpus();
+
+	/* rdma_rxe never does real DMA but does rely on
+	 * pinning user memory in MRs to avoid page faults
+	 * in responder and completer tasklets
+	 */
 	dev->dev.parent = rxe_dma_device(rxe);
+	dev->dev.dma_mask = DMA_BIT_MASK(64);
 	dev->local_dma_lkey = 0;
+
 	addrconf_addr_eui48((unsigned char *)&dev->node_guid,
 			    rxe->ndev->dev_addr);
 	dev->dev.dma_parms = &rxe->dma_parms;
The commit referenced below performs additional checking on
devices used for DMA. Specifically it checks that

device->dma_mask != NULL

Rdma_rxe uses this device when pinning MR memory but did not
set the value of dma_mask. In fact rdma_rxe does not perform
any DMA operations so the value is never used but is checked.

This patch gives dma_mask a valid value. Without this patch
rdma_rxe does not function at all.

Fixes: f959dcd6ddfd2 ("dma-direct: Fix potential NULL pointer dereference")
Signed-off-by: Bob Pearson <rpearson@hpe.com>
---
 drivers/infiniband/sw/rxe/rxe_verbs.c | 7 +++++++
 1 file changed, 7 insertions(+)