Message ID | 20231017053738.226069-1-lizhijian@fujitsu.com (mailing list archive) |
---|---|
State | Not Applicable |
Headers | show |
Series | [rdma-core] libibverbs/man/ibv_reg_mr.3: Document errno on failure | expand |
Li Zhijian <lizhijian@fujitsu.com> writes: > 'errno' is being widely used by applications when ibv_reg_mr returns NULL. > They all believe errno indicates the error on failure, so let's document > it explicitly. Similar issue with ibv_open_device() . Possibly more. > Reported-by: Markus Armbruster <armbru@redhat.com> > Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> > --- > libibverbs/man/ibv_reg_mr.3 | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/libibverbs/man/ibv_reg_mr.3 b/libibverbs/man/ibv_reg_mr.3 > index 8f323891..d43799c5 100644 > --- a/libibverbs/man/ibv_reg_mr.3 > +++ b/libibverbs/man/ibv_reg_mr.3 > @@ -103,7 +103,7 @@ deregisters the MR > .I mr\fR. > .SH "RETURN VALUE" > .B ibv_reg_mr() / ibv_reg_mr_iova() / ibv_reg_dmabuf_mr() > -returns a pointer to the registered MR, or NULL if the request fails. > +returns a pointer to the registered MR, or NULL if the request fails (and sets errno to indicate the failure reason). > The local key (\fBL_Key\fR) field > .B lkey > is used as the lkey field of struct ibv_sge when posting buffers with We should double-check errno is set on all failures. I doubt it is. ibv_reg_mr() is a macro: #define ibv_reg_mr(pd, addr, length, access) \ __ibv_reg_mr(pd, addr, length, access, \ __builtin_constant_p( \ ((int)(access) & IBV_ACCESS_OPTIONAL_RANGE) == 0)) __ibv_reg_mr() may call ibv_reg_mr_iova2(): __attribute__((__always_inline__)) static inline struct ibv_mr * __ibv_reg_mr(struct ibv_pd *pd, void *addr, size_t length, unsigned int access, int is_access_const) { if (is_access_const && (access & IBV_ACCESS_OPTIONAL_RANGE) == 0) return ibv_reg_mr(pd, addr, length, (int)access); else return ibv_reg_mr_iova2(pd, addr, length, (uintptr_t)addr, access); } ibv_reg_mr_iova2() doesn't seem to set errno at --->: struct ibv_mr *ibv_reg_mr_iova2(struct ibv_pd *pd, void *addr, size_t length, uint64_t iova, unsigned int access) { struct verbs_device *device = verbs_get_device(pd->context->device); bool odp_mr = access & IBV_ACCESS_ON_DEMAND; struct ibv_mr *mr; if (!(device->core_support & IB_UVERBS_CORE_SUPPORT_OPTIONAL_MR_ACCESS)) access &= ~IBV_ACCESS_OPTIONAL_RANGE; if (!odp_mr && ibv_dontfork_range(addr, length)) ---> return NULL; mr = get_ops(pd->context)->reg_mr(pd, addr, length, iova, access); if (mr) { mr->context = pd->context; mr->pd = pd; mr->addr = addr; mr->length = length; } else { if (!odp_mr) ibv_dofork_range(addr, length); } return mr; } Thanks!
On 17/10/2023 16:01, Markus Armbruster wrote: > Li Zhijian <lizhijian@fujitsu.com> writes: > >> 'errno' is being widely used by applications when ibv_reg_mr returns NULL. >> They all believe errno indicates the error on failure, so let's document >> it explicitly. > > Similar issue with ibv_open_device() . Possibly more. You are right, ibv_open_device()'s call chains are more complicated, I have not figured out if it ought to set errno though QEMU relies on it. > >> Reported-by: Markus Armbruster <armbru@redhat.com> >> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> >> --- >> libibverbs/man/ibv_reg_mr.3 | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/libibverbs/man/ibv_reg_mr.3 b/libibverbs/man/ibv_reg_mr.3 >> index 8f323891..d43799c5 100644 >> --- a/libibverbs/man/ibv_reg_mr.3 >> +++ b/libibverbs/man/ibv_reg_mr.3 >> @@ -103,7 +103,7 @@ deregisters the MR >> .I mr\fR. >> .SH "RETURN VALUE" >> .B ibv_reg_mr() / ibv_reg_mr_iova() / ibv_reg_dmabuf_mr() >> -returns a pointer to the registered MR, or NULL if the request fails. >> +returns a pointer to the registered MR, or NULL if the request fails (and sets errno to indicate the failure reason). >> The local key (\fBL_Key\fR) field >> .B lkey >> is used as the lkey field of struct ibv_sge when posting buffers with > > We should double-check errno is set on all failures. I doubt it is. > > ibv_reg_mr() is a macro: > > #define ibv_reg_mr(pd, addr, length, access) \ > __ibv_reg_mr(pd, addr, length, access, \ > __builtin_constant_p( \ > ((int)(access) & IBV_ACCESS_OPTIONAL_RANGE) == 0)) > > __ibv_reg_mr() may call ibv_reg_mr_iova2(): > > __attribute__((__always_inline__)) static inline struct ibv_mr * > __ibv_reg_mr(struct ibv_pd *pd, void *addr, size_t length, unsigned int access, > int is_access_const) > { > if (is_access_const && (access & IBV_ACCESS_OPTIONAL_RANGE) == 0) > return ibv_reg_mr(pd, addr, length, (int)access); > else > return ibv_reg_mr_iova2(pd, addr, length, (uintptr_t)addr, > access); > } > > ibv_reg_mr_iova2() doesn't seem to set errno at --->: > > struct ibv_mr *ibv_reg_mr_iova2(struct ibv_pd *pd, void *addr, size_t length, > uint64_t iova, unsigned int access) > { > struct verbs_device *device = verbs_get_device(pd->context->device); > bool odp_mr = access & IBV_ACCESS_ON_DEMAND; > struct ibv_mr *mr; > > if (!(device->core_support & IB_UVERBS_CORE_SUPPORT_OPTIONAL_MR_ACCESS)) > access &= ~IBV_ACCESS_OPTIONAL_RANGE; > > if (!odp_mr && ibv_dontfork_range(addr, length)) > ---> return NULL; It seems that ibv_dontfork_range() are missing to set errno when - get_start_node() fails // ENOMEN only - do_madvise() fails and 'rolling_back = 1', since this condition will 'goto again' we may need to save this errno before 'goto again', and assign it to errno before return. if (ret) { node = undo_node(node, start, inc); if (rolling_back || !node) goto out; /* madvise failed, roll back previous changes */ rolling_back = 1; advice = advice == MADV_DONTFORK ? MADV_DOFORK : MADV_DONTFORK; end = node->end; goto again; } Thanks > > mr = get_ops(pd->context)->reg_mr(pd, addr, length, iova, access); > if (mr) { > mr->context = pd->context; > mr->pd = pd; > mr->addr = addr; > mr->length = length; > } else { > if (!odp_mr) > ibv_dofork_range(addr, length); > } > > return mr; > } > > > Thanks! >
"Zhijian Li (Fujitsu)" <lizhijian@fujitsu.com> writes: > On 17/10/2023 16:01, Markus Armbruster wrote: >> Li Zhijian <lizhijian@fujitsu.com> writes: >> >>> 'errno' is being widely used by applications when ibv_reg_mr returns NULL. >>> They all believe errno indicates the error on failure, so let's document >>> it explicitly. >> >> Similar issue with ibv_open_device() . Possibly more. > > You are right, ibv_open_device()'s call chains are more complicated, > I have not figured out if it ought to set errno though QEMU relies on it. I think a question to answer is for what purposes callers need errno. The only callers I know are in QEMU. There are three: * qemu_rdma_reg_whole_ram_blocks() and qemu_rdma_register_and_get_keys() When ibv_reg_mr() fails, maybe try again with IBV_ACCESS_ON_DEMAND added to the protection attributes. "Maybe": if errno is ENOTSUP, and ibv_query_device_ex() reports IBV_ODP_SUPPORT. * qemu_rdma_broken_ipv6_kernel() This function appears to probe the devices returned by ibv_get_device_list(). For each device in the list, in order: try to ibv_open_device(). If it fails: ignore the device if errno is EPERM, else return failure. I'm not familiar with RDMA, and I can't say whether any of this makes sense. If it doesn't, we need to talk about what problem the QEMU code is trying to solve, and how to solve it properly. If it does, we have legitimate uses of errno, and we need to talk how to make errno usable safely, or else how to replace its use in QEMU.
On 18/10/2023 12:45, Markus Armbruster wrote: > "Zhijian Li (Fujitsu)" <lizhijian@fujitsu.com> writes: > >> On 17/10/2023 16:01, Markus Armbruster wrote: >>> Li Zhijian <lizhijian@fujitsu.com> writes: >>> >>>> 'errno' is being widely used by applications when ibv_reg_mr returns NULL. >>>> They all believe errno indicates the error on failure, so let's document >>>> it explicitly. >>> >>> Similar issue with ibv_open_device() . Possibly more. >> >> You are right, ibv_open_device()'s call chains are more complicated, >> I have not figured out if it ought to set errno though QEMU relies on it. > > I think a question to answer is for what purposes callers need errno. > > The only callers I know are in QEMU. There are three: > > * qemu_rdma_reg_whole_ram_blocks() and qemu_rdma_register_and_get_keys() > > When ibv_reg_mr() fails, maybe try again with IBV_ACCESS_ON_DEMAND > added to the protection attributes. > > "Maybe": if errno is ENOTSUP, and ibv_query_device_ex() reports > IBV_ODP_SUPPORT. librpma[1] is another project that registers ODP MR like this. https://github.com/pmem/rpma/blob/f52c00d18821ac573a71e9f23a6d2e8695086e95/src/peer.c#L277 ibv_reg_mr() will evolve to kernel via ioctl() generally, the when the libc wrapper will set the errno. > > * qemu_rdma_broken_ipv6_kernel() > > This function appears to probe the devices returned by > ibv_get_device_list(). > > For each device in the list, in order: try to ibv_open_device(). If > it fails: ignore the device if errno is EPERM, else return failure. DPDK read the errno after calling ibv_open_device()[2] and ibv_get_device_list()[3] [2] https://github.com/DPDK/dpdk/blob/5f9426b0618b7c2899f4d1444768f62739da1bce/drivers/net/mlx4/mlx4.c#L829 [3] https://github.com/DPDK/dpdk/blob/5f9426b0618b7c2899f4d1444768f62739da1bce/drivers/net/mlx4/mlx4.c#L802 I also think these APIs' intention are going to use the errno to indicate error reason, but they haven't been done yet? > > I'm not familiar with RDMA, and I can't say whether any of this makes > sense. > > If it doesn't, we need to talk about what problem the QEMU code is > trying to solve, and how to solve it properly. > > If it does, we have legitimate uses of errno, and we need to talk how to > make errno usable safely, or else how to replace its use in QEMU. >
diff --git a/libibverbs/man/ibv_reg_mr.3 b/libibverbs/man/ibv_reg_mr.3 index 8f323891..d43799c5 100644 --- a/libibverbs/man/ibv_reg_mr.3 +++ b/libibverbs/man/ibv_reg_mr.3 @@ -103,7 +103,7 @@ deregisters the MR .I mr\fR. .SH "RETURN VALUE" .B ibv_reg_mr() / ibv_reg_mr_iova() / ibv_reg_dmabuf_mr() -returns a pointer to the registered MR, or NULL if the request fails. +returns a pointer to the registered MR, or NULL if the request fails (and sets errno to indicate the failure reason). The local key (\fBL_Key\fR) field .B lkey is used as the lkey field of struct ibv_sge when posting buffers with
'errno' is being widely used by applications when ibv_reg_mr returns NULL. They all believe errno indicates the error on failure, so let's document it explicitly. Reported-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> --- libibverbs/man/ibv_reg_mr.3 | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)