Message ID | 20240912-net-next-fix-get_netdev_rx_queue_index-v1-1-d73a1436be8c@kernel.org (mailing list archive) |
---|---|
State | Accepted, archived |
Commit | 01787be49b25bdc3d64d491a194147f6a5ea46a9 |
Delegated to: | Matthieu Baerts |
Headers | show |
Series | [net-next] memory-provider: fix compilation issue without SYSFS | expand |
Context | Check | Description |
---|---|---|
matttbe/build | success | Build and static analysis OK |
matttbe/checkpatch | warning | total: 0 errors, 1 warnings, 0 checks, 16 lines checked |
matttbe/shellcheck | success | MPTCP selftests files have not been modified |
matttbe/KVM_Validation__normal | success | Success! ✅ |
matttbe/KVM_Validation__debug | success | Success! ✅ |
matttbe/KVM_Validation__btf__only_bpftest_all_ | success | Success! ✅ |
Hi Matthieu, Thank you for your modifications, that's great! Our CI did some validations and here is its report: - KVM Validation: normal: Success! ✅ - KVM Validation: debug: Success! ✅ - KVM Validation: btf (only bpftest_all): Success! ✅ - Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/10829149648 Initiator: Patchew Applier Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/b906440906a3 Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=889709 If there are some issues, you can reproduce them using the same environment as the one used by the CI thanks to a docker image, e.g.: $ cd [kernel source code] $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \ --pull always mptcp/mptcp-upstream-virtme-docker:latest \ auto-normal For more details: https://github.com/multipath-tcp/mptcp-upstream-virtme-docker Please note that despite all the efforts that have been already done to have a stable tests suite when executed on a public CI like here, it is possible some reported issues are not due to your modifications. Still, do not hesitate to help us improve that ;-) Cheers, MPTCP GH Action bot Bot operated by Matthieu Baerts (NGI0 Core)
On Thu, Sep 12, 2024 at 3:25 AM Matthieu Baerts (NGI0) <matttbe@kernel.org> wrote: > > When CONFIG_SYSFS is not set, the kernel fails to compile: > > net/core/page_pool_user.c:368:45: error: implicit declaration of function 'get_netdev_rx_queue_index' [-Werror=implicit-function-declaration] > 368 | if (pool->slow.queue_idx == get_netdev_rx_queue_index(rxq)) { > | ^~~~~~~~~~~~~~~~~~~~~~~~~ > > When CONFIG_SYSFS is not set, get_netdev_rx_queue_index() is not defined > as well. In this case, page_pool_check_memory_provider() cannot check > the memory provider, and a success answer can be returned instead. > Thanks Matthieu, and sorry about that. I have reproduced the build error and the fix resolves it. But... > Fixes: 0f9214046893 ("memory-provider: dmabuf devmem memory provider") > Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> > --- > net/core/page_pool_user.c | 4 ++++ > 1 file changed, 4 insertions(+) > > diff --git a/net/core/page_pool_user.c b/net/core/page_pool_user.c > index 48335766c1bf..a98c0a76b33f 100644 > --- a/net/core/page_pool_user.c > +++ b/net/core/page_pool_user.c > @@ -353,6 +353,7 @@ void page_pool_unlist(struct page_pool *pool) > int page_pool_check_memory_provider(struct net_device *dev, > struct netdev_rx_queue *rxq) > { > +#ifdef CONFIG_SYSFS > struct net_devmem_dmabuf_binding *binding = rxq->mp_params.mp_priv; > struct page_pool *pool; > struct hlist_node *n; > @@ -372,6 +373,9 @@ int page_pool_check_memory_provider(struct net_device *dev, > } > mutex_unlock(&page_pools_lock); > return -ENODATA; > +#else > + return 0; ...we can't assume success when we cannot check the memory provider. The memory provider check is somewhat critical; we rely on it to detect that the driver does not support memory providers or is not doing the right thing, and report that to the user. I don't think we can silently disable the check when the CONFIG_SYSFS is disabled. Please return -ENODATA or some other error here. If we disable devmem TCP for !CONFIG_SYSFS we should probably add something to the docs saying this. I can do that in a follow up change. However, I'm looking at the definition of get_netdev_rx_queue_index() and at first glance I don't see anything there that is actually dependent on CONFIG_SYSFS. Can we do this instead? I have build-tested it and it resolves the build issue as well: ``` diff --git a/include/net/netdev_rx_queue.h b/include/net/netdev_rx_queue.h index ac34f5fb4f71..596836abf7bf 100644 --- a/include/net/netdev_rx_queue.h +++ b/include/net/netdev_rx_queue.h @@ -45,7 +45,6 @@ __netif_get_rx_queue(struct net_device *dev, unsigned int rxq) return dev->_rx + rxq; } -#ifdef CONFIG_SYSFS static inline unsigned int get_netdev_rx_queue_index(struct netdev_rx_queue *queue) { @@ -55,7 +54,6 @@ get_netdev_rx_queue_index(struct netdev_rx_queue *queue) BUG_ON(index >= dev->num_rx_queues); return index; } -#endif ``` Matthieu, I'm happy to follow up with v2 of this fix if you don't have time.
Hi Mina, Thank you for your reply! On 12/09/2024 14:49, Mina Almasry wrote: > On Thu, Sep 12, 2024 at 3:25 AM Matthieu Baerts (NGI0) > <matttbe@kernel.org> wrote: >> >> When CONFIG_SYSFS is not set, the kernel fails to compile: >> >> net/core/page_pool_user.c:368:45: error: implicit declaration of function 'get_netdev_rx_queue_index' [-Werror=implicit-function-declaration] >> 368 | if (pool->slow.queue_idx == get_netdev_rx_queue_index(rxq)) { >> | ^~~~~~~~~~~~~~~~~~~~~~~~~ >> >> When CONFIG_SYSFS is not set, get_netdev_rx_queue_index() is not defined >> as well. In this case, page_pool_check_memory_provider() cannot check >> the memory provider, and a success answer can be returned instead. >> > > Thanks Matthieu, and sorry about that. > > I have reproduced the build error and the fix resolves it. But... > >> Fixes: 0f9214046893 ("memory-provider: dmabuf devmem memory provider") >> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> >> --- >> net/core/page_pool_user.c | 4 ++++ >> 1 file changed, 4 insertions(+) >> >> diff --git a/net/core/page_pool_user.c b/net/core/page_pool_user.c >> index 48335766c1bf..a98c0a76b33f 100644 >> --- a/net/core/page_pool_user.c >> +++ b/net/core/page_pool_user.c >> @@ -353,6 +353,7 @@ void page_pool_unlist(struct page_pool *pool) >> int page_pool_check_memory_provider(struct net_device *dev, >> struct netdev_rx_queue *rxq) >> { >> +#ifdef CONFIG_SYSFS >> struct net_devmem_dmabuf_binding *binding = rxq->mp_params.mp_priv; >> struct page_pool *pool; >> struct hlist_node *n; >> @@ -372,6 +373,9 @@ int page_pool_check_memory_provider(struct net_device *dev, >> } >> mutex_unlock(&page_pools_lock); >> return -ENODATA; >> +#else >> + return 0; > > ...we can't assume success when we cannot check the memory provider. > The memory provider check is somewhat critical; we rely on it to > detect that the driver does not support memory providers or is not > doing the right thing, and report that to the user. I don't think we > can silently disable the check when the CONFIG_SYSFS is disabled. > Please return -ENODATA or some other error here. I initially returned 0 to have the same behaviour as when CONFIG_PAGE_POOL is not defined. But thanks to your explanations, I understand it seems better to return -ENODATA here. Or another errno, to let the userspace understanding there is a different error? I can send a v2 after the 24h rate-limit period if you are OK with that. > If we disable devmem TCP for !CONFIG_SYSFS we should probably add > something to the docs saying this. I can do that in a follow up > change. Good point, thank you. > However, I'm looking at the definition of get_netdev_rx_queue_index() > and at first glance I don't see anything there that is actually > dependent on CONFIG_SYSFS. Can we do this instead? I have build-tested > it and it resolves the build issue as well: > > ``` > diff --git a/include/net/netdev_rx_queue.h b/include/net/netdev_rx_queue.h > index ac34f5fb4f71..596836abf7bf 100644 > --- a/include/net/netdev_rx_queue.h > +++ b/include/net/netdev_rx_queue.h > @@ -45,7 +45,6 @@ __netif_get_rx_queue(struct net_device *dev, unsigned int rxq) > return dev->_rx + rxq; > } > > -#ifdef CONFIG_SYSFS > static inline unsigned int > get_netdev_rx_queue_index(struct netdev_rx_queue *queue) > { > @@ -55,7 +54,6 @@ get_netdev_rx_queue_index(struct netdev_rx_queue *queue) > BUG_ON(index >= dev->num_rx_queues); > return index; > } > -#endif > ``` I briefly looked at taking this path when I saw what this helper was doing, but then I saw all operations related to the received queues were enabled only when CONFIG_SYSFS is set, see commit a953be53ce40 ("net-sysfs: add support for device-specific rx queue sysfs attributes"). I understood from that it is better not to look at dev->_rx or dev->num_rx_queues when CONFIG_SYSFS is not set. I'm not very familiar to that part of the code, but it feels like removing this #ifdef might be similar to the "return 0" I suggested: silently disabling the check, no? I *think* it might be clearer to return an error when SYSFS is not set. > Matthieu, I'm happy to follow up with v2 of this fix if you don't have time. If you prefer to explore other ways than returning an error in page_pool_check_memory_provider() when SYSFS is not set, yes please do the follow-up if you don't mind. My main goal is to stop my CI to complain about that when compiling with 'make tinyconfig' + MPTCP :) Cheers, Matt
On Thu, Sep 12, 2024 at 8:26 AM Matthieu Baerts <matttbe@kernel.org> wrote: > > Hi Mina, > > Thank you for your reply! > > On 12/09/2024 14:49, Mina Almasry wrote: > > On Thu, Sep 12, 2024 at 3:25 AM Matthieu Baerts (NGI0) > > <matttbe@kernel.org> wrote: > >> > >> When CONFIG_SYSFS is not set, the kernel fails to compile: > >> > >> net/core/page_pool_user.c:368:45: error: implicit declaration of function 'get_netdev_rx_queue_index' [-Werror=implicit-function-declaration] > >> 368 | if (pool->slow.queue_idx == get_netdev_rx_queue_index(rxq)) { > >> | ^~~~~~~~~~~~~~~~~~~~~~~~~ > >> > >> When CONFIG_SYSFS is not set, get_netdev_rx_queue_index() is not defined > >> as well. In this case, page_pool_check_memory_provider() cannot check > >> the memory provider, and a success answer can be returned instead. > >> > > > > Thanks Matthieu, and sorry about that. > > > > I have reproduced the build error and the fix resolves it. But... > > > >> Fixes: 0f9214046893 ("memory-provider: dmabuf devmem memory provider") > >> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> > >> --- > >> net/core/page_pool_user.c | 4 ++++ > >> 1 file changed, 4 insertions(+) > >> > >> diff --git a/net/core/page_pool_user.c b/net/core/page_pool_user.c > >> index 48335766c1bf..a98c0a76b33f 100644 > >> --- a/net/core/page_pool_user.c > >> +++ b/net/core/page_pool_user.c > >> @@ -353,6 +353,7 @@ void page_pool_unlist(struct page_pool *pool) > >> int page_pool_check_memory_provider(struct net_device *dev, > >> struct netdev_rx_queue *rxq) > >> { > >> +#ifdef CONFIG_SYSFS > >> struct net_devmem_dmabuf_binding *binding = rxq->mp_params.mp_priv; > >> struct page_pool *pool; > >> struct hlist_node *n; > >> @@ -372,6 +373,9 @@ int page_pool_check_memory_provider(struct net_device *dev, > >> } > >> mutex_unlock(&page_pools_lock); > >> return -ENODATA; > >> +#else > >> + return 0; > > > > ...we can't assume success when we cannot check the memory provider. > > The memory provider check is somewhat critical; we rely on it to > > detect that the driver does not support memory providers or is not > > doing the right thing, and report that to the user. I don't think we > > can silently disable the check when the CONFIG_SYSFS is disabled. > > Please return -ENODATA or some other error here. > > I initially returned 0 to have the same behaviour as when > CONFIG_PAGE_POOL is not defined. But thanks to your explanations, I > understand it seems better to return -ENODATA here. Or another errno, to > let the userspace understanding there is a different error? I can send a > v2 after the 24h rate-limit period if you are OK with that. > Yes, -EOPNOTSUPP would be my preference here. I think it makes sense, we should not support memory-providers on configs where core can't verify that the driver did the right thing. [...] > > However, I'm looking at the definition of get_netdev_rx_queue_index() > > and at first glance I don't see anything there that is actually > > dependent on CONFIG_SYSFS. Can we do this instead? I have build-tested > > it and it resolves the build issue as well: > > > > ``` > > diff --git a/include/net/netdev_rx_queue.h b/include/net/netdev_rx_queue.h > > index ac34f5fb4f71..596836abf7bf 100644 > > --- a/include/net/netdev_rx_queue.h > > +++ b/include/net/netdev_rx_queue.h > > @@ -45,7 +45,6 @@ __netif_get_rx_queue(struct net_device *dev, unsigned int rxq) > > return dev->_rx + rxq; > > } > > > > -#ifdef CONFIG_SYSFS > > static inline unsigned int > > get_netdev_rx_queue_index(struct netdev_rx_queue *queue) > > { > > @@ -55,7 +54,6 @@ get_netdev_rx_queue_index(struct netdev_rx_queue *queue) > > BUG_ON(index >= dev->num_rx_queues); > > return index; > > } > > -#endif > > ``` > > I briefly looked at taking this path when I saw what this helper was > doing, but then I saw all operations related to the received queues were > enabled only when CONFIG_SYSFS is set, see commit a953be53ce40 > ("net-sysfs: add support for device-specific rx queue sysfs > attributes"). I understood from that it is better not to look at > dev->_rx or dev->num_rx_queues when CONFIG_SYSFS is not set. I'm not > very familiar to that part of the code, but it feels like removing this > #ifdef might be similar to the "return 0" I suggested: silently > disabling the check, no? > > I *think* it might be clearer to return an error when SYSFS is not set. > FWIW it looks like commit e817f85652c1 ("xdp: generic XDP handling of xdp_rxq_info") reverted almost all the CONFIG_SYSFS checks set by commit a953be53ce40 ("net-sysfs: add support for device-specific rx queue sysfs attributes"), at least from a quick look. But I understand your CI is probably very annoyed by the build failure. I would be happy to reviewed-by a patch with just the change to the error return value, and I can look into making this work with CONFIG_SYSFS after the merge window.
On Thu, 12 Sep 2024 11:21:23 -0700 Mina Almasry wrote: > > I briefly looked at taking this path when I saw what this helper was > > doing, but then I saw all operations related to the received queues were > > enabled only when CONFIG_SYSFS is set, see commit a953be53ce40 > > ("net-sysfs: add support for device-specific rx queue sysfs > > attributes"). I understood from that it is better not to look at > > dev->_rx or dev->num_rx_queues when CONFIG_SYSFS is not set. I'm not > > very familiar to that part of the code, but it feels like removing this > > #ifdef might be similar to the "return 0" I suggested: silently > > disabling the check, no? > > > > I *think* it might be clearer to return an error when SYSFS is not set. > > > > FWIW it looks like commit e817f85652c1 ("xdp: generic XDP handling of > xdp_rxq_info") reverted almost all the CONFIG_SYSFS checks set by > commit a953be53ce40 ("net-sysfs: add support for device-specific rx > queue sysfs attributes"), at least from a quick look. That's right, just delete the ifdef. I should have done that when I moved the helper. Please send the fix ASAP.
diff --git a/net/core/page_pool_user.c b/net/core/page_pool_user.c index 48335766c1bf..a98c0a76b33f 100644 --- a/net/core/page_pool_user.c +++ b/net/core/page_pool_user.c @@ -353,6 +353,7 @@ void page_pool_unlist(struct page_pool *pool) int page_pool_check_memory_provider(struct net_device *dev, struct netdev_rx_queue *rxq) { +#ifdef CONFIG_SYSFS struct net_devmem_dmabuf_binding *binding = rxq->mp_params.mp_priv; struct page_pool *pool; struct hlist_node *n; @@ -372,6 +373,9 @@ int page_pool_check_memory_provider(struct net_device *dev, } mutex_unlock(&page_pools_lock); return -ENODATA; +#else + return 0; +#endif } static void page_pool_unreg_netdev_wipe(struct net_device *netdev)
When CONFIG_SYSFS is not set, the kernel fails to compile: net/core/page_pool_user.c:368:45: error: implicit declaration of function 'get_netdev_rx_queue_index' [-Werror=implicit-function-declaration] 368 | if (pool->slow.queue_idx == get_netdev_rx_queue_index(rxq)) { | ^~~~~~~~~~~~~~~~~~~~~~~~~ When CONFIG_SYSFS is not set, get_netdev_rx_queue_index() is not defined as well. In this case, page_pool_check_memory_provider() cannot check the memory provider, and a success answer can be returned instead. Fixes: 0f9214046893 ("memory-provider: dmabuf devmem memory provider") Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> --- net/core/page_pool_user.c | 4 ++++ 1 file changed, 4 insertions(+) --- base-commit: 3cfb5aa10cb78571e214e48a3a6e42c11d5288a1 change-id: 20240912-net-next-fix-get_netdev_rx_queue_index-a1034d1b962a Best regards,