Message ID | f54ee9f30898b998edf8f07dabccc84efaa2ab8b.1644323503.git.alibuda@linux.alibaba.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | net/smc: Optimizing performance in short-lived scenarios | expand |
On Tue, Feb 08, 2022 at 08:53:13PM +0800, D. Wythe wrote: > From: "D. Wythe" <alibuda@linux.alibaba.com> > > @@ -248,6 +248,8 @@ int smc_nl_get_sys_info(struct sk_buff *skb, struct netlink_callback *cb) > goto errattr; > if (nla_put_u8(skb, SMC_NLA_SYS_IS_SMCR_V2, true)) > goto errattr; > + if (nla_put_u8(skb, SMC_NLA_SYS_AUTO_FALLBACK, smc_auto_fallback)) READ_ONCE(smc_auto_fallback) ? > + goto errattr; > smc_clc_get_hostname(&host); > if (host) { > memcpy(hostname, host, SMC_MAX_HOSTNAME_LEN); > diff --git a/net/smc/smc_netlink.c b/net/smc/smc_netlink.c > index f13ab06..a7de517 100644 > --- a/net/smc/smc_netlink.c > +++ b/net/smc/smc_netlink.c > @@ -111,6 +111,16 @@ > .flags = GENL_ADMIN_PERM, > .doit = smc_nl_disable_seid, > }, > + { > + .cmd = SMC_NETLINK_ENABLE_AUTO_FALLBACK, > + .flags = GENL_ADMIN_PERM, > + .doit = smc_enable_auto_fallback, > + }, > + { > + .cmd = SMC_NETLINK_DISABLE_AUTO_FALLBACK, > + .flags = GENL_ADMIN_PERM, > + .doit = smc_disable_auto_fallback, > + }, > }; Consider adding the separated cmd to query the status of this config, just as SEID does? It is common to check this value after user-space setted. Combined with sys_info maybe a little heavy in this scene. Thanks, Tony Lu
On Tue, Feb 08, 2022 at 08:53:13PM +0800, D. Wythe wrote: > From: "D. Wythe" <alibuda@linux.alibaba.com> > > Although we can control SMC auto fallback through socket options, which > means that applications who need it must modify their code. It's quite > troublesome for many existing applications. This patch modifies the > global default value of auto fallback through netlink, providing a way > to auto fallback without modifying any code for applications. > > Suggested-by: Tony Lu <tonylu@linux.alibaba.com> > Signed-off-by: D. Wythe <alibuda@linux.alibaba.com> > --- > include/uapi/linux/smc.h | 3 +++ > net/smc/af_smc.c | 17 +++++++++++++++++ > net/smc/smc.h | 7 +++++++ > net/smc/smc_core.c | 2 ++ > net/smc/smc_netlink.c | 10 ++++++++++ > 5 files changed, 39 insertions(+) > > diff --git a/include/uapi/linux/smc.h b/include/uapi/linux/smc.h > index 9f2cbf8..33f7fb8 100644 > --- a/include/uapi/linux/smc.h > +++ b/include/uapi/linux/smc.h > @@ -59,6 +59,8 @@ enum { > SMC_NETLINK_DUMP_SEID, > SMC_NETLINK_ENABLE_SEID, > SMC_NETLINK_DISABLE_SEID, > + SMC_NETLINK_ENABLE_AUTO_FALLBACK, > + SMC_NETLINK_DISABLE_AUTO_FALLBACK, > }; > > /* SMC_GENL_FAMILY top level attributes */ > @@ -85,6 +87,7 @@ enum { > SMC_NLA_SYS_LOCAL_HOST, /* string */ > SMC_NLA_SYS_SEID, /* string */ > SMC_NLA_SYS_IS_SMCR_V2, /* u8 */ > + SMC_NLA_SYS_AUTO_FALLBACK, /* u8 */ > __SMC_NLA_SYS_MAX, > SMC_NLA_SYS_MAX = __SMC_NLA_SYS_MAX - 1 > }; > diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c > index c313561..4a25ce7 100644 > --- a/net/smc/af_smc.c > +++ b/net/smc/af_smc.c > @@ -59,6 +59,8 @@ > * creation on client > */ > > +bool smc_auto_fallback; /* default behavior for auto fallback, disable by default */ SMC supports net namespace, it would be better to provide a per net-namespace switch. Generally, one container has one application, runs different workload that is different from others. So the behavior could be different, such as high-performance (don't fallback for limit) or TCP transparent replacement (limit if needed). Thank you, Tony Lu
I don't think this is necessary, since we already have socket options. Is there any scenario that the socket options and global switch can not cover? Thanks. 在 2022/2/9 下午5:33, Tony Lu 写道: > SMC supports net namespace, it would be better to provide a per > net-namespace switch. Generally, one container has one application, runs > different workload that is different from others. So the behavior could > be different, such as high-performance (don't fallback for limit) or TCP > transparent replacement (limit if needed).
在 2022/2/9 下午5:16, Tony Lu 写道: > On Tue, Feb 08, 2022 at 08:53:13PM +0800, D. Wythe wrote: >> From: "D. Wythe" <alibuda@linux.alibaba.com> >> >> @@ -248,6 +248,8 @@ int smc_nl_get_sys_info(struct sk_buff *skb, struct netlink_callback *cb) >> goto errattr; >> if (nla_put_u8(skb, SMC_NLA_SYS_IS_SMCR_V2, true)) >> goto errattr; >> + if (nla_put_u8(skb, SMC_NLA_SYS_AUTO_FALLBACK, smc_auto_fallback)) > > READ_ONCE(smc_auto_fallback) ? No READ_ONCE() will cause ? >> + goto errattr; >> smc_clc_get_hostname(&host); >> if (host) { >> memcpy(hostname, host, SMC_MAX_HOSTNAME_LEN); >> diff --git a/net/smc/smc_netlink.c b/net/smc/smc_netlink.c >> index f13ab06..a7de517 100644 >> --- a/net/smc/smc_netlink.c >> +++ b/net/smc/smc_netlink.c >> @@ -111,6 +111,16 @@ >> .flags = GENL_ADMIN_PERM, >> .doit = smc_nl_disable_seid, >> }, >> + { >> + .cmd = SMC_NETLINK_ENABLE_AUTO_FALLBACK, >> + .flags = GENL_ADMIN_PERM, >> + .doit = smc_enable_auto_fallback, >> + }, >> + { >> + .cmd = SMC_NETLINK_DISABLE_AUTO_FALLBACK, >> + .flags = GENL_ADMIN_PERM, >> + .doit = smc_disable_auto_fallback, >> + }, >> }; > > Consider adding the separated cmd to query the status of this config, > just as SEID does? > > It is common to check this value after user-space setted. Combined with > sys_info maybe a little heavy in this scene. Add a independent dumpit is quite okay, but is there have really scenarios that access this value frequently? With more and more such switches in the future, is is necessary for us to repeat on each switch ? I do have a plan to put them unified within a NLA attributes, but I don't have much time yet.
On Wed, Feb 09, 2022 at 05:41:50PM +0800, D. Wythe wrote: > I don't think this is necessary, since we already have socket options. Is > there any scenario that the socket options and global switch can not cover? > When transparently replacing the whole container's TCP connections, we cannot touch the user's application, and have to replace their connections to SMC. It is common for container environment, different containers will run different applications. Most of TCP knob is per net-namespace, it could be better for us to do it from the beginning. Thanks, Tony Lu
Copy that, there are indeed some problems for the container environment. I'll try work on it. Thanks. 在 2022/2/9 下午5:54, Tony Lu 写道: > On Wed, Feb 09, 2022 at 05:41:50PM +0800, D. Wythe wrote: >> I don't think this is necessary, since we already have socket options. Is >> there any scenario that the socket options and global switch can not cover? >> > When transparently replacing the whole container's TCP connections, we > cannot touch the user's application, and have to replace their > connections to SMC. It is common for container environment, different > containers will run different applications. > > Most of TCP knob is per net-namespace, it could be better for us to do > it from the beginning. > > Thanks, > Tony Lu
On Wed, Feb 09, 2022 at 05:53:18PM +0800, D. Wythe wrote: > > > 在 2022/2/9 下午5:16, Tony Lu 写道: > > On Tue, Feb 08, 2022 at 08:53:13PM +0800, D. Wythe wrote: > > > From: "D. Wythe" <alibuda@linux.alibaba.com> > > > > > > @@ -248,6 +248,8 @@ int smc_nl_get_sys_info(struct sk_buff *skb, struct netlink_callback *cb) > > > goto errattr; > > > if (nla_put_u8(skb, SMC_NLA_SYS_IS_SMCR_V2, true)) > > > goto errattr; > > > + if (nla_put_u8(skb, SMC_NLA_SYS_AUTO_FALLBACK, smc_auto_fallback)) > > > > READ_ONCE(smc_auto_fallback) ? > > > No READ_ONCE() will cause ? Make sure that we read the current value. > > > > + goto errattr; > > > smc_clc_get_hostname(&host); > > > if (host) { > > > memcpy(hostname, host, SMC_MAX_HOSTNAME_LEN); > > > diff --git a/net/smc/smc_netlink.c b/net/smc/smc_netlink.c > > > index f13ab06..a7de517 100644 > > > --- a/net/smc/smc_netlink.c > > > +++ b/net/smc/smc_netlink.c > > > @@ -111,6 +111,16 @@ > > > .flags = GENL_ADMIN_PERM, > > > .doit = smc_nl_disable_seid, > > > }, > > > + { > > > + .cmd = SMC_NETLINK_ENABLE_AUTO_FALLBACK, > > > + .flags = GENL_ADMIN_PERM, > > > + .doit = smc_enable_auto_fallback, > > > + }, > > > + { > > > + .cmd = SMC_NETLINK_DISABLE_AUTO_FALLBACK, > > > + .flags = GENL_ADMIN_PERM, > > > + .doit = smc_disable_auto_fallback, > > > + }, > > > }; > > > > Consider adding the separated cmd to query the status of this config, > > just as SEID does? > > > > It is common to check this value after user-space setted. Combined with > > sys_info maybe a little heavy in this scene. > > > Add a independent dumpit is quite okay, but is there have really scenarios > that access this value frequently? With more and more such switches in the > future, is is necessary for us to repeat on each switch ? I do have a plan > to put them unified within a NLA attributes, but I don't have much time yet. Yes, I think spreading them make code clean, and we can keep ABI compatibility if we have more than one interface. If we want to change one knob, we can change itself functions and data structures. Also, it makes userspace tools easy to maintainer. TCP's procfs, like /proc/net/netstat, is a summary knob, but not easy to parse and extend. Given that we choose modern netlink, we can avoid it from the beginning. Thanks, Tony Lu
diff --git a/include/uapi/linux/smc.h b/include/uapi/linux/smc.h index 9f2cbf8..33f7fb8 100644 --- a/include/uapi/linux/smc.h +++ b/include/uapi/linux/smc.h @@ -59,6 +59,8 @@ enum { SMC_NETLINK_DUMP_SEID, SMC_NETLINK_ENABLE_SEID, SMC_NETLINK_DISABLE_SEID, + SMC_NETLINK_ENABLE_AUTO_FALLBACK, + SMC_NETLINK_DISABLE_AUTO_FALLBACK, }; /* SMC_GENL_FAMILY top level attributes */ @@ -85,6 +87,7 @@ enum { SMC_NLA_SYS_LOCAL_HOST, /* string */ SMC_NLA_SYS_SEID, /* string */ SMC_NLA_SYS_IS_SMCR_V2, /* u8 */ + SMC_NLA_SYS_AUTO_FALLBACK, /* u8 */ __SMC_NLA_SYS_MAX, SMC_NLA_SYS_MAX = __SMC_NLA_SYS_MAX - 1 }; diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c index c313561..4a25ce7 100644 --- a/net/smc/af_smc.c +++ b/net/smc/af_smc.c @@ -59,6 +59,8 @@ * creation on client */ +bool smc_auto_fallback; /* default behavior for auto fallback, disable by default */ + static struct workqueue_struct *smc_tcp_ls_wq; /* wq for tcp listen work */ struct workqueue_struct *smc_hs_wq; /* wq for handshake work */ struct workqueue_struct *smc_close_wq; /* wq for close work */ @@ -66,6 +68,18 @@ static void smc_tcp_listen_work(struct work_struct *); static void smc_connect_work(struct work_struct *); +int smc_enable_auto_fallback(struct sk_buff *skb, struct genl_info *info) +{ + WRITE_ONCE(smc_auto_fallback, true); + return 0; +} + +int smc_disable_auto_fallback(struct sk_buff *skb, struct genl_info *info) +{ + WRITE_ONCE(smc_auto_fallback, false); + return 0; +} + static void smc_set_keepalive(struct sock *sk, int val) { struct smc_sock *smc = smc_sk(sk); @@ -3006,6 +3020,9 @@ static int __smc_create(struct net *net, struct socket *sock, int protocol, smc->use_fallback = false; /* assume rdma capability first */ smc->fallback_rsn = 0; + /* default behavior from smc_auto_fallback */ + smc->auto_fallback = READ_ONCE(smc_auto_fallback); + rc = 0; if (!clcsock) { rc = sock_create_kern(net, family, SOCK_STREAM, IPPROTO_TCP, diff --git a/net/smc/smc.h b/net/smc/smc.h index a0bdf75..ac75fe8 100644 --- a/net/smc/smc.h +++ b/net/smc/smc.h @@ -14,6 +14,7 @@ #include <linux/socket.h> #include <linux/types.h> #include <linux/compiler.h> /* __aligned */ +#include <net/genetlink.h> #include <net/sock.h> #include "smc_ib.h" @@ -336,4 +337,10 @@ void smc_fill_gid_list(struct smc_link_group *lgr, struct smc_gidlist *gidlist, struct smc_ib_device *known_dev, u8 *known_gid); +extern bool smc_auto_fallback; /* default behavior for auto fallback */ + +/* smc_auto_fallback setter for netlink */ +int smc_enable_auto_fallback(struct sk_buff *skb, struct genl_info *info); +int smc_disable_auto_fallback(struct sk_buff *skb, struct genl_info *info); + #endif /* __SMC_H */ diff --git a/net/smc/smc_core.c b/net/smc/smc_core.c index 29525d0..cc9a398 100644 --- a/net/smc/smc_core.c +++ b/net/smc/smc_core.c @@ -248,6 +248,8 @@ int smc_nl_get_sys_info(struct sk_buff *skb, struct netlink_callback *cb) goto errattr; if (nla_put_u8(skb, SMC_NLA_SYS_IS_SMCR_V2, true)) goto errattr; + if (nla_put_u8(skb, SMC_NLA_SYS_AUTO_FALLBACK, smc_auto_fallback)) + goto errattr; smc_clc_get_hostname(&host); if (host) { memcpy(hostname, host, SMC_MAX_HOSTNAME_LEN); diff --git a/net/smc/smc_netlink.c b/net/smc/smc_netlink.c index f13ab06..a7de517 100644 --- a/net/smc/smc_netlink.c +++ b/net/smc/smc_netlink.c @@ -111,6 +111,16 @@ .flags = GENL_ADMIN_PERM, .doit = smc_nl_disable_seid, }, + { + .cmd = SMC_NETLINK_ENABLE_AUTO_FALLBACK, + .flags = GENL_ADMIN_PERM, + .doit = smc_enable_auto_fallback, + }, + { + .cmd = SMC_NETLINK_DISABLE_AUTO_FALLBACK, + .flags = GENL_ADMIN_PERM, + .doit = smc_disable_auto_fallback, + }, }; static const struct nla_policy smc_gen_nl_policy[2] = {