diff mbox series

[net-next,v5,5/5] net/smc: Add global configure for auto fallback by netlink

Message ID f54ee9f30898b998edf8f07dabccc84efaa2ab8b.1644323503.git.alibuda@linux.alibaba.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series net/smc: Optimizing performance in short-lived scenarios | expand

Checks

Context Check Description
netdev/tree_selection success Clearly marked for net-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix success Link
netdev/cover_letter success Series has a cover letter
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers warning 1 maintainers not CCed: guvenc@linux.ibm.com
netdev/build_clang success Errors and warnings before: 0 this patch: 0
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/checkpatch warning WARNING: line length of 84 exceeds 80 columns
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

D. Wythe Feb. 8, 2022, 12:53 p.m. UTC
From: "D. Wythe" <alibuda@linux.alibaba.com>

Although we can control SMC auto fallback through socket options, which
means that applications who need it must modify their code. It's quite
troublesome for many existing applications. This patch modifies the
global default value of auto fallback through netlink, providing a way
to auto fallback without modifying any code for applications.

Suggested-by: Tony Lu <tonylu@linux.alibaba.com>
Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
---
 include/uapi/linux/smc.h |  3 +++
 net/smc/af_smc.c         | 17 +++++++++++++++++
 net/smc/smc.h            |  7 +++++++
 net/smc/smc_core.c       |  2 ++
 net/smc/smc_netlink.c    | 10 ++++++++++
 5 files changed, 39 insertions(+)

Comments

Tony Lu Feb. 9, 2022, 9:16 a.m. UTC | #1
On Tue, Feb 08, 2022 at 08:53:13PM +0800, D. Wythe wrote:
> From: "D. Wythe" <alibuda@linux.alibaba.com>
> 
> @@ -248,6 +248,8 @@ int smc_nl_get_sys_info(struct sk_buff *skb, struct netlink_callback *cb)
>  		goto errattr;
>  	if (nla_put_u8(skb, SMC_NLA_SYS_IS_SMCR_V2, true))
>  		goto errattr;
> +	if (nla_put_u8(skb, SMC_NLA_SYS_AUTO_FALLBACK, smc_auto_fallback))

READ_ONCE(smc_auto_fallback) ?

> +		goto errattr;
>  	smc_clc_get_hostname(&host);
>  	if (host) {
>  		memcpy(hostname, host, SMC_MAX_HOSTNAME_LEN);
> diff --git a/net/smc/smc_netlink.c b/net/smc/smc_netlink.c
> index f13ab06..a7de517 100644
> --- a/net/smc/smc_netlink.c
> +++ b/net/smc/smc_netlink.c
> @@ -111,6 +111,16 @@
>  		.flags = GENL_ADMIN_PERM,
>  		.doit = smc_nl_disable_seid,
>  	},
> +	{
> +		.cmd = SMC_NETLINK_ENABLE_AUTO_FALLBACK,
> +		.flags = GENL_ADMIN_PERM,
> +		.doit = smc_enable_auto_fallback,
> +	},
> +	{
> +		.cmd = SMC_NETLINK_DISABLE_AUTO_FALLBACK,
> +		.flags = GENL_ADMIN_PERM,
> +		.doit = smc_disable_auto_fallback,
> +	},
>  };

Consider adding the separated cmd to query the status of this config,
just as SEID does?

It is common to check this value after user-space setted. Combined with
sys_info maybe a little heavy in this scene.

Thanks,
Tony Lu
Tony Lu Feb. 9, 2022, 9:33 a.m. UTC | #2
On Tue, Feb 08, 2022 at 08:53:13PM +0800, D. Wythe wrote:
> From: "D. Wythe" <alibuda@linux.alibaba.com>
> 
> Although we can control SMC auto fallback through socket options, which
> means that applications who need it must modify their code. It's quite
> troublesome for many existing applications. This patch modifies the
> global default value of auto fallback through netlink, providing a way
> to auto fallback without modifying any code for applications.
> 
> Suggested-by: Tony Lu <tonylu@linux.alibaba.com>
> Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
> ---
>  include/uapi/linux/smc.h |  3 +++
>  net/smc/af_smc.c         | 17 +++++++++++++++++
>  net/smc/smc.h            |  7 +++++++
>  net/smc/smc_core.c       |  2 ++
>  net/smc/smc_netlink.c    | 10 ++++++++++
>  5 files changed, 39 insertions(+)
> 
> diff --git a/include/uapi/linux/smc.h b/include/uapi/linux/smc.h
> index 9f2cbf8..33f7fb8 100644
> --- a/include/uapi/linux/smc.h
> +++ b/include/uapi/linux/smc.h
> @@ -59,6 +59,8 @@ enum {
>  	SMC_NETLINK_DUMP_SEID,
>  	SMC_NETLINK_ENABLE_SEID,
>  	SMC_NETLINK_DISABLE_SEID,
> +	SMC_NETLINK_ENABLE_AUTO_FALLBACK,
> +	SMC_NETLINK_DISABLE_AUTO_FALLBACK,
>  };
>  
>  /* SMC_GENL_FAMILY top level attributes */
> @@ -85,6 +87,7 @@ enum {
>  	SMC_NLA_SYS_LOCAL_HOST,		/* string */
>  	SMC_NLA_SYS_SEID,		/* string */
>  	SMC_NLA_SYS_IS_SMCR_V2,		/* u8 */
> +	SMC_NLA_SYS_AUTO_FALLBACK,	/* u8 */
>  	__SMC_NLA_SYS_MAX,
>  	SMC_NLA_SYS_MAX = __SMC_NLA_SYS_MAX - 1
>  };
> diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
> index c313561..4a25ce7 100644
> --- a/net/smc/af_smc.c
> +++ b/net/smc/af_smc.c
> @@ -59,6 +59,8 @@
>  						 * creation on client
>  						 */
>  
> +bool smc_auto_fallback;	/* default behavior for auto fallback, disable by default */

SMC supports net namespace, it would be better to provide a per
net-namespace switch. Generally, one container has one application, runs
different workload that is different from others. So the behavior could
be different, such as high-performance (don't fallback for limit) or TCP
transparent replacement (limit if needed).

Thank you,
Tony Lu
D. Wythe Feb. 9, 2022, 9:41 a.m. UTC | #3
I don't think this is necessary, since we already have socket options. 
Is there any scenario that the socket options and global switch can not 
cover?

Thanks.

在 2022/2/9 下午5:33, Tony Lu 写道:
> SMC supports net namespace, it would be better to provide a per
> net-namespace switch. Generally, one container has one application, runs
> different workload that is different from others. So the behavior could
> be different, such as high-performance (don't fallback for limit) or TCP
> transparent replacement (limit if needed).
D. Wythe Feb. 9, 2022, 9:53 a.m. UTC | #4
在 2022/2/9 下午5:16, Tony Lu 写道:
> On Tue, Feb 08, 2022 at 08:53:13PM +0800, D. Wythe wrote:
>> From: "D. Wythe" <alibuda@linux.alibaba.com>
>>
>> @@ -248,6 +248,8 @@ int smc_nl_get_sys_info(struct sk_buff *skb, struct netlink_callback *cb)
>>   		goto errattr;
>>   	if (nla_put_u8(skb, SMC_NLA_SYS_IS_SMCR_V2, true))
>>   		goto errattr;
>> +	if (nla_put_u8(skb, SMC_NLA_SYS_AUTO_FALLBACK, smc_auto_fallback))
> 
> READ_ONCE(smc_auto_fallback) ?


No READ_ONCE() will cause ?


>> +		goto errattr;
>>   	smc_clc_get_hostname(&host);
>>   	if (host) {
>>   		memcpy(hostname, host, SMC_MAX_HOSTNAME_LEN);
>> diff --git a/net/smc/smc_netlink.c b/net/smc/smc_netlink.c
>> index f13ab06..a7de517 100644
>> --- a/net/smc/smc_netlink.c
>> +++ b/net/smc/smc_netlink.c
>> @@ -111,6 +111,16 @@
>>   		.flags = GENL_ADMIN_PERM,
>>   		.doit = smc_nl_disable_seid,
>>   	},
>> +	{
>> +		.cmd = SMC_NETLINK_ENABLE_AUTO_FALLBACK,
>> +		.flags = GENL_ADMIN_PERM,
>> +		.doit = smc_enable_auto_fallback,
>> +	},
>> +	{
>> +		.cmd = SMC_NETLINK_DISABLE_AUTO_FALLBACK,
>> +		.flags = GENL_ADMIN_PERM,
>> +		.doit = smc_disable_auto_fallback,
>> +	},
>>   };
> 
> Consider adding the separated cmd to query the status of this config,
> just as SEID does?
> 
> It is common to check this value after user-space setted. Combined with
> sys_info maybe a little heavy in this scene.


Add a independent dumpit is quite okay, but is there have really 
scenarios that access this value frequently? With more and more such 
switches in the future, is is necessary for us to repeat on each switch 
? I do have a plan to put them unified within a NLA attributes, but I 
don't have much time yet.
Tony Lu Feb. 9, 2022, 9:54 a.m. UTC | #5
On Wed, Feb 09, 2022 at 05:41:50PM +0800, D. Wythe wrote:
> I don't think this is necessary, since we already have socket options. Is
> there any scenario that the socket options and global switch can not cover?
> 
When transparently replacing the whole container's TCP connections, we
cannot touch the user's application, and have to replace their
connections to SMC. It is common for container environment, different
containers will run different applications.

Most of TCP knob is per net-namespace, it could be better for us to do
it from the beginning.

Thanks,
Tony Lu
D. Wythe Feb. 9, 2022, 10:56 a.m. UTC | #6
Copy that, there are indeed some problems for the container environment.
I'll try work on it.

Thanks.

在 2022/2/9 下午5:54, Tony Lu 写道:
> On Wed, Feb 09, 2022 at 05:41:50PM +0800, D. Wythe wrote:
>> I don't think this is necessary, since we already have socket options. Is
>> there any scenario that the socket options and global switch can not cover?
>>
> When transparently replacing the whole container's TCP connections, we
> cannot touch the user's application, and have to replace their
> connections to SMC. It is common for container environment, different
> containers will run different applications.
> 
> Most of TCP knob is per net-namespace, it could be better for us to do
> it from the beginning.
> 
> Thanks,
> Tony Lu
Tony Lu Feb. 9, 2022, 11:37 a.m. UTC | #7
On Wed, Feb 09, 2022 at 05:53:18PM +0800, D. Wythe wrote:
> 
> 
> 在 2022/2/9 下午5:16, Tony Lu 写道:
> > On Tue, Feb 08, 2022 at 08:53:13PM +0800, D. Wythe wrote:
> > > From: "D. Wythe" <alibuda@linux.alibaba.com>
> > > 
> > > @@ -248,6 +248,8 @@ int smc_nl_get_sys_info(struct sk_buff *skb, struct netlink_callback *cb)
> > >   		goto errattr;
> > >   	if (nla_put_u8(skb, SMC_NLA_SYS_IS_SMCR_V2, true))
> > >   		goto errattr;
> > > +	if (nla_put_u8(skb, SMC_NLA_SYS_AUTO_FALLBACK, smc_auto_fallback))
> > 
> > READ_ONCE(smc_auto_fallback) ?
> 
> 
> No READ_ONCE() will cause ?

Make sure that we read the current value.
 
> 
> > > +		goto errattr;
> > >   	smc_clc_get_hostname(&host);
> > >   	if (host) {
> > >   		memcpy(hostname, host, SMC_MAX_HOSTNAME_LEN);
> > > diff --git a/net/smc/smc_netlink.c b/net/smc/smc_netlink.c
> > > index f13ab06..a7de517 100644
> > > --- a/net/smc/smc_netlink.c
> > > +++ b/net/smc/smc_netlink.c
> > > @@ -111,6 +111,16 @@
> > >   		.flags = GENL_ADMIN_PERM,
> > >   		.doit = smc_nl_disable_seid,
> > >   	},
> > > +	{
> > > +		.cmd = SMC_NETLINK_ENABLE_AUTO_FALLBACK,
> > > +		.flags = GENL_ADMIN_PERM,
> > > +		.doit = smc_enable_auto_fallback,
> > > +	},
> > > +	{
> > > +		.cmd = SMC_NETLINK_DISABLE_AUTO_FALLBACK,
> > > +		.flags = GENL_ADMIN_PERM,
> > > +		.doit = smc_disable_auto_fallback,
> > > +	},
> > >   };
> > 
> > Consider adding the separated cmd to query the status of this config,
> > just as SEID does?
> > 
> > It is common to check this value after user-space setted. Combined with
> > sys_info maybe a little heavy in this scene.
> 
> 
> Add a independent dumpit is quite okay, but is there have really scenarios
> that access this value frequently? With more and more such switches in the
> future, is is necessary for us to repeat on each switch ? I do have a plan
> to put them unified within a NLA attributes, but I don't have much time yet.

Yes, I think spreading them make code clean, and we can keep ABI
compatibility if we have more than one interface. If we want to change
one knob, we can change itself functions and data structures. Also, it
makes userspace tools easy to maintainer. TCP's procfs, like /proc/net/netstat,
is a summary knob, but not easy to parse and extend. Given that we
choose modern netlink, we can avoid it from the beginning.

Thanks,
Tony Lu
diff mbox series

Patch

diff --git a/include/uapi/linux/smc.h b/include/uapi/linux/smc.h
index 9f2cbf8..33f7fb8 100644
--- a/include/uapi/linux/smc.h
+++ b/include/uapi/linux/smc.h
@@ -59,6 +59,8 @@  enum {
 	SMC_NETLINK_DUMP_SEID,
 	SMC_NETLINK_ENABLE_SEID,
 	SMC_NETLINK_DISABLE_SEID,
+	SMC_NETLINK_ENABLE_AUTO_FALLBACK,
+	SMC_NETLINK_DISABLE_AUTO_FALLBACK,
 };
 
 /* SMC_GENL_FAMILY top level attributes */
@@ -85,6 +87,7 @@  enum {
 	SMC_NLA_SYS_LOCAL_HOST,		/* string */
 	SMC_NLA_SYS_SEID,		/* string */
 	SMC_NLA_SYS_IS_SMCR_V2,		/* u8 */
+	SMC_NLA_SYS_AUTO_FALLBACK,	/* u8 */
 	__SMC_NLA_SYS_MAX,
 	SMC_NLA_SYS_MAX = __SMC_NLA_SYS_MAX - 1
 };
diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index c313561..4a25ce7 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -59,6 +59,8 @@ 
 						 * creation on client
 						 */
 
+bool smc_auto_fallback;	/* default behavior for auto fallback, disable by default */
+
 static struct workqueue_struct	*smc_tcp_ls_wq;	/* wq for tcp listen work */
 struct workqueue_struct	*smc_hs_wq;	/* wq for handshake work */
 struct workqueue_struct	*smc_close_wq;	/* wq for close work */
@@ -66,6 +68,18 @@ 
 static void smc_tcp_listen_work(struct work_struct *);
 static void smc_connect_work(struct work_struct *);
 
+int smc_enable_auto_fallback(struct sk_buff *skb, struct genl_info *info)
+{
+	WRITE_ONCE(smc_auto_fallback, true);
+	return 0;
+}
+
+int smc_disable_auto_fallback(struct sk_buff *skb, struct genl_info *info)
+{
+	WRITE_ONCE(smc_auto_fallback, false);
+	return 0;
+}
+
 static void smc_set_keepalive(struct sock *sk, int val)
 {
 	struct smc_sock *smc = smc_sk(sk);
@@ -3006,6 +3020,9 @@  static int __smc_create(struct net *net, struct socket *sock, int protocol,
 	smc->use_fallback = false; /* assume rdma capability first */
 	smc->fallback_rsn = 0;
 
+	/* default behavior from smc_auto_fallback */
+	smc->auto_fallback = READ_ONCE(smc_auto_fallback);
+
 	rc = 0;
 	if (!clcsock) {
 		rc = sock_create_kern(net, family, SOCK_STREAM, IPPROTO_TCP,
diff --git a/net/smc/smc.h b/net/smc/smc.h
index a0bdf75..ac75fe8 100644
--- a/net/smc/smc.h
+++ b/net/smc/smc.h
@@ -14,6 +14,7 @@ 
 #include <linux/socket.h>
 #include <linux/types.h>
 #include <linux/compiler.h> /* __aligned */
+#include <net/genetlink.h>
 #include <net/sock.h>
 
 #include "smc_ib.h"
@@ -336,4 +337,10 @@  void smc_fill_gid_list(struct smc_link_group *lgr,
 		       struct smc_gidlist *gidlist,
 		       struct smc_ib_device *known_dev, u8 *known_gid);
 
+extern bool smc_auto_fallback; /* default behavior for auto fallback */
+
+/* smc_auto_fallback setter for netlink */
+int smc_enable_auto_fallback(struct sk_buff *skb, struct genl_info *info);
+int smc_disable_auto_fallback(struct sk_buff *skb, struct genl_info *info);
+
 #endif	/* __SMC_H */
diff --git a/net/smc/smc_core.c b/net/smc/smc_core.c
index 29525d0..cc9a398 100644
--- a/net/smc/smc_core.c
+++ b/net/smc/smc_core.c
@@ -248,6 +248,8 @@  int smc_nl_get_sys_info(struct sk_buff *skb, struct netlink_callback *cb)
 		goto errattr;
 	if (nla_put_u8(skb, SMC_NLA_SYS_IS_SMCR_V2, true))
 		goto errattr;
+	if (nla_put_u8(skb, SMC_NLA_SYS_AUTO_FALLBACK, smc_auto_fallback))
+		goto errattr;
 	smc_clc_get_hostname(&host);
 	if (host) {
 		memcpy(hostname, host, SMC_MAX_HOSTNAME_LEN);
diff --git a/net/smc/smc_netlink.c b/net/smc/smc_netlink.c
index f13ab06..a7de517 100644
--- a/net/smc/smc_netlink.c
+++ b/net/smc/smc_netlink.c
@@ -111,6 +111,16 @@ 
 		.flags = GENL_ADMIN_PERM,
 		.doit = smc_nl_disable_seid,
 	},
+	{
+		.cmd = SMC_NETLINK_ENABLE_AUTO_FALLBACK,
+		.flags = GENL_ADMIN_PERM,
+		.doit = smc_enable_auto_fallback,
+	},
+	{
+		.cmd = SMC_NETLINK_DISABLE_AUTO_FALLBACK,
+		.flags = GENL_ADMIN_PERM,
+		.doit = smc_disable_auto_fallback,
+	},
 };
 
 static const struct nla_policy smc_gen_nl_policy[2] = {