Message ID | 20f504f961e1a803f85d64229ad84260434203bd.1644323503.git.alibuda@linux.alibaba.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | net/smc: Optimizing performance in short-lived scenarios | expand |
On 08/02/2022 13:53, D. Wythe wrote: > From: "D. Wythe" <alibuda@linux.alibaba.com> > > This patch aims to add dynamic control for SMC auto fallback, since we > don't have socket option level for SMC yet, which requires we need to > implement it at the same time. In your response to the v2 version of this series you wrote: > After some trial and thought, I found that the scope of netlink control > is too large, we should limit the scope to socket. Adding a socket option > may be a better choice, what do you think? I want to understand why this socket option is required, who needs it and why. What were your trials and thoughts, did you see any problems with the global switch via the netlink interface?
Some of our servers have different service types on different ports. A global switch cannot control different service ports individually in this case。In fact, it has nothing to do with using netlink or not. Socket options is the first solution comes to my mind in that case,I don't know if there is any other better way。 Looks for you suggestions. Thanks. 在 2022/2/9 上午1:08, Karsten Graul 写道: > On 08/02/2022 13:53, D. Wythe wrote: >> From: "D. Wythe" <alibuda@linux.alibaba.com> >> >> This patch aims to add dynamic control for SMC auto fallback, since we >> don't have socket option level for SMC yet, which requires we need to >> implement it at the same time. > > In your response to the v2 version of this series you wrote: > >> After some trial and thought, I found that the scope of netlink control >> is too large, we should limit the scope to socket. Adding a socket option >> may be a better choice, what do you think? > > I want to understand why this socket option is required, who needs it and why. > What were your trials and thoughts, did you see any problems with the global > switch via the netlink interface?
On 09/02/2022 07:41, D. Wythe wrote: > > Some of our servers have different service types on different ports. > A global switch cannot control different service ports individually in this case。In fact, it has nothing to do with using netlink or not. Socket options is the first solution comes to my mind in that case,I don't know if there is any other better way。 > I try to understand why you think it is needed to handle different service types differently. As you wrote > After some trial and thought, I found that the scope of netlink control is too large please explain what you found out. I don't doubt about netlink or socket option here, its all about why a global switch for this behavior isn't good enough.
When a large number of connections are influx, the long-connection service has a much higher tolerance for smc queuing time than the short-link service. For the long-connection service, more SMC connections are more important than faster connection establishment, the auto fallback is quite meaningless and unexpected to them, while the short-link connection service is in the opposite. When a host has both types of services below, a global switch cannot works in that case. what do you think? Hope for you reply. Thanks. 在 2022/2/9 下午3:59, Karsten Graul 写道: > On 09/02/2022 07:41, D. Wythe wrote: >> >> Some of our servers have different service types on different ports. >> A global switch cannot control different service ports individually in this case。In fact, it has nothing to do with using netlink or not. Socket options is the first solution comes to my mind in that case,I don't know if there is any other better way。 >> > > I try to understand why you think it is needed to handle different > service types differently. As you wrote > >> After some trial and thought, I found that the scope of netlink control is too large > > please explain what you found out. I don't doubt about netlink or socket option here, > its all about why a global switch for this behavior isn't good enough.
diff --git a/include/linux/socket.h b/include/linux/socket.h index 8ef26d8..6f85f5d 100644 --- a/include/linux/socket.h +++ b/include/linux/socket.h @@ -366,6 +366,7 @@ struct ucred { #define SOL_XDP 283 #define SOL_MPTCP 284 #define SOL_MCTP 285 +#define SOL_SMC 286 /* IPX options */ #define IPX_TYPE 1 diff --git a/include/uapi/linux/smc.h b/include/uapi/linux/smc.h index 6c2874f..9f2cbf8 100644 --- a/include/uapi/linux/smc.h +++ b/include/uapi/linux/smc.h @@ -284,4 +284,8 @@ enum { __SMC_NLA_SEID_TABLE_MAX, SMC_NLA_SEID_TABLE_MAX = __SMC_NLA_SEID_TABLE_MAX - 1 }; + +/* SMC socket options */ +#define SMC_AUTO_FALLBACK 1 /* allow auto fallback to TCP */ + #endif /* _UAPI_LINUX_SMC_H */ diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c index 8175f60..c313561 100644 --- a/net/smc/af_smc.c +++ b/net/smc/af_smc.c @@ -2325,7 +2325,8 @@ static int smc_listen(struct socket *sock, int backlog) inet_csk(smc->clcsock->sk)->icsk_af_ops = &smc->af_ops; - tcp_sk(smc->clcsock->sk)->smc_in_limited = smc_is_in_limited; + if (smc->auto_fallback) + tcp_sk(smc->clcsock->sk)->smc_in_limited = smc_is_in_limited; rc = kernel_listen(smc->clcsock, backlog); if (rc) { @@ -2620,6 +2621,67 @@ static int smc_shutdown(struct socket *sock, int how) return rc ? rc : rc1; } +static int __smc_getsockopt(struct socket *sock, int level, int optname, + char __user *optval, int __user *optlen) +{ + struct smc_sock *smc; + int val, len; + + smc = smc_sk(sock->sk); + + if (get_user(len, optlen)) + return -EFAULT; + + len = min_t(int, len, sizeof(int)); + + if (len < 0) + return -EINVAL; + + switch (optname) { + case SMC_AUTO_FALLBACK: + val = smc->auto_fallback; + break; + default: + return -EOPNOTSUPP; + } + + if (put_user(len, optlen)) + return -EFAULT; + if (copy_to_user(optval, &val, len)) + return -EFAULT; + + return 0; +} + +static int __smc_setsockopt(struct socket *sock, int level, int optname, + sockptr_t optval, unsigned int optlen) +{ + struct sock *sk = sock->sk; + struct smc_sock *smc; + int val, rc; + + smc = smc_sk(sk); + + lock_sock(sk); + switch (optname) { + case SMC_AUTO_FALLBACK: + if (optlen < sizeof(int)) + return -EINVAL; + if (copy_from_sockptr(&val, optval, sizeof(int))) + return -EFAULT; + + smc->auto_fallback = !!val; + rc = 0; + break; + default: + rc = -EOPNOTSUPP; + break; + } + release_sock(sk); + + return rc; +} + static int smc_setsockopt(struct socket *sock, int level, int optname, sockptr_t optval, unsigned int optlen) { @@ -2629,6 +2691,8 @@ static int smc_setsockopt(struct socket *sock, int level, int optname, if (level == SOL_TCP && optname == TCP_ULP) return -EOPNOTSUPP; + else if (level == SOL_SMC) + return __smc_setsockopt(sock, level, optname, optval, optlen); smc = smc_sk(sk); @@ -2711,6 +2775,9 @@ static int smc_getsockopt(struct socket *sock, int level, int optname, struct smc_sock *smc; int rc; + if (level == SOL_SMC) + return __smc_getsockopt(sock, level, optname, optval, optlen); + smc = smc_sk(sock->sk); mutex_lock(&smc->clcsock_release_lock); if (!smc->clcsock) { diff --git a/net/smc/smc.h b/net/smc/smc.h index 5e5e38d..a0bdf75 100644 --- a/net/smc/smc.h +++ b/net/smc/smc.h @@ -249,6 +249,7 @@ struct smc_sock { /* smc sock container */ struct work_struct smc_listen_work;/* prepare new accept socket */ struct list_head accept_q; /* sockets to be accepted */ spinlock_t accept_q_lock; /* protects accept_q */ + bool auto_fallback; /* auto fallabck to tcp */ bool use_fallback; /* fallback to tcp */ int fallback_rsn; /* reason for fallback */ u32 peer_diagnosis; /* decline reason from peer */