Message ID | abb720b34b9eef1cc52ef68017334e27a2af83c6.1629840814.git.cdleonard@gmail.com (mailing list archive) |
---|---|
State | RFC |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | tcp: Initial support for RFC5925 auth option | expand |
Context | Check | Description |
---|---|---|
netdev/cover_letter | success | Link |
netdev/fixes_present | success | Link |
netdev/patch_count | success | Link |
netdev/tree_selection | success | Guessed tree name to be net-next |
netdev/subject_prefix | success | Link |
netdev/cc_maintainers | success | CCed 6 of 6 maintainers |
netdev/source_inline | fail | Was 0 now: 1 |
netdev/verify_signedoff | success | Link |
netdev/module_param | success | Was 0 now: 0 |
netdev/build_32bit | fail | Errors and warnings before: 19 this patch: 26 |
netdev/kdoc | fail | Errors and warnings before: 14 this patch: 15 |
netdev/verify_fixes | success | Link |
netdev/checkpatch | warning | WARNING: line length of 84 exceeds 80 columns WARNING: line length of 87 exceeds 80 columns WARNING: line length of 92 exceeds 80 columns WARNING: line length of 94 exceeds 80 columns WARNING: line length of 95 exceeds 80 columns WARNING: line length of 98 exceeds 80 columns |
netdev/build_allmodconfig_warn | fail | Errors and warnings before: 19 this patch: 26 |
netdev/header_inline | success | Link |
On 8/24/21 2:34 PM, Leonard Crestez wrote: > The crypto_shash API is used in order to compute packet signatures. The > API comes with several unfortunate limitations: > > 1) Allocating a crypto_shash can sleep and must be done in user context. > 2) Packet signatures must be computed in softirq context > 3) Packet signatures use dynamic "traffic keys" which require exclusive > access to crypto_shash for crypto_setkey. > > The solution is to allocate one crypto_shash for each possible cpu for > each algorithm at setsockopt time. The per-cpu tfm is then borrowed from > softirq context, signatures are computed and the tfm is returned. > > The pool for each algorithm is reference counted, initialized at > setsockopt time and released in tcp_authopt_key_info's rcu callback > > I don't know, why should we really care and try so hard to release the tfm per cpu ? I would simply allocate them at boot time. This would avoid the expensive refcounting (potential false sharing)
On 8/24/21 2:34 PM, Leonard Crestez wrote: > The crypto_shash API is used in order to compute packet signatures. The > API comes with several unfortunate limitations: > > 1) Allocating a crypto_shash can sleep and must be done in user context. > 2) Packet signatures must be computed in softirq context > 3) Packet signatures use dynamic "traffic keys" which require exclusive > access to crypto_shash for crypto_setkey. > > The solution is to allocate one crypto_shash for each possible cpu for > each algorithm at setsockopt time. The per-cpu tfm is then borrowed from > softirq context, signatures are computed and the tfm is returned. > I could not see the per-cpu stuff that you mention in the changelog.
On Tue, Aug 24, 2021 at 04:34:58PM -0700, Eric Dumazet wrote: > > On 8/24/21 2:34 PM, Leonard Crestez wrote: > > The crypto_shash API is used in order to compute packet signatures. The > > API comes with several unfortunate limitations: > > > > 1) Allocating a crypto_shash can sleep and must be done in user context. > > 2) Packet signatures must be computed in softirq context > > 3) Packet signatures use dynamic "traffic keys" which require exclusive > > access to crypto_shash for crypto_setkey. > > > > The solution is to allocate one crypto_shash for each possible cpu for > > each algorithm at setsockopt time. The per-cpu tfm is then borrowed from > > softirq context, signatures are computed and the tfm is returned. > > > > I could not see the per-cpu stuff that you mention in the changelog. Perhaps it's time we moved the key information from the tfm into the request structure for hashes? Or at least provide a way for the key to be in the request structure in addition to the tfm as the tfm model still works for IPsec. Ard/Eric, what do you think about that? Thanks,
On Wed, Aug 25, 2021 at 1:08 AM Herbert Xu <herbert@gondor.apana.org.au> wrote: > > On Tue, Aug 24, 2021 at 04:34:58PM -0700, Eric Dumazet wrote: > > > > On 8/24/21 2:34 PM, Leonard Crestez wrote: > > > The crypto_shash API is used in order to compute packet signatures. The > > > API comes with several unfortunate limitations: > > > > > > 1) Allocating a crypto_shash can sleep and must be done in user context. > > > 2) Packet signatures must be computed in softirq context > > > 3) Packet signatures use dynamic "traffic keys" which require exclusive > > > access to crypto_shash for crypto_setkey. > > > > > > The solution is to allocate one crypto_shash for each possible cpu for > > > each algorithm at setsockopt time. The per-cpu tfm is then borrowed from > > > softirq context, signatures are computed and the tfm is returned. > > > > > > > I could not see the per-cpu stuff that you mention in the changelog. > > Perhaps it's time we moved the key information from the tfm into > the request structure for hashes? Or at least provide a way for > the key to be in the request structure in addition to the tfm as > the tfm model still works for IPsec. Ard/Eric, what do you think > about that? What is the typical size of a ' tfm' and associated data ? per-cpu tfm might still make sense, if we had proper NUMA affinities. AFAIK, currently we can not provide a numa node to crypto allocations. So using construct like this ends up allocating all data on one single NUMA node for_each_possible_cpu(cpu) { tfm = crypto_alloc_shash(algo->name, 0, 0); if (IS_ERR(tfm)) return PTR_ERR(tfm); p_tfm = per_cpu_ptr(algo->tfms, cpu); *p_tfm = tfm; }
On Wed, 25 Aug 2021 at 10:08, Herbert Xu <herbert@gondor.apana.org.au> wrote: > > On Tue, Aug 24, 2021 at 04:34:58PM -0700, Eric Dumazet wrote: > > > > On 8/24/21 2:34 PM, Leonard Crestez wrote: > > > The crypto_shash API is used in order to compute packet signatures. The > > > API comes with several unfortunate limitations: > > > > > > 1) Allocating a crypto_shash can sleep and must be done in user context. > > > 2) Packet signatures must be computed in softirq context > > > 3) Packet signatures use dynamic "traffic keys" which require exclusive > > > access to crypto_shash for crypto_setkey. > > > > > > The solution is to allocate one crypto_shash for each possible cpu for > > > each algorithm at setsockopt time. The per-cpu tfm is then borrowed from > > > softirq context, signatures are computed and the tfm is returned. > > > > > > > I could not see the per-cpu stuff that you mention in the changelog. > > Perhaps it's time we moved the key information from the tfm into > the request structure for hashes? Or at least provide a way for > the key to be in the request structure in addition to the tfm as > the tfm model still works for IPsec. Ard/Eric, what do you think > about that? > I think it makes sense for a shash desc to have the ability to carry a key, which will be used instead of the TFM key, but this seems like quite a lot of work, given that all implementations will need to be updated. Also, setkey() can currently sleep, so we need to check whether the existing key manipulation code can actually execute during init/update/final if sleeping is not permitted.
On 8/25/21 7:04 PM, Ard Biesheuvel wrote: > On Wed, 25 Aug 2021 at 10:08, Herbert Xu <herbert@gondor.apana.org.au> wrote: >> >> On Tue, Aug 24, 2021 at 04:34:58PM -0700, Eric Dumazet wrote: >>> >>> On 8/24/21 2:34 PM, Leonard Crestez wrote: >>>> The crypto_shash API is used in order to compute packet signatures. The >>>> API comes with several unfortunate limitations: >>>> >>>> 1) Allocating a crypto_shash can sleep and must be done in user context. >>>> 2) Packet signatures must be computed in softirq context >>>> 3) Packet signatures use dynamic "traffic keys" which require exclusive >>>> access to crypto_shash for crypto_setkey. >>>> >>>> The solution is to allocate one crypto_shash for each possible cpu for >>>> each algorithm at setsockopt time. The per-cpu tfm is then borrowed from >>>> softirq context, signatures are computed and the tfm is returned. >>>> >>> >>> I could not see the per-cpu stuff that you mention in the changelog. >> >> Perhaps it's time we moved the key information from the tfm into >> the request structure for hashes? Or at least provide a way for >> the key to be in the request structure in addition to the tfm as >> the tfm model still works for IPsec. Ard/Eric, what do you think >> about that? >> > > I think it makes sense for a shash desc to have the ability to carry a > key, which will be used instead of the TFM key, but this seems like > quite a lot of work, given that all implementations will need to be > updated. Also, setkey() can currently sleep, so we need to check > whether the existing key manipulation code can actually execute during > init/update/final if sleeping is not permitted. Are you sure that setkey can sleep? The documentation is not clear, maybe it only applies to certain hardware implementations? The TCP Authentication Option needs dynamic keys for SYN and SYNACK packets, all of which happens in BH context. -- Regards, Leonard
On 25.08.2021 02:34, Eric Dumazet wrote: > On 8/24/21 2:34 PM, Leonard Crestez wrote: >> The crypto_shash API is used in order to compute packet signatures. The >> API comes with several unfortunate limitations: >> >> 1) Allocating a crypto_shash can sleep and must be done in user context. >> 2) Packet signatures must be computed in softirq context >> 3) Packet signatures use dynamic "traffic keys" which require exclusive >> access to crypto_shash for crypto_setkey. >> >> The solution is to allocate one crypto_shash for each possible cpu for >> each algorithm at setsockopt time. The per-cpu tfm is then borrowed from >> softirq context, signatures are computed and the tfm is returned. >> > > I could not see the per-cpu stuff that you mention in the changelog. That's a little embarrasing, I forgot to implement the actual per-cpu stuff. tcp_authopt_alg_imp.tfm is meant to be an array up to NR_CPUS and tcp_authopt_alg_get_tfm needs no locking other than preempt_disable (which should already be the case). The reference counting would still only happen from very few places: setsockopt, close and openreq. This would only impact request/response traffic and relatively little. Performance was not a major focus so far. Preventing impact on non-AO connections is important but typical AO usecases are long-lived low-traffic connections. -- Regards, Leonard
On Wed, Aug 25, 2021 at 9:35 AM Leonard Crestez <cdleonard@gmail.com> wrote: > > On 25.08.2021 02:34, Eric Dumazet wrote: > > On 8/24/21 2:34 PM, Leonard Crestez wrote: > >> The crypto_shash API is used in order to compute packet signatures. The > >> API comes with several unfortunate limitations: > >> > >> 1) Allocating a crypto_shash can sleep and must be done in user context. > >> 2) Packet signatures must be computed in softirq context > >> 3) Packet signatures use dynamic "traffic keys" which require exclusive > >> access to crypto_shash for crypto_setkey. > >> > >> The solution is to allocate one crypto_shash for each possible cpu for > >> each algorithm at setsockopt time. The per-cpu tfm is then borrowed from > >> softirq context, signatures are computed and the tfm is returned. > >> > > > > I could not see the per-cpu stuff that you mention in the changelog. > > That's a little embarrasing, I forgot to implement the actual per-cpu > stuff. tcp_authopt_alg_imp.tfm is meant to be an array up to NR_CPUS and > tcp_authopt_alg_get_tfm needs no locking other than preempt_disable > (which should already be the case). Well, do not use arrays of NR_CPUS and instead use normal per_cpu accessors (as in __tcp_alloc_md5sig_pool) > > The reference counting would still only happen from very few places: > setsockopt, close and openreq. This would only impact request/response > traffic and relatively little. What I meant is that __tcp_alloc_md5sig_pool() allocates stuff one time, we do not care about tcp_md5sig_pool_populated going back to false. Otherwise, a single user application constantly allocating a socket, enabling MD5 (or authopt), then closing the socket would incur a big cost on hosts with a lot of cpus. > > Performance was not a major focus so far. Preventing impact on non-AO > connections is important but typical AO usecases are long-lived > low-traffic connections. > > -- > Regards, > Leonard
On 8/25/21 8:55 PM, Eric Dumazet wrote: > On Wed, Aug 25, 2021 at 9:35 AM Leonard Crestez <cdleonard@gmail.com> wrote: >> >> On 25.08.2021 02:34, Eric Dumazet wrote: >>> On 8/24/21 2:34 PM, Leonard Crestez wrote: >>>> The crypto_shash API is used in order to compute packet signatures. The >>>> API comes with several unfortunate limitations: >>>> >>>> 1) Allocating a crypto_shash can sleep and must be done in user context. >>>> 2) Packet signatures must be computed in softirq context >>>> 3) Packet signatures use dynamic "traffic keys" which require exclusive >>>> access to crypto_shash for crypto_setkey. >>>> >>>> The solution is to allocate one crypto_shash for each possible cpu for >>>> each algorithm at setsockopt time. The per-cpu tfm is then borrowed from >>>> softirq context, signatures are computed and the tfm is returned. >>>> >>> >>> I could not see the per-cpu stuff that you mention in the changelog. >> >> That's a little embarrasing, I forgot to implement the actual per-cpu >> stuff. tcp_authopt_alg_imp.tfm is meant to be an array up to NR_CPUS and >> tcp_authopt_alg_get_tfm needs no locking other than preempt_disable >> (which should already be the case). > > Well, do not use arrays of NR_CPUS and instead use normal per_cpu > accessors (as in __tcp_alloc_md5sig_pool) > >> >> The reference counting would still only happen from very few places: >> setsockopt, close and openreq. This would only impact request/response >> traffic and relatively little. > > What I meant is that __tcp_alloc_md5sig_pool() allocates stuff one time, > we do not care about tcp_md5sig_pool_populated going back to false. > > Otherwise, a single user application constantly allocating a socket, > enabling MD5 (or authopt), then closing the socket would incur > a big cost on hosts with a lot of cpus. Allocating only once would definitely simply things. I don't know if this might end up tying hardware resources forever if some accelerators are in play but for this feature software-only crypto is perfectly fine. -- Regards, Leonard
diff --git a/include/net/tcp_authopt.h b/include/net/tcp_authopt.h index b4277112b506..c9ee2059b442 100644 --- a/include/net/tcp_authopt.h +++ b/include/net/tcp_authopt.h @@ -2,10 +2,12 @@ #ifndef _LINUX_TCP_AUTHOPT_H #define _LINUX_TCP_AUTHOPT_H #include <uapi/linux/tcp.h> +struct tcp_authopt_alg_imp; + /** * struct tcp_authopt_key_info - Representation of a Master Key Tuple as per RFC5925 * * Key structure lifetime is only protected by RCU so readers needs to hold a * single rcu_read_lock until they're done with the key. @@ -20,10 +22,11 @@ struct tcp_authopt_key_info { u8 send_id, recv_id; u8 alg_id; u8 keylen; u8 key[TCP_AUTHOPT_MAXKEYLEN]; struct sockaddr_storage addr; + struct tcp_authopt_alg_imp *alg; }; /** * struct tcp_authopt_info - Per-socket information regarding tcp_authopt * diff --git a/net/ipv4/tcp_authopt.c b/net/ipv4/tcp_authopt.c index f6dddc5775ff..ce560bd88903 100644 --- a/net/ipv4/tcp_authopt.c +++ b/net/ipv4/tcp_authopt.c @@ -4,10 +4,161 @@ #include <net/tcp.h> #include <net/tcp_authopt.h> #include <crypto/hash.h> #include <trace/events/tcp.h> +/* All current algorithms have a mac length of 12 but crypto API digestsize can be larger */ +#define TCP_AUTHOPT_MAXMACBUF 20 +#define TCP_AUTHOPT_MAX_TRAFFIC_KEY_LEN 20 + +struct tcp_authopt_alg_imp { + /* Name of algorithm in crypto-api */ + const char *alg_name; + /* One of the TCP_AUTHOPT_ALG_* constants from uapi */ + u8 alg_id; + /* Length of traffic key */ + u8 traffic_key_len; + /* Length of mac in TCP option */ + u8 maclen; + + /* shared crypto_shash */ + spinlock_t lock; + int ref_cnt; + struct crypto_shash *tfm; +}; + +static struct tcp_authopt_alg_imp tcp_authopt_alg_list[] = { + { + .alg_id = TCP_AUTHOPT_ALG_HMAC_SHA_1_96, + .alg_name = "hmac(sha1)", + .traffic_key_len = 20, + .maclen = 12, + .lock = __SPIN_LOCK_UNLOCKED(tcp_authopt_alg_list[0].lock), + }, + { + .alg_id = TCP_AUTHOPT_ALG_AES_128_CMAC_96, + .alg_name = "cmac(aes)", + .traffic_key_len = 16, + .maclen = 12, + .lock = __SPIN_LOCK_UNLOCKED(tcp_authopt_alg_list[1].lock), + }, +}; + +/* get a pointer to the tcp_authopt_alg instance or NULL if id invalid */ +static inline struct tcp_authopt_alg_imp *tcp_authopt_alg_get(int alg_num) +{ + if (alg_num <= 0 || alg_num > 2) + return NULL; + return &tcp_authopt_alg_list[alg_num - 1]; +} + +/* Mark an algorithm as in-use from user context */ +static int tcp_authopt_alg_require(struct tcp_authopt_alg_imp *alg) +{ + struct crypto_shash *tfm = NULL; + bool need_init = false; + + might_sleep(); + + /* If we're the first user then we need to initialize shash but we might lose the race. */ + spin_lock_bh(&alg->lock); + WARN_ON(alg->ref_cnt < 0); + if (alg->ref_cnt == 0) + need_init = true; + else + ++alg->ref_cnt; + spin_unlock_bh(&alg->lock); + + /* Already initialized */ + if (!need_init) + return 0; + + tfm = crypto_alloc_shash(alg->alg_name, 0, 0); + if (IS_ERR(tfm)) + return PTR_ERR(tfm); + + spin_lock_bh(&alg->lock); + if (alg->ref_cnt == 0) + /* race won */ + alg->tfm = tfm; + else + /* race lost, free tfm later */ + need_init = false; + ++alg->ref_cnt; + spin_unlock_bh(&alg->lock); + + if (!need_init) + crypto_free_shash(tfm); + else + pr_info("initialized tcp-ao %s", alg->alg_name); + + return 0; +} + +static void tcp_authopt_alg_release(struct tcp_authopt_alg_imp *alg) +{ + struct crypto_shash *tfm_to_free = NULL; + + spin_lock_bh(&alg->lock); + --alg->ref_cnt; + WARN_ON(alg->ref_cnt < 0); + if (alg->ref_cnt == 0) { + tfm_to_free = alg->tfm; + alg->tfm = NULL; + } + spin_unlock_bh(&alg->lock); + + if (tfm_to_free) { + pr_info("released tcp-ao %s", alg->alg_name); + crypto_free_shash(tfm_to_free); + } +} + +/* increase reference count on an algorithm that is already in use */ +static void tcp_authopt_alg_incref(struct tcp_authopt_alg_imp *alg) +{ + spin_lock_bh(&alg->lock); + WARN_ON(alg->ref_cnt <= 0); + ++alg->ref_cnt; + spin_unlock_bh(&alg->lock); +} + +static struct crypto_shash *tcp_authopt_alg_get_tfm(struct tcp_authopt_alg_imp *alg) +{ + spin_lock_bh(&alg->lock); + WARN_ON(alg->ref_cnt < 0); + return alg->tfm; +} + +static void tcp_authopt_alg_put_tfm(struct tcp_authopt_alg_imp *alg, struct crypto_shash *tfm) +{ + WARN_ON(tfm != alg->tfm); + spin_unlock_bh(&alg->lock); +} + +static struct crypto_shash *tcp_authopt_get_kdf_shash(struct tcp_authopt_key_info *key) +{ + return tcp_authopt_alg_get_tfm(key->alg); +} + +static void tcp_authopt_put_kdf_shash(struct tcp_authopt_key_info *key, + struct crypto_shash *tfm) +{ + return tcp_authopt_alg_put_tfm(key->alg, tfm); +} + +static struct crypto_shash *tcp_authopt_get_mac_shash(struct tcp_authopt_key_info *key) +{ + return tcp_authopt_alg_get_tfm(key->alg); +} + +static void tcp_authopt_put_mac_shash(struct tcp_authopt_key_info *key, + struct crypto_shash *tfm) +{ + return tcp_authopt_alg_put_tfm(key->alg, tfm); +} + /* checks that ipv4 or ipv6 addr matches. */ static bool ipvx_addr_match(struct sockaddr_storage *a1, struct sockaddr_storage *a2) { if (a1->ss_family != a2->ss_family) @@ -118,17 +269,25 @@ int tcp_get_authopt_val(struct sock *sk, struct tcp_authopt *opt) opt->flags = info->flags & TCP_AUTHOPT_KNOWN_FLAGS; return 0; } +static void tcp_authopt_key_free_rcu(struct rcu_head *rcu) +{ + struct tcp_authopt_key_info *key = container_of(rcu, struct tcp_authopt_key_info, rcu); + + tcp_authopt_alg_release(key->alg); + kfree(key); +} + static void tcp_authopt_key_del(struct sock *sk, struct tcp_authopt_info *info, struct tcp_authopt_key_info *key) { hlist_del_rcu(&key->node); atomic_sub(sizeof(*key), &sk->sk_omem_alloc); - kfree_rcu(key, rcu); + call_rcu(&key->rcu, tcp_authopt_key_free_rcu); } /* free info and keys but don't touch tp->authopt_info */ static void __tcp_authopt_info_free(struct sock *sk, struct tcp_authopt_info *info) { @@ -160,10 +319,12 @@ void tcp_authopt_clear(struct sock *sk) int tcp_set_authopt_key(struct sock *sk, sockptr_t optval, unsigned int optlen) { struct tcp_authopt_key opt; struct tcp_authopt_info *info; struct tcp_authopt_key_info *key_info; + struct tcp_authopt_alg_imp *alg; + int err; sock_owned_by_me(sk); /* If userspace optlen is too short fill the rest with zeros */ if (optlen > sizeof(opt)) @@ -199,23 +360,35 @@ int tcp_set_authopt_key(struct sock *sk, sockptr_t optval, unsigned int optlen) /* Initialize tcp_authopt_info if not already set */ info = __tcp_authopt_info_get_or_create(sk); if (IS_ERR(info)) return PTR_ERR(info); + /* check the algorithm */ + alg = tcp_authopt_alg_get(opt.alg); + if (!alg) + return -EINVAL; + WARN_ON(alg->alg_id != opt.alg); + err = tcp_authopt_alg_require(alg); + if (err) + return err; + /* If an old key exists with exact ID then remove and replace. * RCU-protected readers might observe both and pick any. */ key_info = tcp_authopt_key_lookup_exact(sk, info, &opt); if (key_info) tcp_authopt_key_del(sk, info, key_info); key_info = sock_kmalloc(sk, sizeof(*key_info), GFP_KERNEL | __GFP_ZERO); - if (!key_info) + if (!key_info) { + tcp_authopt_alg_release(alg); return -ENOMEM; + } key_info->flags = opt.flags & TCP_AUTHOPT_KEY_KNOWN_FLAGS; key_info->send_id = opt.send_id; key_info->recv_id = opt.recv_id; key_info->alg_id = opt.alg; + key_info->alg = alg; key_info->keylen = opt.keylen; memcpy(key_info->key, opt.key, opt.keylen); memcpy(&key_info->addr, &opt.addr, sizeof(key_info->addr)); hlist_add_head_rcu(&key_info->node, &info->head);
The crypto_shash API is used in order to compute packet signatures. The API comes with several unfortunate limitations: 1) Allocating a crypto_shash can sleep and must be done in user context. 2) Packet signatures must be computed in softirq context 3) Packet signatures use dynamic "traffic keys" which require exclusive access to crypto_shash for crypto_setkey. The solution is to allocate one crypto_shash for each possible cpu for each algorithm at setsockopt time. The per-cpu tfm is then borrowed from softirq context, signatures are computed and the tfm is returned. The pool for each algorithm is reference counted, initialized at setsockopt time and released in tcp_authopt_key_info's rcu callback Signed-off-by: Leonard Crestez <cdleonard@gmail.com> --- include/net/tcp_authopt.h | 3 + net/ipv4/tcp_authopt.c | 177 +++++++++++++++++++++++++++++++++++++- 2 files changed, 178 insertions(+), 2 deletions(-)