| Message ID | 20240611222905.34695-2-kuniyu@amazon.com (mailing list archive) |
|---|---|
| State | Superseded |
| Delegated to | Netdev Maintainers |
| Series | af_unix: Remove spin_lock_nested() and convert to lock_cmp_fn. |
On Tue, Jun 11, 2024 at 03:28:55PM GMT, Kuniyuki Iwashima wrote:
> When created, an AF_UNIX socket is put into net->unx.table.buckets[],
> and the hash is stored in sk->sk_hash.
>
>   * unbound socket  : 0 <= sk_hash <= UNIX_HASH_MOD
>
> When bind() is called, the socket could be moved to another bucket.
>
>   * pathname socket : 0 <= sk_hash <= UNIX_HASH_MOD
>   * abstract socket : UNIX_HASH_MOD + 1 <= sk_hash <= UNIX_HASH_MOD * 2 + 1
>
> Then, we call unix_table_double_lock(), which locks a single bucket
> or two.
>
> Let's define the order with unix_table_lock_cmp_fn() instead of using
> spin_lock_nested().
>
> The locking is always done in ascending order of sk->sk_hash, which
> is the index into the buckets/locks arrays allocated by
> kvmalloc_array().
>
>   sk_hash_A < sk_hash_B
>   <=> &locks[sk_hash_A].dep_map < &locks[sk_hash_B].dep_map
>
> So, the ordering of two sk->sk_hash values can be derived from the
> addresses of their dep_map entries in the locks array.
>
> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>

Reviewed-by: Kent Overstreet <kent.overstreet@linux.dev>

> ---
>  net/unix/af_unix.c | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
> index 3821f8945b1e..22bb941f174e 100644
> --- a/net/unix/af_unix.c
> +++ b/net/unix/af_unix.c
> @@ -126,6 +126,15 @@ static spinlock_t bsd_socket_locks[UNIX_HASH_SIZE / 2];
>   * hash table is protected with spinlock.
>   * each socket state is protected by separate spinlock.
>   */
> +#ifdef CONFIG_PROVE_LOCKING
> +#define cmp_ptr(l, r) (((l) > (r)) - ((l) < (r)))
> +
> +static int unix_table_lock_cmp_fn(const struct lockdep_map *a,
> +				  const struct lockdep_map *b)
> +{
> +	return cmp_ptr(a, b);
> +}
> +#endif
>
>  static unsigned int unix_unbound_hash(struct sock *sk)
>  {
> @@ -168,7 +177,7 @@ static void unix_table_double_lock(struct net *net,
>  		swap(hash1, hash2);
>
>  	spin_lock(&net->unx.table.locks[hash1]);
> -	spin_lock_nested(&net->unx.table.locks[hash2], SINGLE_DEPTH_NESTING);
> +	spin_lock(&net->unx.table.locks[hash2]);
>  }
>
>  static void unix_table_double_unlock(struct net *net,
> @@ -3578,6 +3587,7 @@ static int __net_init unix_net_init(struct net *net)
>
>  	for (i = 0; i < UNIX_HASH_SIZE; i++) {
>  		spin_lock_init(&net->unx.table.locks[i]);
> +		lock_set_cmp_fn(&net->unx.table.locks[i], unix_table_lock_cmp_fn, NULL);
>  		INIT_HLIST_HEAD(&net->unx.table.buckets[i]);
>  	}
>
> --
> 2.30.2
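For readers unfamiliar with lockdep's lock_cmp_fn hook: the callback returns a negative, zero, or positive value to tell lockdep the required acquisition order of the two locks, and the cmp_ptr() macro in the patch produces exactly that from a pointer comparison. A minimal userspace sketch of the idiom (illustrative only, not kernel code):

```c
#include <stdio.h>

/* Same idiom as the patch: three-way comparison of two pointers,
 * yielding -1, 0, or 1 without any risk of integer overflow. */
#define cmp_ptr(l, r) (((l) > (r)) - ((l) < (r)))

int main(void)
{
	int locks[2];

	printf("%d\n", cmp_ptr(&locks[0], &locks[1])); /* -1: lower index first */
	printf("%d\n", cmp_ptr(&locks[1], &locks[0])); /*  1: reversed order    */
	printf("%d\n", cmp_ptr(&locks[0], &locks[0])); /*  0: same lock         */
	return 0;
}
```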
```diff
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 3821f8945b1e..22bb941f174e 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -126,6 +126,15 @@ static spinlock_t bsd_socket_locks[UNIX_HASH_SIZE / 2];
  * hash table is protected with spinlock.
  * each socket state is protected by separate spinlock.
  */
+#ifdef CONFIG_PROVE_LOCKING
+#define cmp_ptr(l, r) (((l) > (r)) - ((l) < (r)))
+
+static int unix_table_lock_cmp_fn(const struct lockdep_map *a,
+				  const struct lockdep_map *b)
+{
+	return cmp_ptr(a, b);
+}
+#endif
 
 static unsigned int unix_unbound_hash(struct sock *sk)
 {
@@ -168,7 +177,7 @@ static void unix_table_double_lock(struct net *net,
 		swap(hash1, hash2);
 
 	spin_lock(&net->unx.table.locks[hash1]);
-	spin_lock_nested(&net->unx.table.locks[hash2], SINGLE_DEPTH_NESTING);
+	spin_lock(&net->unx.table.locks[hash2]);
 }
 
 static void unix_table_double_unlock(struct net *net,
@@ -3578,6 +3587,7 @@ static int __net_init unix_net_init(struct net *net)
 
 	for (i = 0; i < UNIX_HASH_SIZE; i++) {
 		spin_lock_init(&net->unx.table.locks[i]);
+		lock_set_cmp_fn(&net->unx.table.locks[i], unix_table_lock_cmp_fn, NULL);
 		INIT_HLIST_HEAD(&net->unx.table.buckets[i]);
 	}
 
```
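The second hunk only shows the tail of unix_table_double_lock(). For context, a sketch of the whole function as it reads after this patch, reconstructed from the hunk above (the hash1 == hash2 early return is assumed from the surrounding code in net/unix/af_unix.c, not shown in the hunk):

```c
/* hash1 and hash2 are swapped into ascending order, so the two locks
 * are always taken in ascending array-index order -- i.e. ascending
 * dep_map address, which is exactly the order that
 * unix_table_lock_cmp_fn() reports to lockdep. */
static void unix_table_double_lock(struct net *net,
				   unsigned int hash1, unsigned int hash2)
{
	if (hash1 == hash2) {
		spin_lock(&net->unx.table.locks[hash1]);
		return;
	}

	if (hash1 > hash2)
		swap(hash1, hash2);

	spin_lock(&net->unx.table.locks[hash1]);
	spin_lock(&net->unx.table.locks[hash2]);
}
```

This is why the plain spin_lock() is now safe under lockdep: the comparator encodes the same ordering invariant that swap() already guarantees, so no SINGLE_DEPTH_NESTING annotation is needed.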
When created, an AF_UNIX socket is put into net->unx.table.buckets[],
and the hash is stored in sk->sk_hash.

  * unbound socket  : 0 <= sk_hash <= UNIX_HASH_MOD

When bind() is called, the socket could be moved to another bucket.

  * pathname socket : 0 <= sk_hash <= UNIX_HASH_MOD
  * abstract socket : UNIX_HASH_MOD + 1 <= sk_hash <= UNIX_HASH_MOD * 2 + 1

Then, we call unix_table_double_lock(), which locks a single bucket
or two.

Let's define the order with unix_table_lock_cmp_fn() instead of using
spin_lock_nested().

The locking is always done in ascending order of sk->sk_hash, which
is the index into the buckets/locks arrays allocated by
kvmalloc_array().

  sk_hash_A < sk_hash_B
  <=> &locks[sk_hash_A].dep_map < &locks[sk_hash_B].dep_map

So, the ordering of two sk->sk_hash values can be derived from the
addresses of their dep_map entries in the locks array.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 net/unix/af_unix.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)
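To make the sk_hash-to-dep_map correspondence in the message concrete, here is a hypothetical debug helper (not part of the patch) that spells out the invariant; it assumes CONFIG_DEBUG_LOCK_ALLOC is enabled so that spinlock_t carries a dep_map member:

```c
/* Hypothetical check, not in the patch: the locks live in one
 * contiguous kvmalloc_array() allocation, so ascending sk_hash
 * implies ascending &locks[hash].dep_map.  That is the property
 * unix_table_lock_cmp_fn() relies on when it compares the two
 * lockdep_map pointers. */
static void unix_table_assert_lock_order(struct net *net,
					 unsigned int hash1,
					 unsigned int hash2)
{
	if (hash1 < hash2)
		WARN_ON_ONCE(&net->unx.table.locks[hash1].dep_map >=
			     &net->unx.table.locks[hash2].dep_map);
}
```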