locking/local_lock, mm: Replace localtry_ helpers with local_trylock_t type

Message ID 20250401005134.14433-1-alexei.starovoitov@gmail.com (mailing list archive)
State New
Series locking/local_lock, mm: Replace localtry_ helpers with local_trylock_t type

Commit Message

Alexei Starovoitov April 1, 2025, 12:51 a.m. UTC
From: Alexei Starovoitov <ast@kernel.org>

Partially revert commit 0aaddfb06882 ("locking/local_lock: Introduce localtry_lock_t").
Remove the localtry_*() helpers, since the localtry_lock() name might
be misinterpreted as "try lock".

Introduce a local_trylock_irqsave() helper that only works
with the newly introduced local_trylock_t type.
Note that an attempt to use local_trylock_irqsave() with local_lock_t
will cause a compilation failure.

Usage and behavior in !PREEMPT_RT:

local_lock_t lock;                     // sizeof(lock) == 0
local_lock_irqsave(&lock, ...);        // irq save
if (local_trylock_irqsave(&lock, ...)) // compilation error

local_trylock_t lock;                  // sizeof(lock) == 4
local_lock_irqsave(&lock, ...);        // irq save and acquired = 1
if (local_trylock_irqsave(&lock, ...)) // if (!acquired) irq save

The existing local_lock_*() macros can be used with either
local_lock_t or local_trylock_t.
With local_trylock_t, they set acquired = 1, while local_unlock_*() clears it.
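
For illustration only (not part of this patch): a minimal sketch of a
per-CPU structure protected by local_trylock_t. The my_pcp structure,
its value field and update_value() are hypothetical; the initializer
mirrors what this patch does for memcg_stock:

#include <linux/local_lock.h>
#include <linux/percpu.h>

struct my_pcp {
	local_trylock_t lock;
	int value;
};
static DEFINE_PER_CPU(struct my_pcp, my_pcp) = {
	.lock = INIT_LOCAL_LOCK(lock),	/* layout-compatible initializer */
};

static void update_value(int delta)
{
	unsigned long flags;

	/* Existing macro; on local_trylock_t it also sets acquired = 1. */
	local_lock_irqsave(&my_pcp.lock, flags);
	this_cpu_add(my_pcp.value, delta);
	/* Clears acquired before restoring interrupts. */
	local_unlock_irqrestore(&my_pcp.lock, flags);
}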

In !PREEMPT_RT, local_lock_irqsave(local_lock_t *) disables interrupts
to protect the critical section, but it doesn't prevent NMIs, so fully
reentrant code cannot use local_lock_irqsave(local_lock_t *) for
exclusive access.

The local_lock_irqsave(local_trylock_t *) helper disables interrupts
and sets acquired = 1, so a local_trylock_irqsave(local_trylock_t *) from
an NMI attempting to acquire the same lock will return false.
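
For example, an NMI path can then detect contention and back off
(illustrative sketch, reusing the hypothetical my_pcp above):

static bool try_update_value_in_nmi(int delta)
{
	unsigned long flags;

	/* Fails if this CPU already holds the lock (acquired == 1). */
	if (!local_trylock_irqsave(&my_pcp.lock, flags))
		return false;
	this_cpu_add(my_pcp.value, delta);
	local_unlock_irqrestore(&my_pcp.lock, flags);
	return true;
}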

In PREEMPT_RT, local_lock_irqsave() maps to a preemptible spin_lock().
Map local_trylock_irqsave() to a preemptible spin_trylock().
When in hard IRQ or NMI context, return false right away, since
spin_trylock() is not safe there due to the explicit locking in the
underlying rt_spin_trylock() implementation. Removing this explicit
locking and attempting only the "trylock" is undesired due to PI
implications.

Note there is no need to use local_inc for the 'acquired' variable,
since it's a percpu variable with strict nesting scopes.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 include/linux/local_lock.h          |  58 +--------
 include/linux/local_lock_internal.h | 193 ++++++++++------------------
 mm/memcontrol.c                     |  39 +++---
 3 files changed, 95 insertions(+), 195 deletions(-)

Comments

Vlastimil Babka April 1, 2025, 2:18 p.m. UTC | #1
On 4/1/25 02:51, Alexei Starovoitov wrote:
> From: Alexei Starovoitov <ast@kernel.org>
> 
> Partially revert commit 0aaddfb06882 ("locking/local_lock: Introduce localtry_lock_t").
> Remove the localtry_*() helpers, since the localtry_lock() name might
> be misinterpreted as "try lock".
> 
> Introduce a local_trylock_irqsave() helper that only works

Introduce local_trylock[_irqsave]() helpers that only work

?

> with the newly introduced local_trylock_t type.
> Note that an attempt to use local_trylock_irqsave() with local_lock_t
> will cause a compilation failure.
> 
> Usage and behavior in !PREEMPT_RT:
> 
> local_lock_t lock;                     // sizeof(lock) == 0

local_lock(&lock, ...);			// preempt disable

> local_lock_irqsave(&lock, ...);        // irq save
> if (local_trylock_irqsave(&lock, ...)) // compilation error
> 
> local_trylock_t lock;                  // sizeof(lock) == 4

ditto

> local_lock_irqsave(&lock, ...);        // irq save and acquired = 1
> if (local_trylock_irqsave(&lock, ...)) // if (!acquired) irq save
> 
> The existing local_lock_*() macros can be used with either
> local_lock_t or local_trylock_t.
> With local_trylock_t, they set acquired = 1, while local_unlock_*() clears it.
> 
> In !PREEMPT_RT, local_lock_irqsave(local_lock_t *) disables interrupts
> to protect the critical section, but it doesn't prevent NMIs, so fully
> reentrant code cannot use local_lock_irqsave(local_lock_t *) for
> exclusive access.
> 
> The local_lock_irqsave(local_trylock_t *) helper disables interrupts
> and sets acquired = 1, so a local_trylock_irqsave(local_trylock_t *) from
> an NMI attempting to acquire the same lock will return false.
> 
> In PREEMPT_RT, local_lock_irqsave() maps to a preemptible spin_lock().
> Map local_trylock_irqsave() to a preemptible spin_trylock().
> When in hard IRQ or NMI context, return false right away, since
> spin_trylock() is not safe there due to the explicit locking in the
> underlying rt_spin_trylock() implementation. Removing this explicit
> locking and attempting only the "trylock" is undesired due to PI
> implications.

And something like:

The local_trylock() variant without _irqsave can be used to avoid the
cost of disabling/enabling interrupts by only disabling preemption, so
a local_trylock() in an interrupt attempting to acquire the same
lock will return false.
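
An illustrative sketch of that pattern (hypothetical my_pcp structure
as in the commit message above, using the local_trylock(lock) signature
from the fixup below):

if (local_trylock(&my_pcp.lock)) {
	/* Preemption disabled, interrupts left enabled. */
	this_cpu_add(my_pcp.value, 1);
	local_unlock(&my_pcp.lock);
}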

> Note there is no need to use local_inc for the 'acquired' variable,
> since it's a percpu variable with strict nesting scopes.
> 
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Acked-by: Vlastimil Babka <vbabka@suse.cz>

Is there a chance this refactoring will make it to -rc1? It would make
basing the further usage of the lock in mm and slab trees much easier.

But squash in the following fixups please:
----8<----
From bc9098ebb58a2958010428c9294547934852ffa2 Mon Sep 17 00:00:00 2001
From: Vlastimil Babka <vbabka@suse.cz>
Date: Tue, 1 Apr 2025 15:25:21 +0200
Subject: [PATCH] fixup! locking/local_lock, mm: Replace localtry_ helpers with
 local_trylock_t type

---
 include/linux/local_lock.h          |  5 ++---
 include/linux/local_lock_internal.h | 21 +++++++++++++++++++--
 2 files changed, 21 insertions(+), 5 deletions(-)

diff --git a/include/linux/local_lock.h b/include/linux/local_lock.h
index 9262109cca51..7ac9385cd475 100644
--- a/include/linux/local_lock.h
+++ b/include/linux/local_lock.h
@@ -52,15 +52,14 @@
 	__local_unlock_irqrestore(lock, flags)
 
 /**
- * local_trylock_irqsave - Try to acquire a per CPU local lock
+ * local_trylock - Try to acquire a per CPU local lock
  * @lock:	The lock variable
- * @flags:	Storage for interrupt flags
  *
  * The function can be used in any context such as NMI or HARDIRQ. Due to
  * locking constrains it will _always_ fail to acquire the lock in NMI or
  * HARDIRQ context on PREEMPT_RT.
  */
-#define local_trylock(lock, flags)	__local_trylock(lock, flags)
+#define local_trylock(lock)		__local_trylock(lock)
 
 /**
  * local_trylock_irqsave - Try to acquire a per CPU local lock, save and disable
diff --git a/include/linux/local_lock_internal.h b/include/linux/local_lock_internal.h
index cc79854206df..5634383c8e9e 100644
--- a/include/linux/local_lock_internal.h
+++ b/include/linux/local_lock_internal.h
@@ -23,7 +23,7 @@ typedef struct {
 #endif
 	/*
 	 * Same layout as local_lock_t with 'acquired' field at the end.
-	 * (local_trylock_t *) will be casted to (local_lock_t *).
+	 * (local_trylock_t *) will be cast to (local_lock_t *).
 	 */
 	int acquired;
 } local_trylock_t;
@@ -80,7 +80,7 @@ do {								\
 	lockdep_init_map_type(&(lock)->dep_map, #lock, &__key,  \
 			      0, LD_WAIT_CONFIG, LD_WAIT_INV,	\
 			      LD_LOCK_PERCPU);			\
-	local_lock_debug_init(lock);				\
+	local_lock_debug_init((local_lock_t *)lock);		\
 } while (0)
 
 #define __spinlock_nested_bh_init(lock)				\
@@ -128,6 +128,23 @@ do {								\
 		__local_lock_acquire(lock);			\
 	} while (0)
 
+#define __local_trylock(lock)					\
+	({							\
+		local_trylock_t *tl;				\
+								\
+		preempt_disable();				\
+		tl = this_cpu_ptr(lock);			\
+		if (READ_ONCE(tl->acquired) == 1) {		\
+			preempt_enable();			\
+			tl = NULL;				\
+		} else {					\
+			WRITE_ONCE(tl->acquired, 1);		\
+			local_trylock_acquire(			\
+				(local_lock_t *)tl);		\
+		}						\
+		!!tl;						\
+	})
+
 #define __local_trylock_irqsave(lock, flags)			\
 	({							\
 		local_trylock_t *tl;				\
Alexei Starovoitov April 1, 2025, 8:55 p.m. UTC | #2
On Tue, Apr 1, 2025 at 7:18 AM Vlastimil Babka <vbabka@suse.cz> wrote:
>
> > with the newly introduced local_trylock_t type.
> > Note that an attempt to use local_trylock_irqsave() with local_lock_t
> > will cause a compilation failure.
> >
> > Usage and behavior in !PREEMPT_RT:
> >
> > local_lock_t lock;                     // sizeof(lock) == 0
>
> local_lock(&lock, ...);                 // preempt disable

changed to local_lock(&lock);

>
> Acked-by: Vlastimil Babka <vbabka@suse.cz>
>
> Is there a chance this refactoring will make it to -rc1? It would make
> basing the further usage of the lock in mm and slab trees much easier.

+1

> But squash in the following fixups please:

Thanks a bunch. Folded.

And sent v2:
https://lore.kernel.org/bpf/20250401205245.70838-1-alexei.starovoitov@gmail.com/

As soon as Sebastian acks it, I can send a bpf PR with these 3 fixes
and other bpf fixes.
Sebastian Andrzej Siewior April 2, 2025, 7:30 a.m. UTC | #3
On 2025-03-31 17:51:34 [-0700], Alexei Starovoitov wrote:
> From: Alexei Starovoitov <ast@kernel.org>
> 
> Partially revert commit 0aaddfb06882 ("locking/local_lock: Introduce localtry_lock_t").
> Remove the localtry_*() helpers, since the localtry_lock() name might
> be misinterpreted as "try lock".

So we're back to what you suggested initially. I was more a fan of
explicitly naming things, but if this is misleading, so be it. So
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

While at it, could you look at the hunk below and check if it is worth it?
The struct duplication, with the hope that the first part remains the
same, is just hoping. This still relies on the first part remaining the same but…

Sebastian

diff --git a/include/linux/local_lock_internal.h b/include/linux/local_lock_internal.h
index cc79854206dff..dfdeded54348d 100644
--- a/include/linux/local_lock_internal.h
+++ b/include/linux/local_lock_internal.h
@@ -17,15 +17,8 @@ typedef struct {
 
 /* local_trylock() and local_trylock_irqsave() only work with local_trylock_t */
 typedef struct {
-#ifdef CONFIG_DEBUG_LOCK_ALLOC
-	struct lockdep_map	dep_map;
-	struct task_struct	*owner;
-#endif
-	/*
-	 * Same layout as local_lock_t with 'acquired' field at the end.
-	 * (local_trylock_t *) will be casted to (local_lock_t *).
-	 */
-	int acquired;
+	local_lock_t	llock;
+	int		acquired;
 } local_trylock_t;
 
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
@@ -37,6 +30,9 @@ typedef struct {
 	},						\
 	.owner = NULL,
 
+# define LOCAL_TRYLOCK_DEBUG_INIT(lockname)		\
+	.llock = { LOCAL_LOCK_DEBUG_INIT((lockname).llock) },
+
 static inline void local_lock_acquire(local_lock_t *l)
 {
 	lock_map_acquire(&l->dep_map);
@@ -64,6 +60,7 @@ static inline void local_lock_debug_init(local_lock_t *l)
 }
 #else /* CONFIG_DEBUG_LOCK_ALLOC */
 # define LOCAL_LOCK_DEBUG_INIT(lockname)
+# define LOCAL_TRYLOCK_DEBUG_INIT(lockname)
 static inline void local_lock_acquire(local_lock_t *l) { }
 static inline void local_trylock_acquire(local_lock_t *l) { }
 static inline void local_lock_release(local_lock_t *l) { }
@@ -71,6 +68,7 @@ static inline void local_lock_debug_init(local_lock_t *l) { }
 #endif /* !CONFIG_DEBUG_LOCK_ALLOC */
 
 #define INIT_LOCAL_LOCK(lockname)	{ LOCAL_LOCK_DEBUG_INIT(lockname) }
+#define INIT_LOCAL_TRYLOCK(lockname)	{ LOCAL_TRYLOCK_DEBUG_INIT(lockname) }
 
 #define __local_lock_init(lock)					\
 do {								\
@@ -198,6 +196,7 @@ typedef spinlock_t local_lock_t;
 typedef spinlock_t local_trylock_t;
 
 #define INIT_LOCAL_LOCK(lockname) __LOCAL_SPIN_LOCK_UNLOCKED((lockname))
+#define INIT_LOCAL_TRYLOCK(lockname) __LOCAL_SPIN_LOCK_UNLOCKED((lockname))
 
 #define __local_lock_init(l)					\
 	do {							\
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 813f5b73e7c8c..c96c1f2b9cf57 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1774,7 +1774,7 @@ struct memcg_stock_pcp {
 #define FLUSHING_CACHED_CHARGE	0
 };
 static DEFINE_PER_CPU(struct memcg_stock_pcp, memcg_stock) = {
-	.stock_lock = INIT_LOCAL_LOCK(stock_lock),
+	.stock_lock = INIT_LOCAL_TRYLOCK(stock_lock),
 };
 static DEFINE_MUTEX(percpu_charge_mutex);
Vlastimil Babka April 2, 2025, 9:02 a.m. UTC | #4
On 4/2/25 09:30, Sebastian Andrzej Siewior wrote:
> On 2025-03-31 17:51:34 [-0700], Alexei Starovoitov wrote:
>> From: Alexei Starovoitov <ast@kernel.org>
>> 
>> Partially revert commit 0aaddfb06882 ("locking/local_lock: Introduce localtry_lock_t").
>> Remove the localtry_*() helpers, since the localtry_lock() name might
>> be misinterpreted as "try lock".
> 
> So we back to what you suggested initially. I was more a fan of
> explicitly naming things but if this is misleading so be it. So
> 
> Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> 
> While at it, could you look at the hunk below and check if it worth it?
> The struct duplication and hoping that the first part remains the same,
> is hoping. This still relies that the first part remains the same but…

I've updated your fixups to apply to v2
https://lore.kernel.org/all/20250401205245.70838-1-alexei.starovoitov@gmail.com/

and to support runtime local_trylock_init(); it's at the end of my e-mail.

But I also thought we could go all the way with removing the casting
that way and stop relying on the same layout implicitly.

So I rewrote this:

#define __local_lock_acquire(lock)                                      \
        do {                                                            \
                local_trylock_t *tl;                                    \
                local_lock_t *l;                                        \
                                                                        \
                _Generic((lock),                                        \
                        local_lock_t *: ({                      	\
                                l = this_cpu_ptr(lock);                 \
                        }),                                             \
                        local_trylock_t *: ({                   	\
                                tl = this_cpu_ptr(lock);                \
                                l = &tl->llock;                         \
                                lockdep_assert(tl->acquired == 0);      \
                                WRITE_ONCE(tl->acquired, 1);            \
                        }),                                             \
                        default:(void)0);                               \
                local_lock_acquire(l);                                  \
        } while (0)

But I'm getting weird errors:

./include/linux/local_lock_internal.h:107:36: error: assignment to ‘local_trylock_t *’ from incompatible pointer type ‘local_lock_t *’ [-Wincompatible-pointer-types]
  107 |                                 tl = this_cpu_ptr(lock);                \

coming from the guard expansions. I don't understand why it goes to the
_Generic() "branch" of local_trylock_t * with a local_lock_t *.

----8<----
From eeeb928ccc6d86e93cb573fb93ce4f6aeb8576fb Mon Sep 17 00:00:00 2001
From: Vlastimil Babka <vbabka@suse.cz>
Date: Wed, 2 Apr 2025 10:13:28 +0200
Subject: [PATCH] fixup! locking/local_lock, mm: Replace localtry_ helpers with
 local_trylock_t type

---
 include/linux/local_lock.h          |  5 +++++
 include/linux/local_lock_internal.h | 23 +++++++++++++----------
 mm/memcontrol.c                     |  2 +-
 3 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/include/linux/local_lock.h b/include/linux/local_lock.h
index 7ac9385cd475..16a2ee4f8310 100644
--- a/include/linux/local_lock.h
+++ b/include/linux/local_lock.h
@@ -51,6 +51,11 @@
 #define local_unlock_irqrestore(lock, flags)			\
 	__local_unlock_irqrestore(lock, flags)
 
+/**
+ * local_lock_init - Runtime initialize a lock instance
+ */
+#define local_trylock_init(lock)	__local_trylock_init(lock)
+
 /**
  * local_trylock - Try to acquire a per CPU local lock
  * @lock:	The lock variable
diff --git a/include/linux/local_lock_internal.h b/include/linux/local_lock_internal.h
index 2389ae4f69a6..6ccb2c4ef86f 100644
--- a/include/linux/local_lock_internal.h
+++ b/include/linux/local_lock_internal.h
@@ -17,15 +17,8 @@ typedef struct {
 
 /* local_trylock() and local_trylock_irqsave() only work with local_trylock_t */
 typedef struct {
-#ifdef CONFIG_DEBUG_LOCK_ALLOC
-	struct lockdep_map	dep_map;
-	struct task_struct	*owner;
-#endif
-	/*
-	 * Same layout as local_lock_t with 'acquired' field at the end.
-	 * (local_trylock_t *) will be cast to (local_lock_t *).
-	 */
-	int acquired;
+	local_lock_t	llock;
+	int		acquired;
 } local_trylock_t;
 
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
@@ -37,6 +30,9 @@ typedef struct {
 	},						\
 	.owner = NULL,
 
+# define LOCAL_TRYLOCK_DEBUG_INIT(lockname)		\
+	.llock = { LOCAL_LOCK_DEBUG_INIT((lockname).llock) },
+
 static inline void local_lock_acquire(local_lock_t *l)
 {
 	lock_map_acquire(&l->dep_map);
@@ -64,6 +60,7 @@ static inline void local_lock_debug_init(local_lock_t *l)
 }
 #else /* CONFIG_DEBUG_LOCK_ALLOC */
 # define LOCAL_LOCK_DEBUG_INIT(lockname)
+# define LOCAL_TRYLOCK_DEBUG_INIT(lockname)
 static inline void local_lock_acquire(local_lock_t *l) { }
 static inline void local_trylock_acquire(local_lock_t *l) { }
 static inline void local_lock_release(local_lock_t *l) { }
@@ -71,6 +68,7 @@ static inline void local_lock_debug_init(local_lock_t *l) { }
 #endif /* !CONFIG_DEBUG_LOCK_ALLOC */
 
 #define INIT_LOCAL_LOCK(lockname)	{ LOCAL_LOCK_DEBUG_INIT(lockname) }
+#define INIT_LOCAL_TRYLOCK(lockname)	{ LOCAL_TRYLOCK_DEBUG_INIT(lockname) }
 
 #define __local_lock_init(lock)					\
 do {								\
@@ -80,9 +78,11 @@ do {								\
 	lockdep_init_map_type(&(lock)->dep_map, #lock, &__key,  \
 			      0, LD_WAIT_CONFIG, LD_WAIT_INV,	\
 			      LD_LOCK_PERCPU);			\
-	local_lock_debug_init((local_lock_t *)lock);		\
+	local_lock_debug_init(lock);				\
 } while (0)
 
+#define __local_trylock_init(lock) __local_lock_init(lock.llock)
+
 #define __spinlock_nested_bh_init(lock)				\
 do {								\
 	static struct lock_class_key __key;			\
@@ -215,12 +215,15 @@ typedef spinlock_t local_lock_t;
 typedef spinlock_t local_trylock_t;
 
 #define INIT_LOCAL_LOCK(lockname) __LOCAL_SPIN_LOCK_UNLOCKED((lockname))
+#define INIT_LOCAL_TRYLOCK(lockname) __LOCAL_SPIN_LOCK_UNLOCKED((lockname))
 
 #define __local_lock_init(l)					\
 	do {							\
 		local_spin_lock_init((l));			\
 	} while (0)
 
+#define __local_trylock_init(l)			__local_lock_init(l)
+
 #define __local_lock(__lock)					\
 	do {							\
 		migrate_disable();				\
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index bca86961754e..0401fb7b6c6a 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1754,7 +1754,7 @@ struct memcg_stock_pcp {
 #define FLUSHING_CACHED_CHARGE	0
 };
 static DEFINE_PER_CPU(struct memcg_stock_pcp, memcg_stock) = {
-	.stock_lock = INIT_LOCAL_LOCK(stock_lock),
+	.stock_lock = INIT_LOCAL_TRYLOCK(stock_lock),
 };
 static DEFINE_MUTEX(percpu_charge_mutex);

Patch

diff --git a/include/linux/local_lock.h b/include/linux/local_lock.h
index 1a0bc35839e3..9262109cca51 100644
--- a/include/linux/local_lock.h
+++ b/include/linux/local_lock.h
@@ -52,44 +52,19 @@ 
 	__local_unlock_irqrestore(lock, flags)
 
 /**
- * localtry_lock_init - Runtime initialize a lock instance
- */
-#define localtry_lock_init(lock)		__localtry_lock_init(lock)
-
-/**
- * localtry_lock - Acquire a per CPU local lock
- * @lock:	The lock variable
- */
-#define localtry_lock(lock)		__localtry_lock(lock)
-
-/**
- * localtry_lock_irq - Acquire a per CPU local lock and disable interrupts
- * @lock:	The lock variable
- */
-#define localtry_lock_irq(lock)		__localtry_lock_irq(lock)
-
-/**
- * localtry_lock_irqsave - Acquire a per CPU local lock, save and disable
- *			 interrupts
+ * local_trylock_irqsave - Try to acquire a per CPU local lock
  * @lock:	The lock variable
  * @flags:	Storage for interrupt flags
- */
-#define localtry_lock_irqsave(lock, flags)				\
-	__localtry_lock_irqsave(lock, flags)
-
-/**
- * localtry_trylock - Try to acquire a per CPU local lock.
- * @lock:	The lock variable
  *
  * The function can be used in any context such as NMI or HARDIRQ. Due to
  * locking constrains it will _always_ fail to acquire the lock in NMI or
  * HARDIRQ context on PREEMPT_RT.
  */
-#define localtry_trylock(lock)		__localtry_trylock(lock)
+#define local_trylock(lock, flags)	__local_trylock(lock, flags)
 
 /**
- * localtry_trylock_irqsave - Try to acquire a per CPU local lock, save and disable
- *			      interrupts if acquired
+ * local_trylock_irqsave - Try to acquire a per CPU local lock, save and disable
+ *			   interrupts if acquired
  * @lock:	The lock variable
  * @flags:	Storage for interrupt flags
  *
@@ -97,29 +72,8 @@ 
  * locking constrains it will _always_ fail to acquire the lock in NMI or
  * HARDIRQ context on PREEMPT_RT.
  */
-#define localtry_trylock_irqsave(lock, flags)				\
-	__localtry_trylock_irqsave(lock, flags)
-
-/**
- * local_unlock - Release a per CPU local lock
- * @lock:	The lock variable
- */
-#define localtry_unlock(lock)		__localtry_unlock(lock)
-
-/**
- * local_unlock_irq - Release a per CPU local lock and enable interrupts
- * @lock:	The lock variable
- */
-#define localtry_unlock_irq(lock)		__localtry_unlock_irq(lock)
-
-/**
- * localtry_unlock_irqrestore - Release a per CPU local lock and restore
- *			      interrupt flags
- * @lock:	The lock variable
- * @flags:      Interrupt flags to restore
- */
-#define localtry_unlock_irqrestore(lock, flags)			\
-	__localtry_unlock_irqrestore(lock, flags)
+#define local_trylock_irqsave(lock, flags)			\
+	__local_trylock_irqsave(lock, flags)
 
 DEFINE_GUARD(local_lock, local_lock_t __percpu*,
 	     local_lock(_T),
diff --git a/include/linux/local_lock_internal.h b/include/linux/local_lock_internal.h
index 67bd13d142fa..cc79854206df 100644
--- a/include/linux/local_lock_internal.h
+++ b/include/linux/local_lock_internal.h
@@ -15,10 +15,18 @@  typedef struct {
 #endif
 } local_lock_t;
 
+/* local_trylock() and local_trylock_irqsave() only work with local_trylock_t */
 typedef struct {
-	local_lock_t	llock;
-	unsigned int	acquired;
-} localtry_lock_t;
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+	struct lockdep_map	dep_map;
+	struct task_struct	*owner;
+#endif
+	/*
+	 * Same layout as local_lock_t with 'acquired' field at the end.
+	 * (local_trylock_t *) will be casted to (local_lock_t *).
+	 */
+	int acquired;
+} local_trylock_t;
 
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 # define LOCAL_LOCK_DEBUG_INIT(lockname)		\
@@ -63,7 +71,6 @@  static inline void local_lock_debug_init(local_lock_t *l) { }
 #endif /* !CONFIG_DEBUG_LOCK_ALLOC */
 
 #define INIT_LOCAL_LOCK(lockname)	{ LOCAL_LOCK_DEBUG_INIT(lockname) }
-#define INIT_LOCALTRY_LOCK(lockname)	{ .llock = { LOCAL_LOCK_DEBUG_INIT(lockname.llock) }}
 
 #define __local_lock_init(lock)					\
 do {								\
@@ -87,39 +94,88 @@  do {								\
 	local_lock_debug_init(lock);				\
 } while (0)
 
+#define __local_lock_acquire(lock)					\
+	do {								\
+		local_trylock_t *tl;					\
+		local_lock_t *l;					\
+									\
+		l = (local_lock_t *)this_cpu_ptr(lock);			\
+		tl = (local_trylock_t *)l;				\
+		_Generic((lock),					\
+			local_trylock_t *: ({				\
+				lockdep_assert(tl->acquired == 0);	\
+				WRITE_ONCE(tl->acquired, 1);		\
+			}),						\
+			default:(void)0);				\
+		local_lock_acquire(l);					\
+	} while (0)
+
 #define __local_lock(lock)					\
 	do {							\
 		preempt_disable();				\
-		local_lock_acquire(this_cpu_ptr(lock));		\
+		__local_lock_acquire(lock);			\
 	} while (0)
 
 #define __local_lock_irq(lock)					\
 	do {							\
 		local_irq_disable();				\
-		local_lock_acquire(this_cpu_ptr(lock));		\
+		__local_lock_acquire(lock);			\
 	} while (0)
 
 #define __local_lock_irqsave(lock, flags)			\
 	do {							\
 		local_irq_save(flags);				\
-		local_lock_acquire(this_cpu_ptr(lock));		\
+		__local_lock_acquire(lock);			\
+	} while (0)
+
+#define __local_trylock_irqsave(lock, flags)			\
+	({							\
+		local_trylock_t *tl;				\
+								\
+		local_irq_save(flags);				\
+		tl = this_cpu_ptr(lock);			\
+		if (READ_ONCE(tl->acquired) == 1) {		\
+			local_irq_restore(flags);		\
+			tl = NULL;				\
+		} else {					\
+			WRITE_ONCE(tl->acquired, 1);		\
+			local_trylock_acquire(			\
+				(local_lock_t *)tl);		\
+		}						\
+		!!tl;						\
+	})
+
+#define __local_lock_release(lock)					\
+	do {								\
+		local_trylock_t *tl;					\
+		local_lock_t *l;					\
+									\
+		l = (local_lock_t *)this_cpu_ptr(lock);			\
+		tl = (local_trylock_t *)l;				\
+		_Generic((lock),					\
+			local_trylock_t *: ({				\
+				lockdep_assert(tl->acquired == 1);	\
+				WRITE_ONCE(tl->acquired, 0);		\
+			}),						\
+			default:(void)0);				\
+		local_lock_release(l);					\
 	} while (0)
 
 #define __local_unlock(lock)					\
 	do {							\
-		local_lock_release(this_cpu_ptr(lock));		\
+		__local_lock_release(lock);			\
 		preempt_enable();				\
 	} while (0)
 
 #define __local_unlock_irq(lock)				\
 	do {							\
-		local_lock_release(this_cpu_ptr(lock));		\
+		__local_lock_release(lock);			\
 		local_irq_enable();				\
 	} while (0)
 
 #define __local_unlock_irqrestore(lock, flags)			\
 	do {							\
-		local_lock_release(this_cpu_ptr(lock));		\
+		__local_lock_release(lock);			\
 		local_irq_restore(flags);			\
 	} while (0)
 
@@ -132,104 +188,6 @@  do {								\
 #define __local_unlock_nested_bh(lock)				\
 	local_lock_release(this_cpu_ptr(lock))
 
-/* localtry_lock_t variants */
-
-#define __localtry_lock_init(lock)				\
-do {								\
-	__local_lock_init(&(lock)->llock);			\
-	WRITE_ONCE((lock)->acquired, 0);			\
-} while (0)
-
-#define __localtry_lock(lock)					\
-	do {							\
-		localtry_lock_t *lt;				\
-		preempt_disable();				\
-		lt = this_cpu_ptr(lock);			\
-		local_lock_acquire(&lt->llock);			\
-		WRITE_ONCE(lt->acquired, 1);			\
-	} while (0)
-
-#define __localtry_lock_irq(lock)				\
-	do {							\
-		localtry_lock_t *lt;				\
-		local_irq_disable();				\
-		lt = this_cpu_ptr(lock);			\
-		local_lock_acquire(&lt->llock);			\
-		WRITE_ONCE(lt->acquired, 1);			\
-	} while (0)
-
-#define __localtry_lock_irqsave(lock, flags)			\
-	do {							\
-		localtry_lock_t *lt;				\
-		local_irq_save(flags);				\
-		lt = this_cpu_ptr(lock);			\
-		local_lock_acquire(&lt->llock);			\
-		WRITE_ONCE(lt->acquired, 1);			\
-	} while (0)
-
-#define __localtry_trylock(lock)				\
-	({							\
-		localtry_lock_t *lt;				\
-		bool _ret;					\
-								\
-		preempt_disable();				\
-		lt = this_cpu_ptr(lock);			\
-		if (!READ_ONCE(lt->acquired)) {			\
-			WRITE_ONCE(lt->acquired, 1);		\
-			local_trylock_acquire(&lt->llock);	\
-			_ret = true;				\
-		} else {					\
-			_ret = false;				\
-			preempt_enable();			\
-		}						\
-		_ret;						\
-	})
-
-#define __localtry_trylock_irqsave(lock, flags)			\
-	({							\
-		localtry_lock_t *lt;				\
-		bool _ret;					\
-								\
-		local_irq_save(flags);				\
-		lt = this_cpu_ptr(lock);			\
-		if (!READ_ONCE(lt->acquired)) {			\
-			WRITE_ONCE(lt->acquired, 1);		\
-			local_trylock_acquire(&lt->llock);	\
-			_ret = true;				\
-		} else {					\
-			_ret = false;				\
-			local_irq_restore(flags);		\
-		}						\
-		_ret;						\
-	})
-
-#define __localtry_unlock(lock)					\
-	do {							\
-		localtry_lock_t *lt;				\
-		lt = this_cpu_ptr(lock);			\
-		WRITE_ONCE(lt->acquired, 0);			\
-		local_lock_release(&lt->llock);			\
-		preempt_enable();				\
-	} while (0)
-
-#define __localtry_unlock_irq(lock)				\
-	do {							\
-		localtry_lock_t *lt;				\
-		lt = this_cpu_ptr(lock);			\
-		WRITE_ONCE(lt->acquired, 0);			\
-		local_lock_release(&lt->llock);			\
-		local_irq_enable();				\
-	} while (0)
-
-#define __localtry_unlock_irqrestore(lock, flags)		\
-	do {							\
-		localtry_lock_t *lt;				\
-		lt = this_cpu_ptr(lock);			\
-		WRITE_ONCE(lt->acquired, 0);			\
-		local_lock_release(&lt->llock);			\
-		local_irq_restore(flags);			\
-	} while (0)
-
 #else /* !CONFIG_PREEMPT_RT */
 
 /*
@@ -237,10 +195,9 @@  do {								\
  * critical section while staying preemptible.
  */
 typedef spinlock_t local_lock_t;
-typedef spinlock_t localtry_lock_t;
+typedef spinlock_t local_trylock_t;
 
 #define INIT_LOCAL_LOCK(lockname) __LOCAL_SPIN_LOCK_UNLOCKED((lockname))
-#define INIT_LOCALTRY_LOCK(lockname) INIT_LOCAL_LOCK(lockname)
 
 #define __local_lock_init(l)					\
 	do {							\
@@ -283,17 +240,7 @@  do {								\
 	spin_unlock(this_cpu_ptr((lock)));			\
 } while (0)
 
-/* localtry_lock_t variants */
-
-#define __localtry_lock_init(lock)			__local_lock_init(lock)
-#define __localtry_lock(lock)				__local_lock(lock)
-#define __localtry_lock_irq(lock)			__local_lock(lock)
-#define __localtry_lock_irqsave(lock, flags)		__local_lock_irqsave(lock, flags)
-#define __localtry_unlock(lock)				__local_unlock(lock)
-#define __localtry_unlock_irq(lock)			__local_unlock(lock)
-#define __localtry_unlock_irqrestore(lock, flags)	__local_unlock_irqrestore(lock, flags)
-
-#define __localtry_trylock(lock)				\
+#define __local_trylock(lock)					\
 	({							\
 		int __locked;					\
 								\
@@ -308,11 +255,11 @@  do {								\
 		__locked;					\
 	})
 
-#define __localtry_trylock_irqsave(lock, flags)			\
+#define __local_trylock_irqsave(lock, flags)			\
 	({							\
 		typecheck(unsigned long, flags);		\
 		flags = 0;					\
-		__localtry_trylock(lock);			\
+		__local_trylock(lock);				\
 	})
 
 #endif /* CONFIG_PREEMPT_RT */
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 83c2df73e4b6..bca86961754e 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1739,7 +1739,7 @@  void mem_cgroup_print_oom_group(struct mem_cgroup *memcg)
 }
 
 struct memcg_stock_pcp {
-	localtry_lock_t stock_lock;
+	local_trylock_t stock_lock;
 	struct mem_cgroup *cached; /* this never be root cgroup */
 	unsigned int nr_pages;
 
@@ -1754,7 +1754,7 @@  struct memcg_stock_pcp {
 #define FLUSHING_CACHED_CHARGE	0
 };
 static DEFINE_PER_CPU(struct memcg_stock_pcp, memcg_stock) = {
-	.stock_lock = INIT_LOCALTRY_LOCK(stock_lock),
+	.stock_lock = INIT_LOCAL_LOCK(stock_lock),
 };
 static DEFINE_MUTEX(percpu_charge_mutex);
 
@@ -1785,11 +1785,10 @@  static bool consume_stock(struct mem_cgroup *memcg, unsigned int nr_pages,
 	if (nr_pages > MEMCG_CHARGE_BATCH)
 		return ret;
 
-	if (!localtry_trylock_irqsave(&memcg_stock.stock_lock, flags)) {
-		if (!gfpflags_allow_spinning(gfp_mask))
-			return ret;
-		localtry_lock_irqsave(&memcg_stock.stock_lock, flags);
-	}
+	if (gfpflags_allow_spinning(gfp_mask))
+		local_lock_irqsave(&memcg_stock.stock_lock, flags);
+	else if (!local_trylock_irqsave(&memcg_stock.stock_lock, flags))
+		return ret;
 
 	stock = this_cpu_ptr(&memcg_stock);
 	stock_pages = READ_ONCE(stock->nr_pages);
@@ -1798,7 +1797,7 @@  static bool consume_stock(struct mem_cgroup *memcg, unsigned int nr_pages,
 		ret = true;
 	}
 
-	localtry_unlock_irqrestore(&memcg_stock.stock_lock, flags);
+	local_unlock_irqrestore(&memcg_stock.stock_lock, flags);
 
 	return ret;
 }
@@ -1837,14 +1836,14 @@  static void drain_local_stock(struct work_struct *dummy)
 	 * drain_stock races is that we always operate on local CPU stock
 	 * here with IRQ disabled
 	 */
-	localtry_lock_irqsave(&memcg_stock.stock_lock, flags);
+	local_lock_irqsave(&memcg_stock.stock_lock, flags);
 
 	stock = this_cpu_ptr(&memcg_stock);
 	old = drain_obj_stock(stock);
 	drain_stock(stock);
 	clear_bit(FLUSHING_CACHED_CHARGE, &stock->flags);
 
-	localtry_unlock_irqrestore(&memcg_stock.stock_lock, flags);
+	local_unlock_irqrestore(&memcg_stock.stock_lock, flags);
 	obj_cgroup_put(old);
 }
 
@@ -1874,7 +1873,7 @@  static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
 {
 	unsigned long flags;
 
-	if (!localtry_trylock_irqsave(&memcg_stock.stock_lock, flags)) {
+	if (!local_trylock_irqsave(&memcg_stock.stock_lock, flags)) {
 		/*
 		 * In case of unlikely failure to lock percpu stock_lock
 		 * uncharge memcg directly.
@@ -1887,7 +1886,7 @@  static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
 		return;
 	}
 	__refill_stock(memcg, nr_pages);
-	localtry_unlock_irqrestore(&memcg_stock.stock_lock, flags);
+	local_unlock_irqrestore(&memcg_stock.stock_lock, flags);
 }
 
 /*
@@ -1944,9 +1943,9 @@  static int memcg_hotplug_cpu_dead(unsigned int cpu)
 	stock = &per_cpu(memcg_stock, cpu);
 
 	/* drain_obj_stock requires stock_lock */
-	localtry_lock_irqsave(&memcg_stock.stock_lock, flags);
+	local_lock_irqsave(&memcg_stock.stock_lock, flags);
 	old = drain_obj_stock(stock);
-	localtry_unlock_irqrestore(&memcg_stock.stock_lock, flags);
+	local_unlock_irqrestore(&memcg_stock.stock_lock, flags);
 
 	drain_stock(stock);
 	obj_cgroup_put(old);
@@ -2729,7 +2728,7 @@  static void mod_objcg_state(struct obj_cgroup *objcg, struct pglist_data *pgdat,
 	unsigned long flags;
 	int *bytes;
 
-	localtry_lock_irqsave(&memcg_stock.stock_lock, flags);
+	local_lock_irqsave(&memcg_stock.stock_lock, flags);
 	stock = this_cpu_ptr(&memcg_stock);
 
 	/*
@@ -2782,7 +2781,7 @@  static void mod_objcg_state(struct obj_cgroup *objcg, struct pglist_data *pgdat,
 	if (nr)
 		__mod_objcg_mlstate(objcg, pgdat, idx, nr);
 
-	localtry_unlock_irqrestore(&memcg_stock.stock_lock, flags);
+	local_unlock_irqrestore(&memcg_stock.stock_lock, flags);
 	obj_cgroup_put(old);
 }
 
@@ -2792,7 +2791,7 @@  static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
 	unsigned long flags;
 	bool ret = false;
 
-	localtry_lock_irqsave(&memcg_stock.stock_lock, flags);
+	local_lock_irqsave(&memcg_stock.stock_lock, flags);
 
 	stock = this_cpu_ptr(&memcg_stock);
 	if (objcg == READ_ONCE(stock->cached_objcg) && stock->nr_bytes >= nr_bytes) {
@@ -2800,7 +2799,7 @@  static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
 		ret = true;
 	}
 
-	localtry_unlock_irqrestore(&memcg_stock.stock_lock, flags);
+	local_unlock_irqrestore(&memcg_stock.stock_lock, flags);
 
 	return ret;
 }
@@ -2892,7 +2891,7 @@  static void refill_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes,
 	unsigned long flags;
 	unsigned int nr_pages = 0;
 
-	localtry_lock_irqsave(&memcg_stock.stock_lock, flags);
+	local_lock_irqsave(&memcg_stock.stock_lock, flags);
 
 	stock = this_cpu_ptr(&memcg_stock);
 	if (READ_ONCE(stock->cached_objcg) != objcg) { /* reset if necessary */
@@ -2910,7 +2909,7 @@  static void refill_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes,
 		stock->nr_bytes &= (PAGE_SIZE - 1);
 	}
 
-	localtry_unlock_irqrestore(&memcg_stock.stock_lock, flags);
+	local_unlock_irqrestore(&memcg_stock.stock_lock, flags);
 	obj_cgroup_put(old);
 
 	if (nr_pages)