
[v1,4/4] Revert "x86/xen: allow nesting of same lazy mode"

Message ID 20250302145555.3236789-5-ryan.roberts@arm.com (mailing list archive)
State Superseded
Series Fix lazy mmu mode

Commit Message

Ryan Roberts March 2, 2025, 2:55 p.m. UTC
Commit 49147beb0ccb ("x86/xen: allow nesting of same lazy mode") was
added as a solution for a core-mm code change where
arch_[enter|leave]_lazy_mmu_mode() started to be called in a nested
manner; see commit bcc6cc832573 ("mm: add default definition of
set_ptes()").

However, now that we have fixed the API to avoid nesting, we no longer
need this capability in the x86 implementation.

Additionally, from code review, I don't believe the fix was ever robust
in the case of preemption occurring while in the nested lazy mode. The
implementation usually deals with preemption by calling
arch_leave_lazy_mmu_mode() from xen_start_context_switch() for the
outgoing task if we are in the lazy mmu mode. Then in
xen_end_context_switch(), it restarts the lazy mode by calling
arch_enter_lazy_mmu_mode() for an incoming task that was in the lazy
mode when it was switched out. But arch_leave_lazy_mmu_mode() will only
unwind a single level of nesting. If we are in the double nest, then
it's not fully unwound and per-cpu variables are left in a bad state.

So the correct solution is to remove the possibility of nesting from the
higher level (which has now been done) and remove this x86-specific
solution.

Fixes: 49147beb0ccb ("x86/xen: allow nesting of same lazy mode")
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
 arch/x86/include/asm/xen/hypervisor.h | 15 ++-------------
 arch/x86/xen/enlighten_pv.c           |  1 -
 2 files changed, 2 insertions(+), 14 deletions(-)

Comments

David Hildenbrand March 3, 2025, 11:52 a.m. UTC | #1
On 02.03.25 15:55, Ryan Roberts wrote:
> Commit 49147beb0ccb ("x86/xen: allow nesting of same lazy mode") was
> added as a solution for a core-mm code change where
> arch_[enter|leave]_lazy_mmu_mode() started to be called in a nested
> manner; see commit bcc6cc832573 ("mm: add default definition of
> set_ptes()").
> 
> However, now that we have fixed the API to avoid nesting, we no longer
> need this capability in the x86 implementation.
> 
> Additionally, from code review, I don't believe the fix was ever robust
> in the case of preemption occurring while in the nested lazy mode. The
> implementation usually deals with preemption by calling
> arch_leave_lazy_mmu_mode() from xen_start_context_switch() for the
> outgoing task if we are in the lazy mmu mode. Then in
> xen_end_context_switch(), it restarts the lazy mode by calling
> arch_enter_lazy_mmu_mode() for an incoming task that was in the lazy
> mode when it was switched out. But arch_leave_lazy_mmu_mode() will only
> unwind a single level of nesting. If we are in the double nest, then
> it's not fully unwound and per-cpu variables are left in a bad state.
> 
> So the correct solution is to remove the possibility of nesting from the
> higher level (which has now been done) and remove this x86-specific
> solution.
> 
> Fixes: 49147beb0ccb ("x86/xen: allow nesting of same lazy mode")

Does this patch here deserve this tag? IIUC, it's rather a cleanup now 
that it was properly fixed elsewhere.

> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>

Acked-by: David Hildenbrand <david@redhat.com>
Ryan Roberts March 3, 2025, 12:33 p.m. UTC | #2
On 03/03/2025 11:52, David Hildenbrand wrote:
> On 02.03.25 15:55, Ryan Roberts wrote:
>> Commit 49147beb0ccb ("x86/xen: allow nesting of same lazy mode") was
>> added as a solution for a core-mm code change where
>> arch_[enter|leave]_lazy_mmu_mode() started to be called in a nested
>> manner; see commit bcc6cc832573 ("mm: add default definition of
>> set_ptes()").
>>
>> However, now that we have fixed the API to avoid nesting, we no longer
>> need this capability in the x86 implementation.
>>
>> Additionally, from code review, I don't believe the fix was ever robust
>> in the case of preemption occurring while in the nested lazy mode. The
>> implementation usually deals with preemption by calling
>> arch_leave_lazy_mmu_mode() from xen_start_context_switch() for the
>> outgoing task if we are in the lazy mmu mode. Then in
>> xen_end_context_switch(), it restarts the lazy mode by calling
>> arch_enter_lazy_mmu_mode() for an incoming task that was in the lazy
>> mode when it was switched out. But arch_leave_lazy_mmu_mode() will only
>> unwind a single level of nesting. If we are in the double nest, then
>> it's not fully unwound and per-cpu variables are left in a bad state.
>>
>> So the correct solution is to remove the possibility of nesting from the
>> higher level (which has now been done) and remove this x86-specific
>> solution.
>>
>> Fixes: 49147beb0ccb ("x86/xen: allow nesting of same lazy mode")
> 
> Does this patch here deserve this tag? IIUC, it's rather a cleanup now that it
> was properly fixed elsewhere.

Now that nesting is not possible, yes it is just a cleanup. But when nesting was
possible, as far as I can tell it was buggy, as per my description. So it's a
real bug that won't ever trigger once the other fixes are applied. Happy to
remove the Fixes tag and not include it for stable in v2. That's probably
simplest.

> 
>> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
> 
> Acked-by: David Hildenbrand <david@redhat.com>
>
David Hildenbrand March 3, 2025, 12:57 p.m. UTC | #3
On 03.03.25 13:33, Ryan Roberts wrote:
> On 03/03/2025 11:52, David Hildenbrand wrote:
>> On 02.03.25 15:55, Ryan Roberts wrote:
>>> Commit 49147beb0ccb ("x86/xen: allow nesting of same lazy mode") was
>>> added as a solution for a core-mm code change where
>>> arch_[enter|leave]_lazy_mmu_mode() started to be called in a nested
>>> manner; see commit bcc6cc832573 ("mm: add default definition of
>>> set_ptes()").
>>>
>>> However, now that we have fixed the API to avoid nesting, we no longer
>>> need this capability in the x86 implementation.
>>>
>>> Additionally, from code review, I don't believe the fix was ever robust
>>> in the case of preemption occurring while in the nested lazy mode. The
>>> implementation usually deals with preemption by calling
>>> arch_leave_lazy_mmu_mode() from xen_start_context_switch() for the
>>> outgoing task if we are in the lazy mmu mode. Then in
>>> xen_end_context_switch(), it restarts the lazy mode by calling
>>> arch_enter_lazy_mmu_mode() for an incoming task that was in the lazy
>>> mode when it was switched out. But arch_leave_lazy_mmu_mode() will only
>>> unwind a single level of nesting. If we are in the double nest, then
>>> it's not fully unwound and per-cpu variables are left in a bad state.
>>>
>>> So the correct solution is to remove the possibility of nesting from the
>>> higher level (which has now been done) and remove this x86-specific
>>> solution.
>>>
>>> Fixes: 49147beb0ccb ("x86/xen: allow nesting of same lazy mode")
>>
>> Does this patch here deserve this tag? IIUC, it's rather a cleanup now that it
>> was properly fixed elsewhere.
> 
> Now that nesting is not possible, yes it is just a cleanup. But when nesting was
> possible, as far as I can tell it was buggy, as per my description.

Right, I understood that part.

> So it's a
> real bug that won't ever trigger once the other fixes are applied. Happy to
> remove the Fixes tag and not include it for stable in v2. That's probably
> simplest.

I was just curious, because it sounded like the actual fix was the other 
patch. Whatever you think is best :)

Patch

diff --git a/arch/x86/include/asm/xen/hypervisor.h b/arch/x86/include/asm/xen/hypervisor.h
index a9088250770f..bd0fc69a10a7 100644
--- a/arch/x86/include/asm/xen/hypervisor.h
+++ b/arch/x86/include/asm/xen/hypervisor.h
@@ -72,18 +72,10 @@ enum xen_lazy_mode {
 };
 
 DECLARE_PER_CPU(enum xen_lazy_mode, xen_lazy_mode);
-DECLARE_PER_CPU(unsigned int, xen_lazy_nesting);
 
 static inline void enter_lazy(enum xen_lazy_mode mode)
 {
-	enum xen_lazy_mode old_mode = this_cpu_read(xen_lazy_mode);
-
-	if (mode == old_mode) {
-		this_cpu_inc(xen_lazy_nesting);
-		return;
-	}
-
-	BUG_ON(old_mode != XEN_LAZY_NONE);
+	BUG_ON(this_cpu_read(xen_lazy_mode) != XEN_LAZY_NONE);
 
 	this_cpu_write(xen_lazy_mode, mode);
 }
@@ -92,10 +84,7 @@ static inline void leave_lazy(enum xen_lazy_mode mode)
 {
 	BUG_ON(this_cpu_read(xen_lazy_mode) != mode);
 
-	if (this_cpu_read(xen_lazy_nesting) == 0)
-		this_cpu_write(xen_lazy_mode, XEN_LAZY_NONE);
-	else
-		this_cpu_dec(xen_lazy_nesting);
+	this_cpu_write(xen_lazy_mode, XEN_LAZY_NONE);
 }
 
 enum xen_lazy_mode xen_get_lazy_mode(void);
diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
index 5e57835e999d..919e4df9380b 100644
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -99,7 +99,6 @@ struct tls_descs {
 };
 
 DEFINE_PER_CPU(enum xen_lazy_mode, xen_lazy_mode) = XEN_LAZY_NONE;
-DEFINE_PER_CPU(unsigned int, xen_lazy_nesting);
 
 enum xen_lazy_mode xen_get_lazy_mode(void)
 {