[v3,2/2] x86/flushtlb: remove flush_area check on system state

Message ID 20220525081311.13416-3-roger.pau@citrix.com (mailing list archive)
State New, archived
Series x86/mm: fix regression when booting with CET-SS

Commit Message

Roger Pau Monné May 25, 2022, 8:13 a.m. UTC
Booting with Shadow Stacks leads to the following assert on a debug
hypervisor:

Assertion 'local_irq_is_enabled()' failed at arch/x86/smp.c:265
----[ Xen-4.17.0-10.24-d  x86_64  debug=y  Not tainted ]----
CPU:    0
RIP:    e008:[<ffff82d040345300>] flush_area_mask+0x40/0x13e
[...]
Xen call trace:
   [<ffff82d040345300>] R flush_area_mask+0x40/0x13e
   [<ffff82d040338a40>] F modify_xen_mappings+0xc5/0x958
   [<ffff82d0404474f9>] F arch/x86/alternative.c#_alternative_instructions+0xb7/0xb9
   [<ffff82d0404476cc>] F alternative_branches+0xf/0x12
   [<ffff82d04044e37d>] F __start_xen+0x1ef4/0x2776
   [<ffff82d040203344>] F __high_start+0x94/0xa0

This is due to SYS_STATE_smp_boot being set before calling
alternative_branches(), and the flush in modify_xen_mappings() then
using flush_area_all() with interrupts disabled.  Note that
alternative_branches() is called before APs are started, so the flush
must be a local one (and indeed the cpumask passed to
flush_area_mask() just contains one CPU).

Take the opportunity to simplify the logic a bit and make flush_area()
an alias of flush_area_all() in mm.c, taking into account that
cpu_online_map just contains the BSP before APs are started.  This
requires widening the assert in flush_area_mask() to allow being
called with interrupts disabled as long as it's strictly a local-only
flush.

The overall result is that a conditional can be removed from
flush_area().

While there, also introduce an ASSERT to check that a vCPU state flush
is not issued for the local CPU.

Fixes: (78e072bc37 'x86/mm: avoid inadvertently degrading a TLB flush to local only')
Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v2:
 - Fix commit message.
 - Keep flush_area() in mm.c and reduce code churn.

Changes since v1:
 - Add an extra assert.
 - Rename flush_area() to flush_area_all().
---
 xen/arch/x86/mm.c  | 9 ++-------
 xen/arch/x86/smp.c | 5 ++++-
 2 files changed, 6 insertions(+), 8 deletions(-)
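
The two assertions described in the commit message can be modelled outside
of Xen as a rough sketch.  Everything below is a stand-in and an assumption:
a 64-bit word plays the role of cpumask_t (so CPU numbers must stay below
64), MODEL_FLUSH_VCPU_STATE is an arbitrary bit rather than Xen's real
FLUSH_VCPU_STATE value, and the model_* helpers are hypothetical names that
only mirror the conditions in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins: one bit per CPU, and an illustrative flag value. */
typedef uint64_t model_cpumask_t;
#define MODEL_FLUSH_VCPU_STATE 0x1000u

/* True when every CPU set in a is also set in b (models cpumask_subset()). */
static inline bool model_mask_subset(model_cpumask_t a, model_cpumask_t b)
{
    return (a & ~b) == 0;
}

/*
 * Models the widened assertion: interrupts must be enabled unless the
 * mask names no CPU other than the calling one.  Assumes cpu < 64.
 */
static inline bool model_irq_assert_ok(bool irqs_enabled, model_cpumask_t mask,
                                       unsigned int cpu)
{
    model_cpumask_t self = (model_cpumask_t)1 << cpu;

    return irqs_enabled || model_mask_subset(mask, self);
}

/*
 * Models the new assertion: a vCPU state flush must not include the
 * calling CPU in its mask.  Assumes cpu < 64.
 */
static inline bool model_vcpu_state_assert_ok(model_cpumask_t mask,
                                              unsigned int cpu,
                                              unsigned int flags)
{
    model_cpumask_t self = (model_cpumask_t)1 << cpu;

    return !(mask & self) || !(flags & MODEL_FLUSH_VCPU_STATE);
}
```

In this model the boot-time case from the trace (interrupts off, mask holding
only the BSP) passes the first check, while the same call with a second CPU
in the mask would still trip it.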

Comments

Jan Beulich May 25, 2022, 8:41 a.m. UTC | #1
On 25.05.2022 10:13, Roger Pau Monne wrote:
> Booting with Shadow Stacks leads to the following assert on a debug
> hypervisor:
> 
> Assertion 'local_irq_is_enabled()' failed at arch/x86/smp.c:265
> ----[ Xen-4.17.0-10.24-d  x86_64  debug=y  Not tainted ]----
> CPU:    0
> RIP:    e008:[<ffff82d040345300>] flush_area_mask+0x40/0x13e
> [...]
> Xen call trace:
>    [<ffff82d040345300>] R flush_area_mask+0x40/0x13e
>    [<ffff82d040338a40>] F modify_xen_mappings+0xc5/0x958
>    [<ffff82d0404474f9>] F arch/x86/alternative.c#_alternative_instructions+0xb7/0xb9
>    [<ffff82d0404476cc>] F alternative_branches+0xf/0x12
>    [<ffff82d04044e37d>] F __start_xen+0x1ef4/0x2776
>    [<ffff82d040203344>] F __high_start+0x94/0xa0
> 
> This is due to SYS_STATE_smp_boot being set before calling
> alternative_branches(), and the flush in modify_xen_mappings() then
> using flush_area_all() with interrupts disabled.  Note that
> alternative_branches() is called before APs are started, so the flush
> must be a local one (and indeed the cpumask passed to
> flush_area_mask() just contains one CPU).
> 
> Take the opportunity to simplify a bit the logic and make flush_area()
> an alias of flush_area_all() in mm.c, taking into account that
> cpu_online_map just contains the BSP before APs are started.  This
> requires widening the assert in flush_area_mask() to allow being
> called with interrupts disabled as long as it's strictly a local only
> flush.
> 
> The overall result is that a conditional can be removed from
> flush_area().
> 
> While there also introduce an ASSERT to check that a vCPU state flush
> is not issued for the local CPU only.
> 
> Fixes: (78e072bc37 'x86/mm: avoid inadvertently degrading a TLB flush to local only')
> Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>
with ...

> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -5070,13 +5070,8 @@ l1_pgentry_t *virt_to_xen_l1e(unsigned long v)
>  #define l1f_to_lNf(f) (((f) & _PAGE_PRESENT) ? ((f) |  _PAGE_PSE) : (f))
>  #define lNf_to_l1f(f) (((f) & _PAGE_PRESENT) ? ((f) & ~_PAGE_PSE) : (f))
>  
> -/*
> - * map_pages_to_xen() can be called early in boot before any other
> - * CPUs are online. Use flush_area_local() in this case.
> - */
> -#define flush_area(v,f) (system_state < SYS_STATE_smp_boot ?    \
> -                         flush_area_local((const void *)v, f) : \
> -                         flush_area_all((const void *)v, f))
> +/* flush_area_all() can be used prior to any other CPU being online.  */
> +#define flush_area(v, f) flush_area_all((const void *)v, f)

... v properly parenthesized here as the code is being touched anyway:
One less Misra-C violation. This surely can be done while committing.

Jan
Roger Pau Monné May 25, 2022, 9:32 a.m. UTC | #2
On Wed, May 25, 2022 at 10:41:51AM +0200, Jan Beulich wrote:
> On 25.05.2022 10:13, Roger Pau Monne wrote:
> > Booting with Shadow Stacks leads to the following assert on a debug
> > hypervisor:
> > 
> > Assertion 'local_irq_is_enabled()' failed at arch/x86/smp.c:265
> > ----[ Xen-4.17.0-10.24-d  x86_64  debug=y  Not tainted ]----
> > CPU:    0
> > RIP:    e008:[<ffff82d040345300>] flush_area_mask+0x40/0x13e
> > [...]
> > Xen call trace:
> >    [<ffff82d040345300>] R flush_area_mask+0x40/0x13e
> >    [<ffff82d040338a40>] F modify_xen_mappings+0xc5/0x958
> >    [<ffff82d0404474f9>] F arch/x86/alternative.c#_alternative_instructions+0xb7/0xb9
> >    [<ffff82d0404476cc>] F alternative_branches+0xf/0x12
> >    [<ffff82d04044e37d>] F __start_xen+0x1ef4/0x2776
> >    [<ffff82d040203344>] F __high_start+0x94/0xa0
> > 
> > This is due to SYS_STATE_smp_boot being set before calling
> > alternative_branches(), and the flush in modify_xen_mappings() then
> > using flush_area_all() with interrupts disabled.  Note that
> > alternative_branches() is called before APs are started, so the flush
> > must be a local one (and indeed the cpumask passed to
> > flush_area_mask() just contains one CPU).
> > 
> > Take the opportunity to simplify a bit the logic and make flush_area()
> > an alias of flush_area_all() in mm.c, taking into account that
> > cpu_online_map just contains the BSP before APs are started.  This
> > requires widening the assert in flush_area_mask() to allow being
> > called with interrupts disabled as long as it's strictly a local only
> > flush.
> > 
> > The overall result is that a conditional can be removed from
> > flush_area().
> > 
> > While there also introduce an ASSERT to check that a vCPU state flush
> > is not issued for the local CPU only.
> > 
> > Fixes: (78e072bc37 'x86/mm: avoid inadvertently degrading a TLB flush to local only')
> > Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> 
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> with ...
> 
> > --- a/xen/arch/x86/mm.c
> > +++ b/xen/arch/x86/mm.c
> > @@ -5070,13 +5070,8 @@ l1_pgentry_t *virt_to_xen_l1e(unsigned long v)
> >  #define l1f_to_lNf(f) (((f) & _PAGE_PRESENT) ? ((f) |  _PAGE_PSE) : (f))
> >  #define lNf_to_l1f(f) (((f) & _PAGE_PRESENT) ? ((f) & ~_PAGE_PSE) : (f))
> >  
> > -/*
> > - * map_pages_to_xen() can be called early in boot before any other
> > - * CPUs are online. Use flush_area_local() in this case.
> > - */
> > -#define flush_area(v,f) (system_state < SYS_STATE_smp_boot ?    \
> > -                         flush_area_local((const void *)v, f) : \
> > -                         flush_area_all((const void *)v, f))
> > +/* flush_area_all() can be used prior to any other CPU being online.  */
> > +#define flush_area(v, f) flush_area_all((const void *)v, f)
> 
> ... v properly parenthesized here as the code is being touched anyway:
> One less Misra-C violation. This surely can be done while committing.

Indeed.  I had my addition properly parenthesized, but forgot to do it
here when moving the line.

Thanks, Roger.
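
For readers wondering why the parenthesization requested in the review
matters: a cast binds more tightly than most binary operators, so a macro
that expands to "(type)v" casts only the first operand when v is an
expression.  The sketch below uses hypothetical macros that are not part of
Xen, and an integer cast instead of the pointer cast in flush_area(), purely
so that the difference shows up in the computed value.

```c
#include <stdint.h>

/* Hypothetical macros, not from Xen: they only illustrate cast precedence. */
#define NARROW_UNSAFE(v) ((uint8_t)v)    /* cast applies to the first operand only */
#define NARROW_SAFE(v)   ((uint8_t)(v))  /* cast applies to the whole argument */

static inline unsigned int narrow_unsafe_demo(void)
{
    /* Expands to ((uint8_t)0x1ff + 1): 0x1ff is truncated first, then 1 is added. */
    return NARROW_UNSAFE(0x1ff + 1);
}

static inline unsigned int narrow_safe_demo(void)
{
    /* Expands to ((uint8_t)(0x1ff + 1)): the sum 0x200 is truncated as a whole. */
    return NARROW_SAFE(0x1ff + 1);
}
```

Here narrow_unsafe_demo() yields 256 and narrow_safe_demo() yields 0.
Applied to the patch, the same reasoning suggests writing the cast in
flush_area() as (const void *)(v).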

Patch

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index bbb834c3fb..038f71ecf4 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -5070,13 +5070,8 @@  l1_pgentry_t *virt_to_xen_l1e(unsigned long v)
 #define l1f_to_lNf(f) (((f) & _PAGE_PRESENT) ? ((f) |  _PAGE_PSE) : (f))
 #define lNf_to_l1f(f) (((f) & _PAGE_PRESENT) ? ((f) & ~_PAGE_PSE) : (f))
 
-/*
- * map_pages_to_xen() can be called early in boot before any other
- * CPUs are online. Use flush_area_local() in this case.
- */
-#define flush_area(v,f) (system_state < SYS_STATE_smp_boot ?    \
-                         flush_area_local((const void *)v, f) : \
-                         flush_area_all((const void *)v, f))
+/* flush_area_all() can be used prior to any other CPU being online.  */
+#define flush_area(v, f) flush_area_all((const void *)v, f)
 
 #define L3T_INIT(page) (page) = ZERO_BLOCK_PTR
 
diff --git a/xen/arch/x86/smp.c b/xen/arch/x86/smp.c
index 0a02086966..b42603c351 100644
--- a/xen/arch/x86/smp.c
+++ b/xen/arch/x86/smp.c
@@ -262,7 +262,10 @@  void flush_area_mask(const cpumask_t *mask, const void *va, unsigned int flags)
 {
     unsigned int cpu = smp_processor_id();
 
-    ASSERT(local_irq_is_enabled());
+    /* Local flushes can be performed with interrupts disabled. */
+    ASSERT(local_irq_is_enabled() || cpumask_subset(mask, cpumask_of(cpu)));
+    /* Exclude use of FLUSH_VCPU_STATE for the local CPU. */
+    ASSERT(!cpumask_test_cpu(cpu, mask) || !(flags & FLUSH_VCPU_STATE));
 
     if ( (flags & ~(FLUSH_VCPU_STATE | FLUSH_ORDER_MASK)) &&
          cpumask_test_cpu(cpu, mask) )