[1/4] arm64: mm: Fix TLBI vs ASID rollover

Message ID 20210806113109.2475-2-will@kernel.org (mailing list archive)
State New, archived
Series Fix racing TLBI with ASID/VMID reallocation

Commit Message

Will Deacon Aug. 6, 2021, 11:31 a.m. UTC
When switching to an 'mm_struct' for the first time following an ASID
rollover, a new ASID may be allocated and assigned to 'mm->context.id'.
This reassignment can happen concurrently with other operations on the
mm, such as unmapping pages and subsequently issuing TLB invalidation.

Consequently, we need to ensure that (a) accesses to 'mm->context.id'
are atomic and (b) all page-table updates made prior to a TLBI using the
old ASID are guaranteed to be visible to CPUs running with the new ASID.
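
As a quick illustration of point (a), this is the shape of the ASID()
macro after this change (a sketch of the mmu.h hunk below, not the full
patch):

  /*
   * Read 'mm->context.id' with atomic64_read(): the ASID can be
   * reassigned concurrently on rollover, so a plain load of the
   * counter is no longer safe here.
   */
  #define ASID(mm)	(atomic64_read(&(mm)->context.id) & 0xffff)

Point (b) is handled in tlbflush.h by sampling the ASID only after the
DSB that completes the page-table update (see the flush_tlb_mm() hunk
below).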

This was found by inspection after reviewing the VMID changes from
Shameer, but it looks like a real (yet hard-to-hit) bug.

Cc: <stable@vger.kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Jade Alglave <jade.alglave@arm.com>
Cc: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/mmu.h      | 29 +++++++++++++++++++++++++----
 arch/arm64/include/asm/tlbflush.h | 11 ++++++-----
 2 files changed, 31 insertions(+), 9 deletions(-)

Comments

Catalin Marinas Aug. 6, 2021, 11:59 a.m. UTC | #1
On Fri, Aug 06, 2021 at 12:31:04PM +0100, Will Deacon wrote:
> diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
> index 75beffe2ee8a..e9c30859f80c 100644
> --- a/arch/arm64/include/asm/mmu.h
> +++ b/arch/arm64/include/asm/mmu.h
> @@ -27,11 +27,32 @@ typedef struct {
>  } mm_context_t;
>  
>  /*
> - * This macro is only used by the TLBI and low-level switch_mm() code,
> - * neither of which can race with an ASID change. We therefore don't
> - * need to reload the counter using atomic64_read().
> + * We use atomic64_read() here because the ASID for an 'mm_struct' can
> + * be reallocated when scheduling one of its threads following a
> + * rollover event (see new_context() and flush_context()). In this case,
> + * a concurrent TLBI (e.g. via try_to_unmap_one() and ptep_clear_flush())
> + * may use a stale ASID. This is fine in principle as the new ASID is
> + * guaranteed to be clean in the TLB, but the TLBI routines have to take
> + * care to handle the following race:
> + *
> + *    CPU 0                    CPU 1                          CPU 2
> + *
> + *    // ptep_clear_flush(mm)
> + *    xchg_relaxed(pte, 0)
> + *    DSB ISHST
> + *    old = ASID(mm)

We'd need specs clarified (ARM ARM, cat model) that the DSB ISHST is
sufficient to order the pte write with the subsequent ASID read.
Otherwise the patch looks fine to me:

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Will Deacon Aug. 6, 2021, 12:42 p.m. UTC | #2
On Fri, Aug 06, 2021 at 12:59:28PM +0100, Catalin Marinas wrote:
> On Fri, Aug 06, 2021 at 12:31:04PM +0100, Will Deacon wrote:
> > diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
> > index 75beffe2ee8a..e9c30859f80c 100644
> > --- a/arch/arm64/include/asm/mmu.h
> > +++ b/arch/arm64/include/asm/mmu.h
> > @@ -27,11 +27,32 @@ typedef struct {
> >  } mm_context_t;
> >  
> >  /*
> > - * This macro is only used by the TLBI and low-level switch_mm() code,
> > - * neither of which can race with an ASID change. We therefore don't
> > - * need to reload the counter using atomic64_read().
> > + * We use atomic64_read() here because the ASID for an 'mm_struct' can
> > + * be reallocated when scheduling one of its threads following a
> > + * rollover event (see new_context() and flush_context()). In this case,
> > + * a concurrent TLBI (e.g. via try_to_unmap_one() and ptep_clear_flush())
> > + * may use a stale ASID. This is fine in principle as the new ASID is
> > + * guaranteed to be clean in the TLB, but the TLBI routines have to take
> > + * care to handle the following race:
> > + *
> > + *    CPU 0                    CPU 1                          CPU 2
> > + *
> > + *    // ptep_clear_flush(mm)
> > + *    xchg_relaxed(pte, 0)
> > + *    DSB ISHST
> > + *    old = ASID(mm)
> 
> We'd need specs clarified (ARM ARM, cat model) that the DSB ISHST is
> sufficient to order the pte write with the subsequent ASID read.

Although I agree that the cat model needs updating and also that the Arm
ARM isn't helpful by trying to define DMB and DSB at the same time, it
does clearly state the following:

  // B2-149
  | A DSB instruction executed by a PE, PEe, completes when all of the
  | following apply:
  |
  | * All explicit memory accesses of the required access types appearing
  |   in program order before the DSB are complete for the set of observers
  |   in the required shareability domain.

  [...]

  // B2-150
  | In addition, no instruction that appears in program order after the
  | DSB instruction can alter any state of the system or perform any part
  | of its functionality until the DSB completes other than:
  |
  | * Being fetched from memory and decoded.
  | * Reading the general-purpose, SIMD and floating-point, Special-purpose,
  |   or System registers that are directly or indirectly read without
  |   causing side-effects.

Which means that the ASID read cannot return its data before the DSB ISHST
has completed and the DSB ISHST cannot complete until the PTE write has
completed.
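
Mapping that back onto CPU 0 in the diagram, here is a minimal sketch
of the ordering in the reworked flush_tlb_mm() (mirroring the
tlbflush.h hunk in this patch):

  static inline void flush_tlb_mm(struct mm_struct *mm)
  {
  	unsigned long asid;

  	/* The prior PTE write is complete before this DSB completes. */
  	dsb(ishst);
  	/* This read cannot return its data before the DSB completes. */
  	asid = __TLBI_VADDR(0, ASID(mm));
  	__tlbi(aside1is, asid);
  	__tlbi_user(aside1is, asid);
  	dsb(ish);
  }

so by the time the (possibly stale) ASID value is obtained, the invalid
PTE is already visible to the other CPUs in the inner-shareable domain.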

> Otherwise the patch looks fine to me:
> 
> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>

Thanks! Do you want to queue it for 5.15? I don't think there's a need to
rush it into 5.14 given that we don't have any evidence of it happening
in practice.

Will
Catalin Marinas Aug. 6, 2021, 12:49 p.m. UTC | #3
On Fri, Aug 06, 2021 at 01:42:42PM +0100, Will Deacon wrote:
> On Fri, Aug 06, 2021 at 12:59:28PM +0100, Catalin Marinas wrote:
> > On Fri, Aug 06, 2021 at 12:31:04PM +0100, Will Deacon wrote:
> > > diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
> > > index 75beffe2ee8a..e9c30859f80c 100644
> > > --- a/arch/arm64/include/asm/mmu.h
> > > +++ b/arch/arm64/include/asm/mmu.h
> > > @@ -27,11 +27,32 @@ typedef struct {
> > >  } mm_context_t;
> > >  
> > >  /*
> > > - * This macro is only used by the TLBI and low-level switch_mm() code,
> > > - * neither of which can race with an ASID change. We therefore don't
> > > - * need to reload the counter using atomic64_read().
> > > + * We use atomic64_read() here because the ASID for an 'mm_struct' can
> > > + * be reallocated when scheduling one of its threads following a
> > > + * rollover event (see new_context() and flush_context()). In this case,
> > > + * a concurrent TLBI (e.g. via try_to_unmap_one() and ptep_clear_flush())
> > > + * may use a stale ASID. This is fine in principle as the new ASID is
> > > + * guaranteed to be clean in the TLB, but the TLBI routines have to take
> > > + * care to handle the following race:
> > > + *
> > > + *    CPU 0                    CPU 1                          CPU 2
> > > + *
> > > + *    // ptep_clear_flush(mm)
> > > + *    xchg_relaxed(pte, 0)
> > > + *    DSB ISHST
> > > + *    old = ASID(mm)
> > 
> > We'd need specs clarified (ARM ARM, cat model) that the DSB ISHST is
> > sufficient to order the pte write with the subsequent ASID read.
> 
> Although I agree that the cat model needs updating and also that the Arm
> ARM isn't helpful by trying to define DMB and DSB at the same time, it
> does clearly state the following:
> 
>   // B2-149
>   | A DSB instruction executed by a PE, PEe, completes when all of the
>   | following apply:
>   |
>   | * All explicit memory accesses of the required access types appearing
>   |   in program order before the DSB are complete for the set of observers
>   |   in the required shareability domain.
> 
>   [...]
> 
>   // B2-150
>   | In addition, no instruction that appears in program order after the
>   | DSB instruction can alter any state of the system or perform any part
>   | of its functionality until the DSB completes other than:
>   |
>   | * Being fetched from memory and decoded.
>   | * Reading the general-purpose, SIMD and floating-point, Special-purpose,
>   |   or System registers that are directly or indirectly read without
>   |   causing side-effects.
> 
> Which means that the ASID read cannot return its data before the DSB ISHST
> has completed and the DSB ISHST cannot complete until the PTE write has
> completed.

Thanks for the explanation.

> > Otherwise the patch looks fine to me:
> > 
> > Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
> 
> Thanks! Do you want to queue it for 5.15? I don't think there's a need to
> rush it into 5.14 given that we don't have any evidence of it happening
> in practice.

Happy to queue it for 5.15.

Patch

diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
index 75beffe2ee8a..e9c30859f80c 100644
--- a/arch/arm64/include/asm/mmu.h
+++ b/arch/arm64/include/asm/mmu.h
@@ -27,11 +27,32 @@  typedef struct {
 } mm_context_t;
 
 /*
- * This macro is only used by the TLBI and low-level switch_mm() code,
- * neither of which can race with an ASID change. We therefore don't
- * need to reload the counter using atomic64_read().
+ * We use atomic64_read() here because the ASID for an 'mm_struct' can
+ * be reallocated when scheduling one of its threads following a
+ * rollover event (see new_context() and flush_context()). In this case,
+ * a concurrent TLBI (e.g. via try_to_unmap_one() and ptep_clear_flush())
+ * may use a stale ASID. This is fine in principle as the new ASID is
+ * guaranteed to be clean in the TLB, but the TLBI routines have to take
+ * care to handle the following race:
+ *
+ *    CPU 0                    CPU 1                          CPU 2
+ *
+ *    // ptep_clear_flush(mm)
+ *    xchg_relaxed(pte, 0)
+ *    DSB ISHST
+ *    old = ASID(mm)
+ *         |                                                  <rollover>
+ *         |                   new = new_context(mm)
+ *         \-----------------> atomic_set(mm->context.id, new)
+ *                             cpu_switch_mm(mm)
+ *                             // Hardware walk of pte using new ASID
+ *    TLBI(old)
+ *
+ * In this scenario, the barrier on CPU 0 and the dependency on CPU 1
+ * ensure that the page-table walker on CPU 1 *must* see the invalid PTE
+ * written by CPU 0.
  */
-#define ASID(mm)	((mm)->context.id.counter & 0xffff)
+#define ASID(mm)	(atomic64_read(&(mm)->context.id) & 0xffff)
 
 static inline bool arm64_kernel_unmapped_at_el0(void)
 {
diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index cc3f5a33ff9c..36f02892e1df 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -245,9 +245,10 @@  static inline void flush_tlb_all(void)
 
 static inline void flush_tlb_mm(struct mm_struct *mm)
 {
-	unsigned long asid = __TLBI_VADDR(0, ASID(mm));
+	unsigned long asid;
 
 	dsb(ishst);
+	asid = __TLBI_VADDR(0, ASID(mm));
 	__tlbi(aside1is, asid);
 	__tlbi_user(aside1is, asid);
 	dsb(ish);
@@ -256,9 +257,10 @@  static inline void flush_tlb_mm(struct mm_struct *mm)
 static inline void flush_tlb_page_nosync(struct vm_area_struct *vma,
 					 unsigned long uaddr)
 {
-	unsigned long addr = __TLBI_VADDR(uaddr, ASID(vma->vm_mm));
+	unsigned long addr;
 
 	dsb(ishst);
+	addr = __TLBI_VADDR(uaddr, ASID(vma->vm_mm));
 	__tlbi(vale1is, addr);
 	__tlbi_user(vale1is, addr);
 }
@@ -283,9 +285,7 @@  static inline void __flush_tlb_range(struct vm_area_struct *vma,
 {
 	int num = 0;
 	int scale = 0;
-	unsigned long asid = ASID(vma->vm_mm);
-	unsigned long addr;
-	unsigned long pages;
+	unsigned long asid, addr, pages;
 
 	start = round_down(start, stride);
 	end = round_up(end, stride);
@@ -305,6 +305,7 @@  static inline void __flush_tlb_range(struct vm_area_struct *vma,
 	}
 
 	dsb(ishst);
+	asid = ASID(vma->vm_mm);
 
 	/*
 	 * When the CPU does not support TLB range operations, flush the TLB