Message ID | 20221212095523.52683-2-julien@xen.org (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | xen/arm: Don't switch TTBR while the MMU is on | expand |
Hi Julien, On 12/12/2022 10:55, Julien Grall wrote: > > > From: Julien Grall <jgrall@amazon.com> > > Per D5-4929 in ARM DDI 0487H.a: > "A DSB NSH is sufficient to ensure completion of TLB maintenance > instructions that apply to a single PE. A DSB ISH is sufficient to > ensure completion of TLB maintenance instructions that apply to PEs > in the same Inner Shareable domain. > " > > This means barrier after local TLB flushes could be reduced to > non-shareable. > > Note that the scope of the barrier in the workaround has not been > changed because Linux v6.1-rc8 is also using 'ish' and I couldn't > find anything in the Neoverse N1 suggesting that a 'nsh' would > be sufficient. > > Signed-off-by: Julien Grall <jgrall@amazon.com> > > --- > > I have used an older version of the Arm Arm because the explanation > in the latest (ARM DDI 0487I.a) is less obvious. I reckon the paragraph > about DSB in D8.13.8 is missing the shareability. But this is implied > in B2.3.11: > > "If the required access types of the DSB is reads and writes, the > following instructions issued by PEe before the DSB are complete for > the required shareability domain: > > [...] > > — All TLB maintenance instructions. > " > > Changes in v3: > - Patch added > --- > xen/arch/arm/include/asm/arm64/flushtlb.h | 27 ++++++++++++++--------- > 1 file changed, 16 insertions(+), 11 deletions(-) > > diff --git a/xen/arch/arm/include/asm/arm64/flushtlb.h b/xen/arch/arm/include/asm/arm64/flushtlb.h > index 7c5431518741..39d429ace552 100644 > --- a/xen/arch/arm/include/asm/arm64/flushtlb.h > +++ b/xen/arch/arm/include/asm/arm64/flushtlb.h > @@ -12,8 +12,9 @@ > * ARM64_WORKAROUND_REPEAT_TLBI: Before this line, in the same comment, we state DSB ISHST. This should also be changed to reflect the change done by this patch. > * Modification of the translation table for a virtual address might lead to > * read-after-read ordering violation. > - * The workaround repeats TLBI+DSB operation for all the TLB flush operations. > - * While this is stricly not necessary, we don't want to take any risk. > + * The workaround repeats TLBI+DSB ISH operation for all the TLB flush > + * operations. While this is stricly not necessary, we don't want to s/stricly/strictly/ > + * take any risk. > * > * For Xen page-tables the ISB will discard any instructions fetched > * from the old mappings. > @@ -21,38 +22,42 @@ > * For the Stage-2 page-tables the ISB ensures the completion of the DSB > * (and therefore the TLB invalidation) before continuing. So we know > * the TLBs cannot contain an entry for a mapping we may have removed. > + * > + * Note that for local TLB flush, using non-shareable (nsh) is sufficient > + * (see D5-4929 in ARM DDI 0487H.a). Althougth, the memory barrier in s/Althougth/Although/ > + * for the workaround is left as inner-shareable to match with Linux. So for the workaround we stay with DSB ISH. But ... > */ > -#define TLB_HELPER(name, tlbop) \ > +#define TLB_HELPER(name, tlbop, sh) \ > static inline void name(void) \ > { \ > asm volatile( \ > - "dsb ishst;" \ > + "dsb " # sh "st;" \ > "tlbi " # tlbop ";" \ > ALTERNATIVE( \ > "nop; nop;", \ > - "dsb ish;" \ > + "dsb " # sh ";" \ ... you do not adhere to this. > "tlbi " # tlbop ";", \ > ARM64_WORKAROUND_REPEAT_TLBI, \ > CONFIG_ARM64_WORKAROUND_REPEAT_TLBI) \ > - "dsb ish;" \ > + "dsb " # sh ";" \ > "isb;" \ > : : : "memory"); \ > } > > /* Flush local TLBs, current VMID only. */ > -TLB_HELPER(flush_guest_tlb_local, vmalls12e1); > +TLB_HELPER(flush_guest_tlb_local, vmalls12e1, nsh); > > /* Flush innershareable TLBs, current VMID only */ > -TLB_HELPER(flush_guest_tlb, vmalls12e1is); > +TLB_HELPER(flush_guest_tlb, vmalls12e1is, ish); > > /* Flush local TLBs, all VMIDs, non-hypervisor mode */ > -TLB_HELPER(flush_all_guests_tlb_local, alle1); > +TLB_HELPER(flush_all_guests_tlb_local, alle1, nsh); > > /* Flush innershareable TLBs, all VMIDs, non-hypervisor mode */ > -TLB_HELPER(flush_all_guests_tlb, alle1is); > +TLB_HELPER(flush_all_guests_tlb, alle1is, ish); > > /* Flush all hypervisor mappings from the TLB of the local processor. */ > -TLB_HELPER(flush_xen_tlb_local, alle2); > +TLB_HELPER(flush_xen_tlb_local, alle2, nsh); > > /* Flush TLB of local processor for address va. */ > static inline void __flush_xen_tlb_one_local(vaddr_t va) > -- > 2.38.1 > With the remarks fixed: Reviewed-by: Michal Orzel <michal.orzel@amd.com> ~Michal
On 13/12/2022 09:11, Michal Orzel wrote: > Hi Julien, Hi Michal, > On 12/12/2022 10:55, Julien Grall wrote: >> >> >> From: Julien Grall <jgrall@amazon.com> >> >> Per D5-4929 in ARM DDI 0487H.a: >> "A DSB NSH is sufficient to ensure completion of TLB maintenance >> instructions that apply to a single PE. A DSB ISH is sufficient to >> ensure completion of TLB maintenance instructions that apply to PEs >> in the same Inner Shareable domain. >> " >> >> This means barrier after local TLB flushes could be reduced to >> non-shareable. >> >> Note that the scope of the barrier in the workaround has not been >> changed because Linux v6.1-rc8 is also using 'ish' and I couldn't >> find anything in the Neoverse N1 suggesting that a 'nsh' would >> be sufficient. >> >> Signed-off-by: Julien Grall <jgrall@amazon.com> >> >> --- >> >> I have used an older version of the Arm Arm because the explanation >> in the latest (ARM DDI 0487I.a) is less obvious. I reckon the paragraph >> about DSB in D8.13.8 is missing the shareability. But this is implied >> in B2.3.11: >> >> "If the required access types of the DSB is reads and writes, the >> following instructions issued by PEe before the DSB are complete for >> the required shareability domain: >> >> [...] >> >> — All TLB maintenance instructions. >> " >> >> Changes in v3: >> - Patch added >> --- >> xen/arch/arm/include/asm/arm64/flushtlb.h | 27 ++++++++++++++--------- >> 1 file changed, 16 insertions(+), 11 deletions(-) >> >> diff --git a/xen/arch/arm/include/asm/arm64/flushtlb.h b/xen/arch/arm/include/asm/arm64/flushtlb.h >> index 7c5431518741..39d429ace552 100644 >> --- a/xen/arch/arm/include/asm/arm64/flushtlb.h >> +++ b/xen/arch/arm/include/asm/arm64/flushtlb.h >> @@ -12,8 +12,9 @@ >> * ARM64_WORKAROUND_REPEAT_TLBI: > Before this line, in the same comment, we state DSB ISHST. This should also be changed > to reflect the change done by this patch. This is on purpose. I decided to keep the sequence as-is and instead add a paragraph explaining that 'nsh' is sufficient for local TLB flushes. > >> * Modification of the translation table for a virtual address might lead to >> * read-after-read ordering violation. >> - * The workaround repeats TLBI+DSB operation for all the TLB flush operations. >> - * While this is stricly not necessary, we don't want to take any risk. >> + * The workaround repeats TLBI+DSB ISH operation for all the TLB flush >> + * operations. While this is stricly not necessary, we don't want to > s/stricly/strictly/ > >> + * take any risk. >> * >> * For Xen page-tables the ISB will discard any instructions fetched >> * from the old mappings. >> @@ -21,38 +22,42 @@ >> * For the Stage-2 page-tables the ISB ensures the completion of the DSB >> * (and therefore the TLB invalidation) before continuing. So we know >> * the TLBs cannot contain an entry for a mapping we may have removed. >> + * >> + * Note that for local TLB flush, using non-shareable (nsh) is sufficient >> + * (see D5-4929 in ARM DDI 0487H.a). Althougth, the memory barrier in > s/Althougth/Although/ > >> + * for the workaround is left as inner-shareable to match with Linux. > So for the workaround we stay with DSB ISH. But ... > >> */ >> -#define TLB_HELPER(name, tlbop) \ >> +#define TLB_HELPER(name, tlbop, sh) \ >> static inline void name(void) \ >> { \ >> asm volatile( \ >> - "dsb ishst;" \ >> + "dsb " # sh "st;" \ >> "tlbi " # tlbop ";" \ >> ALTERNATIVE( \ >> "nop; nop;", \ >> - "dsb ish;" \ >> + "dsb " # sh ";" \ > ... you do not adhere to this. This is a leftover from my previous approach. I will drop it. [...] > > With the remarks fixed: > Reviewed-by: Michal Orzel <michal.orzel@amd.com> I am not planning to fix the first remark. Please let me know if your Reviewed-by tag stands. Cheers,
Hi Julien, On 13/12/2022 10:45, Julien Grall wrote: > > > On 13/12/2022 09:11, Michal Orzel wrote: >> Hi Julien, > > Hi Michal, > >> On 12/12/2022 10:55, Julien Grall wrote: >>> >>> >>> From: Julien Grall <jgrall@amazon.com> >>> >>> Per D5-4929 in ARM DDI 0487H.a: >>> "A DSB NSH is sufficient to ensure completion of TLB maintenance >>> instructions that apply to a single PE. A DSB ISH is sufficient to >>> ensure completion of TLB maintenance instructions that apply to PEs >>> in the same Inner Shareable domain. >>> " >>> >>> This means barrier after local TLB flushes could be reduced to >>> non-shareable. >>> >>> Note that the scope of the barrier in the workaround has not been >>> changed because Linux v6.1-rc8 is also using 'ish' and I couldn't >>> find anything in the Neoverse N1 suggesting that a 'nsh' would >>> be sufficient. >>> >>> Signed-off-by: Julien Grall <jgrall@amazon.com> >>> >>> --- >>> >>> I have used an older version of the Arm Arm because the explanation >>> in the latest (ARM DDI 0487I.a) is less obvious. I reckon the paragraph >>> about DSB in D8.13.8 is missing the shareability. But this is implied >>> in B2.3.11: >>> >>> "If the required access types of the DSB is reads and writes, the >>> following instructions issued by PEe before the DSB are complete for >>> the required shareability domain: >>> >>> [...] >>> >>> — All TLB maintenance instructions. >>> " >>> >>> Changes in v3: >>> - Patch added >>> --- >>> xen/arch/arm/include/asm/arm64/flushtlb.h | 27 ++++++++++++++--------- >>> 1 file changed, 16 insertions(+), 11 deletions(-) >>> >>> diff --git a/xen/arch/arm/include/asm/arm64/flushtlb.h b/xen/arch/arm/include/asm/arm64/flushtlb.h >>> index 7c5431518741..39d429ace552 100644 >>> --- a/xen/arch/arm/include/asm/arm64/flushtlb.h >>> +++ b/xen/arch/arm/include/asm/arm64/flushtlb.h >>> @@ -12,8 +12,9 @@ >>> * ARM64_WORKAROUND_REPEAT_TLBI: >> Before this line, in the same comment, we state DSB ISHST. This should also be changed >> to reflect the change done by this patch. > > This is on purpose. I decided to keep the sequence as-is and instead add > a paragraph explaining that 'nsh' is sufficient for local TLB flushes. > >> >>> * Modification of the translation table for a virtual address might lead to >>> * read-after-read ordering violation. >>> - * The workaround repeats TLBI+DSB operation for all the TLB flush operations. >>> - * While this is stricly not necessary, we don't want to take any risk. >>> + * The workaround repeats TLBI+DSB ISH operation for all the TLB flush >>> + * operations. While this is stricly not necessary, we don't want to >> s/stricly/strictly/ >> >>> + * take any risk. >>> * >>> * For Xen page-tables the ISB will discard any instructions fetched >>> * from the old mappings. >>> @@ -21,38 +22,42 @@ >>> * For the Stage-2 page-tables the ISB ensures the completion of the DSB >>> * (and therefore the TLB invalidation) before continuing. So we know >>> * the TLBs cannot contain an entry for a mapping we may have removed. >>> + * >>> + * Note that for local TLB flush, using non-shareable (nsh) is sufficient >>> + * (see D5-4929 in ARM DDI 0487H.a). Althougth, the memory barrier in >> s/Althougth/Although/ >> >>> + * for the workaround is left as inner-shareable to match with Linux. >> So for the workaround we stay with DSB ISH. But ... >> >>> */ >>> -#define TLB_HELPER(name, tlbop) \ >>> +#define TLB_HELPER(name, tlbop, sh) \ >>> static inline void name(void) \ >>> { \ >>> asm volatile( \ >>> - "dsb ishst;" \ >>> + "dsb " # sh "st;" \ >>> "tlbi " # tlbop ";" \ >>> ALTERNATIVE( \ >>> "nop; nop;", \ >>> - "dsb ish;" \ >>> + "dsb " # sh ";" \ >> ... you do not adhere to this. > > This is a leftover from my previous approach. I will drop it. > > [...] > >> >> With the remarks fixed: >> Reviewed-by: Michal Orzel <michal.orzel@amd.com> > > I am not planning to fix the first remark. Please let me know if your > Reviewed-by tag stands. I'm ok with it so you can keep my tag. > > Cheers, > > -- > Julien Grall ~Michal
On 12/12/2022 09:55, Julien Grall wrote: > CAUTION: This message has originated from an External Source. Please use proper judgment and caution when opening attachments, clicking links, or responding to this email. > > > From: Julien Grall <jgrall@amazon.com> > > Per D5-4929 in ARM DDI 0487H.a: > "A DSB NSH is sufficient to ensure completion of TLB maintenance > instructions that apply to a single PE. A DSB ISH is sufficient to > ensure completion of TLB maintenance instructions that apply to PEs > in the same Inner Shareable domain. > " > > This means barrier after local TLB flushes could be reduced to > non-shareable. > > Note that the scope of the barrier in the workaround has not been > changed because Linux v6.1-rc8 is also using 'ish' and I couldn't > find anything in the Neoverse N1 suggesting that a 'nsh' would > be sufficient. > > Signed-off-by: Julien Grall <jgrall@amazon.com> > > --- > > I have used an older version of the Arm Arm because the explanation > in the latest (ARM DDI 0487I.a) is less obvious. I reckon the paragraph > about DSB in D8.13.8 is missing the shareability. But this is implied > in B2.3.11: > > "If the required access types of the DSB is reads and writes, the > following instructions issued by PEe before the DSB are complete for > the required shareability domain: > > [...] > > — All TLB maintenance instructions. > " > > Changes in v3: > - Patch added > --- > xen/arch/arm/include/asm/arm64/flushtlb.h | 27 ++++++++++++++--------- > 1 file changed, 16 insertions(+), 11 deletions(-) > > diff --git a/xen/arch/arm/include/asm/arm64/flushtlb.h b/xen/arch/arm/include/asm/arm64/flushtlb.h > index 7c5431518741..39d429ace552 100644 > --- a/xen/arch/arm/include/asm/arm64/flushtlb.h > +++ b/xen/arch/arm/include/asm/arm64/flushtlb.h > @@ -12,8 +12,9 @@ > * ARM64_WORKAROUND_REPEAT_TLBI: > * Modification of the translation table for a virtual address might lead to > * read-after-read ordering violation. > - * The workaround repeats TLBI+DSB operation for all the TLB flush operations. > - * While this is stricly not necessary, we don't want to take any risk. > + * The workaround repeats TLBI+DSB ISH operation for all the TLB flush > + * operations. While this is stricly not necessary, we don't want to > + * take any risk. > * > * For Xen page-tables the ISB will discard any instructions fetched > * from the old mappings. > @@ -21,38 +22,42 @@ > * For the Stage-2 page-tables the ISB ensures the completion of the DSB > * (and therefore the TLB invalidation) before continuing. So we know > * the TLBs cannot contain an entry for a mapping we may have removed. > + * > + * Note that for local TLB flush, using non-shareable (nsh) is sufficient > + * (see D5-4929 in ARM DDI 0487H.a). Althougth, the memory barrier in > + * for the workaround is left as inner-shareable to match with Linux. Nit:- It might be good to mention the Linux commit id. > */ > -#define TLB_HELPER(name, tlbop) \ > +#define TLB_HELPER(name, tlbop, sh) \ > static inline void name(void) \ > { \ > asm volatile( \ > - "dsb ishst;" \ > + "dsb " # sh "st;" \ > "tlbi " # tlbop ";" \ > ALTERNATIVE( \ > "nop; nop;", \ > - "dsb ish;" \ > + "dsb " # sh ";" \ > "tlbi " # tlbop ";", \ > ARM64_WORKAROUND_REPEAT_TLBI, \ > CONFIG_ARM64_WORKAROUND_REPEAT_TLBI) \ > - "dsb ish;" \ > + "dsb " # sh ";" \ > "isb;" \ > : : : "memory"); \ > } > > /* Flush local TLBs, current VMID only. */ > -TLB_HELPER(flush_guest_tlb_local, vmalls12e1); > +TLB_HELPER(flush_guest_tlb_local, vmalls12e1, nsh); > > /* Flush innershareable TLBs, current VMID only */ > -TLB_HELPER(flush_guest_tlb, vmalls12e1is); > +TLB_HELPER(flush_guest_tlb, vmalls12e1is, ish); > > /* Flush local TLBs, all VMIDs, non-hypervisor mode */ > -TLB_HELPER(flush_all_guests_tlb_local, alle1); > +TLB_HELPER(flush_all_guests_tlb_local, alle1, nsh); > > /* Flush innershareable TLBs, all VMIDs, non-hypervisor mode */ > -TLB_HELPER(flush_all_guests_tlb, alle1is); > +TLB_HELPER(flush_all_guests_tlb, alle1is, ish); > > /* Flush all hypervisor mappings from the TLB of the local processor. */ > -TLB_HELPER(flush_xen_tlb_local, alle2); > +TLB_HELPER(flush_xen_tlb_local, alle2, nsh); > > /* Flush TLB of local processor for address va. */ > static inline void __flush_xen_tlb_one_local(vaddr_t va) > -- > 2.38.1 Reviewed-by: Ayan Kumar Halder <ayan.kumar.halder@amd.com> >
diff --git a/xen/arch/arm/include/asm/arm64/flushtlb.h b/xen/arch/arm/include/asm/arm64/flushtlb.h index 7c5431518741..39d429ace552 100644 --- a/xen/arch/arm/include/asm/arm64/flushtlb.h +++ b/xen/arch/arm/include/asm/arm64/flushtlb.h @@ -12,8 +12,9 @@ * ARM64_WORKAROUND_REPEAT_TLBI: * Modification of the translation table for a virtual address might lead to * read-after-read ordering violation. - * The workaround repeats TLBI+DSB operation for all the TLB flush operations. - * While this is stricly not necessary, we don't want to take any risk. + * The workaround repeats TLBI+DSB ISH operation for all the TLB flush + * operations. While this is stricly not necessary, we don't want to + * take any risk. * * For Xen page-tables the ISB will discard any instructions fetched * from the old mappings. @@ -21,38 +22,42 @@ * For the Stage-2 page-tables the ISB ensures the completion of the DSB * (and therefore the TLB invalidation) before continuing. So we know * the TLBs cannot contain an entry for a mapping we may have removed. + * + * Note that for local TLB flush, using non-shareable (nsh) is sufficient + * (see D5-4929 in ARM DDI 0487H.a). Althougth, the memory barrier in + * for the workaround is left as inner-shareable to match with Linux. */ -#define TLB_HELPER(name, tlbop) \ +#define TLB_HELPER(name, tlbop, sh) \ static inline void name(void) \ { \ asm volatile( \ - "dsb ishst;" \ + "dsb " # sh "st;" \ "tlbi " # tlbop ";" \ ALTERNATIVE( \ "nop; nop;", \ - "dsb ish;" \ + "dsb " # sh ";" \ "tlbi " # tlbop ";", \ ARM64_WORKAROUND_REPEAT_TLBI, \ CONFIG_ARM64_WORKAROUND_REPEAT_TLBI) \ - "dsb ish;" \ + "dsb " # sh ";" \ "isb;" \ : : : "memory"); \ } /* Flush local TLBs, current VMID only. */ -TLB_HELPER(flush_guest_tlb_local, vmalls12e1); +TLB_HELPER(flush_guest_tlb_local, vmalls12e1, nsh); /* Flush innershareable TLBs, current VMID only */ -TLB_HELPER(flush_guest_tlb, vmalls12e1is); +TLB_HELPER(flush_guest_tlb, vmalls12e1is, ish); /* Flush local TLBs, all VMIDs, non-hypervisor mode */ -TLB_HELPER(flush_all_guests_tlb_local, alle1); +TLB_HELPER(flush_all_guests_tlb_local, alle1, nsh); /* Flush innershareable TLBs, all VMIDs, non-hypervisor mode */ -TLB_HELPER(flush_all_guests_tlb, alle1is); +TLB_HELPER(flush_all_guests_tlb, alle1is, ish); /* Flush all hypervisor mappings from the TLB of the local processor. */ -TLB_HELPER(flush_xen_tlb_local, alle2); +TLB_HELPER(flush_xen_tlb_local, alle2, nsh); /* Flush TLB of local processor for address va. */ static inline void __flush_xen_tlb_one_local(vaddr_t va)