Message ID | 1535125966-7666-4-git-send-email-will.deacon@arm.com (mailing list archive) |
---|---|
State | New, archived |
Series | Avoid synchronous TLB invalidation for intermediate page-table entries on arm64 |
On Fri, Aug 24, 2018 at 8:52 AM Will Deacon <will.deacon@arm.com> wrote:
>
> Now that our walk-cache invalidation routines imply a DSB before the
> invalidation, we no longer need one when we are clearing an entry during
> unmap.

Do you really still need it when *setting* it?

I'm wondering if you could just remove the thing unconditionally.

Why would you need a barrier for another CPU for a mapping that is just
being created? It's ok if they see the old lack of mapping until they are
told about it, and that eventual "being told about it" must involve a data
transfer already.

And I'm assuming arm doesn't cache negative page table entries, so there's
no issue with any stale tlb.

And any other kernel thread looking at the page tables will have to honor
the page table locking, so you don't need it for some direct page table
lookup either.

Hmm? It seems like you shouldn't need to order the "set page directory
entry" with anything.

But maybe there's some magic arm64 rule I'm not aware of. Maybe even the
local TLB hardware walker isn't coherent with local stores?

            Linus
Hi Linus,

On Fri, Aug 24, 2018 at 09:15:17AM -0700, Linus Torvalds wrote:
> On Fri, Aug 24, 2018 at 8:52 AM Will Deacon <will.deacon@arm.com> wrote:
> >
> > Now that our walk-cache invalidation routines imply a DSB before the
> > invalidation, we no longer need one when we are clearing an entry during
> > unmap.
>
> Do you really still need it when *setting* it?
>
> I'm wondering if you could just remove the thing unconditionally.
>
> Why would you need a barrier for another CPU for a mapping that is
> just being created? It's ok if they see the old lack of mapping until
> they are told about it, and that eventual "being told about it" must
> involve a data transfer already.
>
> And I'm assuming arm doesn't cache negative page table entries, so
> there's no issue with any stale tlb.
>
> And any other kernel thread looking at the page tables will have to
> honor the page table locking, so you don't need it for some direct
> page table lookup either.
>
> Hmm? It seems like you shouldn't need to order the "set page directory
> entry" with anything.
>
> But maybe there's some magic arm64 rule I'm not aware of. Maybe even
> the local TLB hardware walker isn't coherent with local stores?

Yup, you got it: it's not related to ordering of accesses by other CPUs,
but actually because the page-table walker is treated as a separate
observer by the architecture, and therefore we need the DSB to push out
the store to the page table so that the walker can see it (practically
speaking, the walker isn't guaranteed to snoop the store buffer).

For PTEs mapping user addresses, we actually don't bother with the DSB
when writing a valid entry, because it's extremely unlikely that we'd get
back to userspace with the entry still sitting in the store buffer. If
that *did* happen, we'd just take the fault a second time.

However, if we played that same trick for pXds, I think that:

  (a) We'd need to distinguish between user and kernel mappings in
      set_pXd(), since we can't tolerate spurious faults on kernel
      addresses.

  (b) We'd need to be careful about allocating page-table pages, so that
      e.g. the walker sees zeroes for a new pgtable.

We could probably achieve (a) with a software bit, and (b) is a non-issue
because mm/memory.c uses smp_wmb(), which is always a DMB for us (which
will enforce the eventual ordering but doesn't necessarily publish the
stores immediately).

Will
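For reference, the user-PTE trick Will describes above looks roughly like
the sketch below. It is simplified from the arm64 set_pte() of this era;
treat the helper name pte_valid_not_user() and the comment wording as
illustrative rather than the definitive implementation.

static inline void set_pte(pte_t *ptep, pte_t pte)
{
	WRITE_ONCE(*ptep, pte);

	/*
	 * Publish the store with a DSB only for valid kernel mappings.
	 * A valid *user* entry left in the store buffer is tolerable:
	 * if we somehow reach userspace before it drains, we just take
	 * a spurious fault and retry.
	 */
	if (pte_valid_not_user(pte))
		dsb(ishst);
}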
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 1bdeca8918a6..2ab2031b778c 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -360,6 +360,7 @@ static inline int pmd_protnone(pmd_t pmd)
 #define pmd_present(pmd)	pte_present(pmd_pte(pmd))
 #define pmd_dirty(pmd)		pte_dirty(pmd_pte(pmd))
 #define pmd_young(pmd)		pte_young(pmd_pte(pmd))
+#define pmd_valid(pmd)		pte_valid(pmd_pte(pmd))
 #define pmd_wrprotect(pmd)	pte_pmd(pte_wrprotect(pmd_pte(pmd)))
 #define pmd_mkold(pmd)		pte_pmd(pte_mkold(pmd_pte(pmd)))
 #define pmd_mkwrite(pmd)	pte_pmd(pte_mkwrite(pmd_pte(pmd)))
@@ -431,7 +432,9 @@ extern pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
 static inline void set_pmd(pmd_t *pmdp, pmd_t pmd)
 {
 	WRITE_ONCE(*pmdp, pmd);
-	dsb(ishst);
+
+	if (pmd_valid(pmd))
+		dsb(ishst);
 }
 
 static inline void pmd_clear(pmd_t *pmdp)
@@ -477,11 +480,14 @@ static inline phys_addr_t pmd_page_paddr(pmd_t pmd)
 #define pud_none(pud)		(!pud_val(pud))
 #define pud_bad(pud)		(!(pud_val(pud) & PUD_TABLE_BIT))
 #define pud_present(pud)	pte_present(pud_pte(pud))
+#define pud_valid(pud)		pte_valid(pud_pte(pud))
 
 static inline void set_pud(pud_t *pudp, pud_t pud)
 {
 	WRITE_ONCE(*pudp, pud);
-	dsb(ishst);
+
+	if (pud_valid(pud))
+		dsb(ishst);
 }
 
 static inline void pud_clear(pud_t *pudp)
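For context on why checking validity is enough on the unmap path:
pmd_clear() writes an invalid (zero) entry through set_pmd(), so with this
patch it no longer issues a DSB itself and instead relies on the barrier
implied by the subsequent walk-cache invalidation. A rough sketch, assuming
the arm64 definition of this period:

static inline void pmd_clear(pmd_t *pmdp)
{
	/* A zero entry is invalid, so set_pmd() now skips the dsb(ishst). */
	set_pmd(pmdp, __pmd(0));
}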
Now that our walk-cache invalidation routines imply a DSB before the
invalidation, we no longer need one when we are clearing an entry during
unmap.

Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/pgtable.h | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)
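The walk-cache invalidation routines referred to above fold the barrier in
ahead of the TLBI. The sketch below shows the general shape of such a
helper; the function name is hypothetical, and the specific TLBI operation
and address encoding are assumptions rather than the series' exact code.

static inline void flush_walk_cache_kernel(unsigned long kaddr)
{
	dsb(ishst);			/* publish the cleared table entry to the walker */
	__tlbi(vaae1is, kaddr >> 12);	/* invalidate cached walk entries for this address */
	dsb(ish);			/* wait for the invalidation to complete */
}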