Message ID | 1513079030-8557-1-git-send-email-will.deacon@arm.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Hi Will, On 2017/12/12 19:43, Will Deacon wrote: > Jiankang reports that our race detection in set_pte_at is firing when > copying the page tables in dup_mmap as a result of a fork(). In this > situation, the page table isn't actually live and so there is no way > that we can race with a concurrent update from the hardware page table > walker. > > This patch reworks the race detection so that we require either the > mm to match the current active_mm (i.e. currently installed in our TTBR0) > or the mm_users count to be greater than 1, implying that the page table > could be live in another CPU. The mm_users check might still be racy, > but we'll avoid false positives and it's not realistic to validate that > all the necessary locks are held as part of this assertion. > > Cc: Yisheng Xie <xieyisheng1@huawei.com> > Reported-by: Jiankang Chen <chenjiankang1@huawei.com> > Tested-by: Jiankang Chen <chenjiankang1@huawei.com> > Signed-off-by: Will Deacon <will.deacon@arm.com> > --- > arch/arm64/include/asm/pgtable.h | 8 ++++---- > 1 file changed, 4 insertions(+), 4 deletions(-) > > diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h > index 3ff03a755c32..bdcc7f1c9d06 100644 > --- a/arch/arm64/include/asm/pgtable.h > +++ b/arch/arm64/include/asm/pgtable.h > @@ -42,6 +42,8 @@ > #include <asm/cmpxchg.h> > #include <asm/fixmap.h> > #include <linux/mmdebug.h> > +#include <linux/mm_types.h> > +#include <linux/sched.h> > Do you have compiled kernel after apply this patch? In our environment, it will fail to compile kernel if include file here(I will attach some log later). Instead, we move these included file after mmdebug.h, and I do not know whether this is just my compiler's problem: --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -42,6 +42,8 @@ #include <asm/cmpxchg.h> #include <asm/fixmap.h> #include <linux/mmdebug.h> +#include <linux/mm_types.h> +#include <linux/sched.h> Sorry for not having told you this information. Thanks Yisheng Xie compiler err log: ========== [...] include/linux/mm_types_task.h:60: Error: unknown mnemonic `struct' -- `struct page_frag{' include/asm-generic/preempt.h:9: Error: unknown mnemonic `static' -- `static inline int preempt_count(void)' include/linux/mm_types_task.h:61: Error: unknown mnemonic `struct' -- `struct page*page' include/asm-generic/preempt.h:10: Error: junk at end of line, first unrecognized character is `{' include/linux/mm_types_task.h:63: Error: unknown mnemonic `__u32' -- `__u32 offset' include/asm-generic/preempt.h:11: Error: unknown mnemonic `return' -- `return READ_ONCE(((struct thread_info*)current)->preempt_count)' include/linux/mm_types_task.h:64: Error: unknown mnemonic `__u32' -- `__u32 size' [...]
Hi Yisheng, On Wed, Dec 13, 2017 at 09:01:23AM +0800, Yisheng Xie wrote: > On 2017/12/12 19:43, Will Deacon wrote: > > Jiankang reports that our race detection in set_pte_at is firing when > > copying the page tables in dup_mmap as a result of a fork(). In this > > situation, the page table isn't actually live and so there is no way > > that we can race with a concurrent update from the hardware page table > > walker. > > > > This patch reworks the race detection so that we require either the > > mm to match the current active_mm (i.e. currently installed in our TTBR0) > > or the mm_users count to be greater than 1, implying that the page table > > could be live in another CPU. The mm_users check might still be racy, > > but we'll avoid false positives and it's not realistic to validate that > > all the necessary locks are held as part of this assertion. > > > > Cc: Yisheng Xie <xieyisheng1@huawei.com> > > Reported-by: Jiankang Chen <chenjiankang1@huawei.com> > > Tested-by: Jiankang Chen <chenjiankang1@huawei.com> > > Signed-off-by: Will Deacon <will.deacon@arm.com> > > --- > > arch/arm64/include/asm/pgtable.h | 8 ++++---- > > 1 file changed, 4 insertions(+), 4 deletions(-) > > > > diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h > > index 3ff03a755c32..bdcc7f1c9d06 100644 > > --- a/arch/arm64/include/asm/pgtable.h > > +++ b/arch/arm64/include/asm/pgtable.h > > @@ -42,6 +42,8 @@ > > #include <asm/cmpxchg.h> > > #include <asm/fixmap.h> > > #include <linux/mmdebug.h> > > +#include <linux/mm_types.h> > > +#include <linux/sched.h> > > > > Do you have compiled kernel after apply this patch? In our environment, it will > fail to compile kernel if include file here(I will attach some log later). > Instead, we move these included file after mmdebug.h, and I do not know whether > this is just my compiler's problem: It compiles fine for me. Are you seeing a problem building this on top of mainline? If so, what is your .config? > --- a/arch/arm64/include/asm/pgtable.h > +++ b/arch/arm64/include/asm/pgtable.h > @@ -42,6 +42,8 @@ > #include <asm/cmpxchg.h> > #include <asm/fixmap.h> > #include <linux/mmdebug.h> > +#include <linux/mm_types.h> > +#include <linux/sched.h> > > Sorry for not having told you this information. > > Thanks > Yisheng Xie > > compiler err log: ========== > [...] > include/linux/mm_types_task.h:60: Error: unknown mnemonic `struct' -- `struct page_frag{' > include/asm-generic/preempt.h:9: Error: unknown mnemonic `static' -- `static inline int preempt_count(void)' > include/linux/mm_types_task.h:61: Error: unknown mnemonic `struct' -- `struct page*page' > include/asm-generic/preempt.h:10: Error: junk at end of line, first unrecognized character is `{' > include/linux/mm_types_task.h:63: Error: unknown mnemonic `__u32' -- `__u32 offset' > include/asm-generic/preempt.h:11: Error: unknown mnemonic `return' -- `return READ_ONCE(((struct thread_info*)current)->preempt_count)' > include/linux/mm_types_task.h:64: Error: unknown mnemonic `__u32' -- `__u32 size' > [...] This looks like the includes are being pulled into an assembly file, but this part is guarded by #ifndef __ASSEMBLY__ so I can't see how that could happen. Will
Hi Will, On 2017/12/13 17:21, Will Deacon wrote: > Hi Yisheng, > > On Wed, Dec 13, 2017 at 09:01:23AM +0800, Yisheng Xie wrote: >> On 2017/12/12 19:43, Will Deacon wrote: >>> Jiankang reports that our race detection in set_pte_at is firing when >>> copying the page tables in dup_mmap as a result of a fork(). In this >>> situation, the page table isn't actually live and so there is no way >>> that we can race with a concurrent update from the hardware page table >>> walker. >>> >>> This patch reworks the race detection so that we require either the >>> mm to match the current active_mm (i.e. currently installed in our TTBR0) >>> or the mm_users count to be greater than 1, implying that the page table >>> could be live in another CPU. The mm_users check might still be racy, >>> but we'll avoid false positives and it's not realistic to validate that >>> all the necessary locks are held as part of this assertion. >>> >>> Cc: Yisheng Xie <xieyisheng1@huawei.com> >>> Reported-by: Jiankang Chen <chenjiankang1@huawei.com> >>> Tested-by: Jiankang Chen <chenjiankang1@huawei.com> >>> Signed-off-by: Will Deacon <will.deacon@arm.com> >>> --- >>> arch/arm64/include/asm/pgtable.h | 8 ++++---- >>> 1 file changed, 4 insertions(+), 4 deletions(-) >>> >>> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h >>> index 3ff03a755c32..bdcc7f1c9d06 100644 >>> --- a/arch/arm64/include/asm/pgtable.h >>> +++ b/arch/arm64/include/asm/pgtable.h >>> @@ -42,6 +42,8 @@ >>> #include <asm/cmpxchg.h> >>> #include <asm/fixmap.h> >>> #include <linux/mmdebug.h> >>> +#include <linux/mm_types.h> >>> +#include <linux/sched.h> >>> >> >> Do you have compiled kernel after apply this patch? In our environment, it will >> fail to compile kernel if include file here(I will attach some log later). >> Instead, we move these included file after mmdebug.h, and I do not know whether >> this is just my compiler's problem: > > It compiles fine for me. Are you seeing a problem building this on top of > mainline? If so, what is your .config? Oh, you have already moved it after #ifndef __ASSEMBLY__ , sorry for not finding this when review this patch. And I also have just compiled with this patch in mainline, is ok. Thanks for your help. Thanks Yisheng Xie > >> --- a/arch/arm64/include/asm/pgtable.h >> +++ b/arch/arm64/include/asm/pgtable.h >> @@ -42,6 +42,8 @@ >> #include <asm/cmpxchg.h> >> #include <asm/fixmap.h> >> #include <linux/mmdebug.h> >> +#include <linux/mm_types.h> >> +#include <linux/sched.h> >> >> Sorry for not having told you this information. >> >> Thanks >> Yisheng Xie >> >> compiler err log: ========== >> [...] >> include/linux/mm_types_task.h:60: Error: unknown mnemonic `struct' -- `struct page_frag{' >> include/asm-generic/preempt.h:9: Error: unknown mnemonic `static' -- `static inline int preempt_count(void)' >> include/linux/mm_types_task.h:61: Error: unknown mnemonic `struct' -- `struct page*page' >> include/asm-generic/preempt.h:10: Error: junk at end of line, first unrecognized character is `{' >> include/linux/mm_types_task.h:63: Error: unknown mnemonic `__u32' -- `__u32 offset' >> include/asm-generic/preempt.h:11: Error: unknown mnemonic `return' -- `return READ_ONCE(((struct thread_info*)current)->preempt_count)' >> include/linux/mm_types_task.h:64: Error: unknown mnemonic `__u32' -- `__u32 size' >> [...] > > This looks like the includes are being pulled into an assembly file, but > this part is guarded by #ifndef __ASSEMBLY__ so I can't see how that could > happen. > > Will > > . >
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index 3ff03a755c32..bdcc7f1c9d06 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -42,6 +42,8 @@ #include <asm/cmpxchg.h> #include <asm/fixmap.h> #include <linux/mmdebug.h> +#include <linux/mm_types.h> +#include <linux/sched.h> extern void __pte_error(const char *file, int line, unsigned long val); extern void __pmd_error(const char *file, int line, unsigned long val); @@ -215,9 +217,6 @@ static inline void set_pte(pte_t *ptep, pte_t pte) } } -struct mm_struct; -struct vm_area_struct; - extern void __sync_icache_dcache(pte_t pteval, unsigned long addr); /* @@ -246,7 +245,8 @@ static inline void set_pte_at(struct mm_struct *mm, unsigned long addr, * hardware updates of the pte (ptep_set_access_flags safely changes * valid ptes without going through an invalid entry). */ - if (pte_valid(*ptep) && pte_valid(pte)) { + if (IS_ENABLED(CONFIG_DEBUG_VM) && pte_valid(*ptep) && pte_valid(pte) && + (mm == current->active_mm || atomic_read(&mm->mm_users) > 1)) { VM_WARN_ONCE(!pte_young(pte), "%s: racy access flag clearing: 0x%016llx -> 0x%016llx", __func__, pte_val(*ptep), pte_val(pte));