diff mbox

arm64: mm: Fix false positives in set_pte_at access/dirty race detection

Message ID 1513079030-8557-1-git-send-email-will.deacon@arm.com (mailing list archive)
State New, archived
Headers show

Commit Message

Will Deacon Dec. 12, 2017, 11:43 a.m. UTC
Jiankang reports that our race detection in set_pte_at is firing when
copying the page tables in dup_mmap as a result of a fork(). In this
situation, the page table isn't actually live and so there is no way
that we can race with a concurrent update from the hardware page table
walker.

This patch reworks the race detection so that we require either the
mm to match the current active_mm (i.e. currently installed in our TTBR0)
or the mm_users count to be greater than 1, implying that the page table
could be live in another CPU. The mm_users check might still be racy,
but we'll avoid false positives and it's not realistic to validate that
all the necessary locks are held as part of this assertion.

Cc: Yisheng Xie <xieyisheng1@huawei.com>
Reported-by: Jiankang Chen <chenjiankang1@huawei.com>
Tested-by: Jiankang Chen <chenjiankang1@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/pgtable.h | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

Comments

Xie Yisheng Dec. 13, 2017, 1:01 a.m. UTC | #1
Hi Will,

On 2017/12/12 19:43, Will Deacon wrote:
> Jiankang reports that our race detection in set_pte_at is firing when
> copying the page tables in dup_mmap as a result of a fork(). In this
> situation, the page table isn't actually live and so there is no way
> that we can race with a concurrent update from the hardware page table
> walker.
> 
> This patch reworks the race detection so that we require either the
> mm to match the current active_mm (i.e. currently installed in our TTBR0)
> or the mm_users count to be greater than 1, implying that the page table
> could be live in another CPU. The mm_users check might still be racy,
> but we'll avoid false positives and it's not realistic to validate that
> all the necessary locks are held as part of this assertion.
> 
> Cc: Yisheng Xie <xieyisheng1@huawei.com>
> Reported-by: Jiankang Chen <chenjiankang1@huawei.com>
> Tested-by: Jiankang Chen <chenjiankang1@huawei.com>
> Signed-off-by: Will Deacon <will.deacon@arm.com>
> ---
>  arch/arm64/include/asm/pgtable.h | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> index 3ff03a755c32..bdcc7f1c9d06 100644
> --- a/arch/arm64/include/asm/pgtable.h
> +++ b/arch/arm64/include/asm/pgtable.h
> @@ -42,6 +42,8 @@
>  #include <asm/cmpxchg.h>
>  #include <asm/fixmap.h>
>  #include <linux/mmdebug.h>
> +#include <linux/mm_types.h>
> +#include <linux/sched.h>
>  

Do you have compiled kernel after apply this patch? In our environment, it will
fail to compile kernel if include file here(I will attach some log later).
Instead, we move these included file after mmdebug.h, and I do not know whether
this is just my compiler's problem:
  --- a/arch/arm64/include/asm/pgtable.h
  +++ b/arch/arm64/include/asm/pgtable.h
  @@ -42,6 +42,8 @@
   #include <asm/cmpxchg.h>
   #include <asm/fixmap.h>
   #include <linux/mmdebug.h>
   +#include <linux/mm_types.h>
   +#include <linux/sched.h>

Sorry for not having told you this information.

Thanks
Yisheng Xie

compiler err log: ==========
[...]
include/linux/mm_types_task.h:60: Error: unknown mnemonic `struct' -- `struct page_frag{'
include/asm-generic/preempt.h:9: Error: unknown mnemonic `static' -- `static inline int preempt_count(void)'
include/linux/mm_types_task.h:61: Error: unknown mnemonic `struct' -- `struct page*page'
include/asm-generic/preempt.h:10: Error: junk at end of line, first unrecognized character is `{'
include/linux/mm_types_task.h:63: Error: unknown mnemonic `__u32' -- `__u32 offset'
include/asm-generic/preempt.h:11: Error: unknown mnemonic `return' -- `return READ_ONCE(((struct thread_info*)current)->preempt_count)'
include/linux/mm_types_task.h:64: Error: unknown mnemonic `__u32' -- `__u32 size'
[...]
Will Deacon Dec. 13, 2017, 9:21 a.m. UTC | #2
Hi Yisheng,

On Wed, Dec 13, 2017 at 09:01:23AM +0800, Yisheng Xie wrote:
> On 2017/12/12 19:43, Will Deacon wrote:
> > Jiankang reports that our race detection in set_pte_at is firing when
> > copying the page tables in dup_mmap as a result of a fork(). In this
> > situation, the page table isn't actually live and so there is no way
> > that we can race with a concurrent update from the hardware page table
> > walker.
> > 
> > This patch reworks the race detection so that we require either the
> > mm to match the current active_mm (i.e. currently installed in our TTBR0)
> > or the mm_users count to be greater than 1, implying that the page table
> > could be live in another CPU. The mm_users check might still be racy,
> > but we'll avoid false positives and it's not realistic to validate that
> > all the necessary locks are held as part of this assertion.
> > 
> > Cc: Yisheng Xie <xieyisheng1@huawei.com>
> > Reported-by: Jiankang Chen <chenjiankang1@huawei.com>
> > Tested-by: Jiankang Chen <chenjiankang1@huawei.com>
> > Signed-off-by: Will Deacon <will.deacon@arm.com>
> > ---
> >  arch/arm64/include/asm/pgtable.h | 8 ++++----
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> > 
> > diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> > index 3ff03a755c32..bdcc7f1c9d06 100644
> > --- a/arch/arm64/include/asm/pgtable.h
> > +++ b/arch/arm64/include/asm/pgtable.h
> > @@ -42,6 +42,8 @@
> >  #include <asm/cmpxchg.h>
> >  #include <asm/fixmap.h>
> >  #include <linux/mmdebug.h>
> > +#include <linux/mm_types.h>
> > +#include <linux/sched.h>
> >  
> 
> Do you have compiled kernel after apply this patch? In our environment, it will
> fail to compile kernel if include file here(I will attach some log later).
> Instead, we move these included file after mmdebug.h, and I do not know whether
> this is just my compiler's problem:

It compiles fine for me. Are you seeing a problem building this on top of
mainline? If so, what is your .config?

>   --- a/arch/arm64/include/asm/pgtable.h
>   +++ b/arch/arm64/include/asm/pgtable.h
>   @@ -42,6 +42,8 @@
>    #include <asm/cmpxchg.h>
>    #include <asm/fixmap.h>
>    #include <linux/mmdebug.h>
>    +#include <linux/mm_types.h>
>    +#include <linux/sched.h>
> 
> Sorry for not having told you this information.
> 
> Thanks
> Yisheng Xie
> 
> compiler err log: ==========
> [...]
> include/linux/mm_types_task.h:60: Error: unknown mnemonic `struct' -- `struct page_frag{'
> include/asm-generic/preempt.h:9: Error: unknown mnemonic `static' -- `static inline int preempt_count(void)'
> include/linux/mm_types_task.h:61: Error: unknown mnemonic `struct' -- `struct page*page'
> include/asm-generic/preempt.h:10: Error: junk at end of line, first unrecognized character is `{'
> include/linux/mm_types_task.h:63: Error: unknown mnemonic `__u32' -- `__u32 offset'
> include/asm-generic/preempt.h:11: Error: unknown mnemonic `return' -- `return READ_ONCE(((struct thread_info*)current)->preempt_count)'
> include/linux/mm_types_task.h:64: Error: unknown mnemonic `__u32' -- `__u32 size'
> [...]

This looks like the includes are being pulled into an assembly file, but
this part is guarded by #ifndef __ASSEMBLY__ so I can't see how that could
happen.

Will
Xie Yisheng Dec. 13, 2017, 9:42 a.m. UTC | #3
Hi Will,

On 2017/12/13 17:21, Will Deacon wrote:
> Hi Yisheng,
> 
> On Wed, Dec 13, 2017 at 09:01:23AM +0800, Yisheng Xie wrote:
>> On 2017/12/12 19:43, Will Deacon wrote:
>>> Jiankang reports that our race detection in set_pte_at is firing when
>>> copying the page tables in dup_mmap as a result of a fork(). In this
>>> situation, the page table isn't actually live and so there is no way
>>> that we can race with a concurrent update from the hardware page table
>>> walker.
>>>
>>> This patch reworks the race detection so that we require either the
>>> mm to match the current active_mm (i.e. currently installed in our TTBR0)
>>> or the mm_users count to be greater than 1, implying that the page table
>>> could be live in another CPU. The mm_users check might still be racy,
>>> but we'll avoid false positives and it's not realistic to validate that
>>> all the necessary locks are held as part of this assertion.
>>>
>>> Cc: Yisheng Xie <xieyisheng1@huawei.com>
>>> Reported-by: Jiankang Chen <chenjiankang1@huawei.com>
>>> Tested-by: Jiankang Chen <chenjiankang1@huawei.com>
>>> Signed-off-by: Will Deacon <will.deacon@arm.com>
>>> ---
>>>  arch/arm64/include/asm/pgtable.h | 8 ++++----
>>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
>>> index 3ff03a755c32..bdcc7f1c9d06 100644
>>> --- a/arch/arm64/include/asm/pgtable.h
>>> +++ b/arch/arm64/include/asm/pgtable.h
>>> @@ -42,6 +42,8 @@
>>>  #include <asm/cmpxchg.h>
>>>  #include <asm/fixmap.h>
>>>  #include <linux/mmdebug.h>
>>> +#include <linux/mm_types.h>
>>> +#include <linux/sched.h>
>>>  
>>
>> Do you have compiled kernel after apply this patch? In our environment, it will
>> fail to compile kernel if include file here(I will attach some log later).
>> Instead, we move these included file after mmdebug.h, and I do not know whether
>> this is just my compiler's problem:
> 
> It compiles fine for me. Are you seeing a problem building this on top of
> mainline? If so, what is your .config?

Oh, you have already moved it after #ifndef __ASSEMBLY__ , sorry for not finding
this when review this patch.

And I also have just compiled with this patch in mainline, is ok.
Thanks for your help.

Thanks
Yisheng Xie

> 
>>   --- a/arch/arm64/include/asm/pgtable.h
>>   +++ b/arch/arm64/include/asm/pgtable.h
>>   @@ -42,6 +42,8 @@
>>    #include <asm/cmpxchg.h>
>>    #include <asm/fixmap.h>
>>    #include <linux/mmdebug.h>
>>    +#include <linux/mm_types.h>
>>    +#include <linux/sched.h>
>>
>> Sorry for not having told you this information.
>>
>> Thanks
>> Yisheng Xie
>>
>> compiler err log: ==========
>> [...]
>> include/linux/mm_types_task.h:60: Error: unknown mnemonic `struct' -- `struct page_frag{'
>> include/asm-generic/preempt.h:9: Error: unknown mnemonic `static' -- `static inline int preempt_count(void)'
>> include/linux/mm_types_task.h:61: Error: unknown mnemonic `struct' -- `struct page*page'
>> include/asm-generic/preempt.h:10: Error: junk at end of line, first unrecognized character is `{'
>> include/linux/mm_types_task.h:63: Error: unknown mnemonic `__u32' -- `__u32 offset'
>> include/asm-generic/preempt.h:11: Error: unknown mnemonic `return' -- `return READ_ONCE(((struct thread_info*)current)->preempt_count)'
>> include/linux/mm_types_task.h:64: Error: unknown mnemonic `__u32' -- `__u32 size'
>> [...]
> 
> This looks like the includes are being pulled into an assembly file, but
> this part is guarded by #ifndef __ASSEMBLY__ so I can't see how that could
> happen.
> 
> Will
> 
> .
>
diff mbox

Patch

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 3ff03a755c32..bdcc7f1c9d06 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -42,6 +42,8 @@ 
 #include <asm/cmpxchg.h>
 #include <asm/fixmap.h>
 #include <linux/mmdebug.h>
+#include <linux/mm_types.h>
+#include <linux/sched.h>
 
 extern void __pte_error(const char *file, int line, unsigned long val);
 extern void __pmd_error(const char *file, int line, unsigned long val);
@@ -215,9 +217,6 @@  static inline void set_pte(pte_t *ptep, pte_t pte)
 	}
 }
 
-struct mm_struct;
-struct vm_area_struct;
-
 extern void __sync_icache_dcache(pte_t pteval, unsigned long addr);
 
 /*
@@ -246,7 +245,8 @@  static inline void set_pte_at(struct mm_struct *mm, unsigned long addr,
 	 * hardware updates of the pte (ptep_set_access_flags safely changes
 	 * valid ptes without going through an invalid entry).
 	 */
-	if (pte_valid(*ptep) && pte_valid(pte)) {
+	if (IS_ENABLED(CONFIG_DEBUG_VM) && pte_valid(*ptep) && pte_valid(pte) &&
+	   (mm == current->active_mm || atomic_read(&mm->mm_users) > 1)) {
 		VM_WARN_ONCE(!pte_young(pte),
 			     "%s: racy access flag clearing: 0x%016llx -> 0x%016llx",
 			     __func__, pte_val(*ptep), pte_val(pte));