
[v3,1/7] mm: Remove BUG_ON mmap_sem not held from xxx_trans_huge_lock()

Message ID 20191002134730.40985-2-thomas_os@shipmail.org (mailing list archive)
State New, archived
Series Emulated coherent graphics memory take 2

Commit Message

Thomas Hellström (Intel) Oct. 2, 2019, 1:47 p.m. UTC
From: Thomas Hellstrom <thellstrom@vmware.com>

The caller needs to make sure that the vma is not torn down during the
lock operation and can also use the i_mmap_rwsem for file-backed vmas.
Remove the BUG_ON. We could, as an alternative, add a test that either
vma->vm_mm->mmap_sem or vma->vm_file->f_mapping->i_mmap_rwsem are held.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
---
 include/linux/huge_mm.h | 2 --
 1 file changed, 2 deletions(-)
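
As an illustration of the alternative mentioned in the commit message, a check that accepts either lock could look roughly like the sketch below. This is not part of the patch; assert_vma_huge_lock_held() is a hypothetical helper name, and the sketch uses the pre-5.8 mmap_sem naming.

#include <linux/mm.h>
#include <linux/fs.h>
#include <linux/rwsem.h>

/*
 * Hypothetical helper (not in this patch): accept the locking requirement
 * being satisfied either by vma->vm_mm->mmap_sem or, for file-backed VMAs,
 * by the mapping's i_mmap_rwsem.
 */
static inline void assert_vma_huge_lock_held(struct vm_area_struct *vma)
{
	if (rwsem_is_locked(&vma->vm_mm->mmap_sem))
		return;

	VM_BUG_ON_VMA(!vma->vm_file ||
		      !rwsem_is_locked(&vma->vm_file->f_mapping->i_mmap_rwsem),
		      vma);
}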

Comments

Kirill A. Shutemov Oct. 3, 2019, 11:02 a.m. UTC | #1
On Wed, Oct 02, 2019 at 03:47:24PM +0200, Thomas Hellström (VMware) wrote:
> From: Thomas Hellstrom <thellstrom@vmware.com>
> 
> The caller needs to make sure that the vma is not torn down during the
> lock operation and can also use the i_mmap_rwsem for file-backed vmas.
> Remove the BUG_ON. We could, as an alternative, add a test that either
> vma->vm_mm->mmap_sem or vma->vm_file->f_mapping->i_mmap_rwsem are held.
> 
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: Will Deacon <will.deacon@arm.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Rik van Riel <riel@surriel.com>
> Cc: Minchan Kim <minchan@kernel.org>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Huang Ying <ying.huang@intel.com>
> Cc: Jérôme Glisse <jglisse@redhat.com>
> Cc: Kirill A. Shutemov <kirill@shutemov.name>
> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>

The patch looks good to me:

Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

But I looked at the usage in pagewalk.c and it is inconsistent.  The walker
takes ptl before calling ->pud_entry(), but not for ->pmd_entry().

It should be fixed: do not take the lock before ->pud_entry(). The
callback must take care of it.

Looks like we have a single ->pud_entry() implementation in the whole kernel.
It should be trivial to fix.

Could you do this?
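
For context, a ->pud_entry() implementation that handles its own locking (rather than relying on the walker having taken the ptl) might look roughly like the sketch below. The callback name is made up, and the four-argument signature assumed here is the page walker's callback signature from around this time (5.4-era struct mm_walk_ops).

#include <linux/mm.h>
#include <linux/huge_mm.h>
#include <linux/pagewalk.h>
#include <linux/spinlock.h>

/* Hypothetical ->pud_entry() callback that takes the ptl itself. */
static int example_pud_entry(pud_t *pud, unsigned long addr,
			     unsigned long next, struct mm_walk *walk)
{
	spinlock_t *ptl = pud_trans_huge_lock(pud, walk->vma);

	if (!ptl)
		return 0;	/* Not a huge/devmap pud; walk the pmds instead. */

	/* ... inspect or modify the huge pud under the lock here ... */

	spin_unlock(ptl);
	return 0;
}
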
Thomas Hellström (Intel) Oct. 3, 2019, 11:32 a.m. UTC | #2
Hi, Kirill,

On 10/3/19 1:02 PM, Kirill A. Shutemov wrote:
> On Wed, Oct 02, 2019 at 03:47:24PM +0200, Thomas Hellström (VMware) wrote:
>> From: Thomas Hellstrom <thellstrom@vmware.com>
>>
>> The caller needs to make sure that the vma is not torn down during the
>> lock operation and can also use the i_mmap_rwsem for file-backed vmas.
>> Remove the BUG_ON. We could, as an alternative, add a test that either
>> vma->vm_mm->mmap_sem or vma->vm_file->f_mapping->i_mmap_rwsem are held.
>>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: Matthew Wilcox <willy@infradead.org>
>> Cc: Will Deacon <will.deacon@arm.com>
>> Cc: Peter Zijlstra <peterz@infradead.org>
>> Cc: Rik van Riel <riel@surriel.com>
>> Cc: Minchan Kim <minchan@kernel.org>
>> Cc: Michal Hocko <mhocko@suse.com>
>> Cc: Huang Ying <ying.huang@intel.com>
>> Cc: Jérôme Glisse <jglisse@redhat.com>
>> Cc: Kirill A. Shutemov <kirill@shutemov.name>
>> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
> The patch looks good to me:
>
> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>
> But I looked at the usage in pagewalk.c and it is inconsistent.  The walker
> takes ptl before calling ->pud_entry(), but not for ->pmd_entry().
>
> It should be fixed: do not take the lock before ->pud_entry(). The
> callback must take care of it.
>
> Looks like we have a single ->pud_entry() implementation in the whole kernel.
> It should be trivial to fix.
>
> Could you do this?
>
I could probably fix that. There are some comments in the patch that 
introduced that code explaining why it was done that way, but I don't 
remember offhand what the arguments were.

But there seem to be more races WRT puds. See my next email. Perhaps 
this should be fixed as part of a larger audit of the huge_pud code?

/Thomas

Patch

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 93d5cf0bc716..0b84e13e88e2 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -216,7 +216,6 @@  static inline int is_swap_pmd(pmd_t pmd)
 static inline spinlock_t *pmd_trans_huge_lock(pmd_t *pmd,
 		struct vm_area_struct *vma)
 {
-	VM_BUG_ON_VMA(!rwsem_is_locked(&vma->vm_mm->mmap_sem), vma);
 	if (is_swap_pmd(*pmd) || pmd_trans_huge(*pmd) || pmd_devmap(*pmd))
 		return __pmd_trans_huge_lock(pmd, vma);
 	else
@@ -225,7 +224,6 @@  static inline spinlock_t *pmd_trans_huge_lock(pmd_t *pmd,
 static inline spinlock_t *pud_trans_huge_lock(pud_t *pud,
 		struct vm_area_struct *vma)
 {
-	VM_BUG_ON_VMA(!rwsem_is_locked(&vma->vm_mm->mmap_sem), vma);
 	if (pud_trans_huge(*pud) || pud_devmap(*pud))
 		return __pud_trans_huge_lock(pud, vma);
 	else