
[v4,0/2] Add a seqcount between gup_fast and copy_page_range()

Message ID 0-v4-908497cf359a+4782-gup_fork_jgg@nvidia.com (mailing list archive)

Message

Jason Gunthorpe Nov. 10, 2020, 11:44 p.m. UTC
As discussed and suggested by Linus, use a seqcount to close the small race
between gup_fast and copy_page_range().

Ahmed confirms that raw_write_seqcount_begin() is the correct API to use
in this case and that it doesn't trigger any lockdep warnings.

I was able to test it using two threads, one forking and the other using
ibv_reg_mr() to trigger GUP fast. Modifying copy_page_range() to sleep
made the race window large enough to hit reliably and exercise the logic.
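
For readers unfamiliar with the pattern, the following is a minimal
user-space sketch of how a sequence counter can let a lockless reader detect
a concurrent writer. It is not the code from this series: the model_* names
are hypothetical, C11 atomics stand in for the kernel's seqcount_t, and the
mapping onto write_protect_seq, copy_page_range() and gup_fast is an
assumption made for illustration only.

```c
/* User-space model of a sequence counter; NOT the kernel's seqcount_t. */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-mm counter such as write_protect_seq. */
struct model_seqcount {
    atomic_uint sequence;   /* odd while a writer is inside its section */
};

/* Writer side: plays the role of the section opened around the page copy. */
static void model_write_begin(struct model_seqcount *s)
{
    atomic_fetch_add_explicit(&s->sequence, 1, memory_order_acq_rel);
}

static void model_write_end(struct model_seqcount *s)
{
    unsigned int prev =
        atomic_fetch_add_explicit(&s->sequence, 1, memory_order_acq_rel);

    assert(prev & 1);   /* an end must pair with an earlier begin */
    (void)prev;
}

/* Reader side: take a snapshot, then later check whether it is stale. */
static unsigned int model_read_begin(struct model_seqcount *s)
{
    return atomic_load_explicit(&s->sequence, memory_order_acquire);
}

static bool model_read_retry(struct model_seqcount *s, unsigned int start)
{
    atomic_thread_fence(memory_order_acquire);
    return atomic_load_explicit(&s->sequence, memory_order_relaxed) != start;
}

/*
 * Models a lockless fast path. Returns true if the result of the walk may
 * be kept, false if the caller should discard it and take a slow path.
 */
static bool model_fast_pin(struct model_seqcount *s,
                           void (*lockless_walk)(void *), void *arg)
{
    unsigned int start = model_read_begin(s);

    if (start & 1)
        return false;           /* writer already active: do not try */
    if (lockless_walk != NULL)
        lockless_walk(arg);     /* the unlocked work happens here */
    return !model_read_retry(s, start);
}

/* Hypothetical walk callback that simulates a fork racing with the reader. */
static void model_racing_fork(void *arg)
{
    struct model_seqcount *s = arg;

    model_write_begin(s);
    model_write_end(s);
}
```

Passing model_racing_fork as the walk callback forces a complete writer
section to run between the snapshot and the recheck, which is roughly what
widening the window with a sleep achieves in a real two-thread test.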

v4:
 - Use read_seqcount_retry() not read_seqcount_t_retry()
v3: https://lore.kernel.org/r/0-v3-7358966cab09+14e9-gup_fork_jgg@nvidia.com
 - Revise comment for write_protect_seq
 - Revise comment in copy_page_range
 - Use raw_write_seqcount_begin() not raw_write_seqcount_t_begin()
v2: https://lore.kernel.org/r/0-v2-dfe9ecdb6c74+2066-gup_fork_jgg@nvidia.com
 - Use start not addr in lockless_pages_from_mm
 - Replace unsigned long casts with using the proper variable type
 - Update comments
 - Use raw_write_seqcount_t_begin() instead of open coding
 - Update commit messages
v1: https://lore.kernel.org/r/0-v1-281e425c752f+2df-gup_fork_jgg@nvidia.com

To: linux-kernel@vger.kernel.org
To: Peter Xu <peterx@redhat.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Linux-MM <linux-mm@kvack.org>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: Kirill Shutemov <kirill@shutemov.name>
Cc: Hugh Dickins <hughd@google.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: "Ahmed S. Darwish" <a.darwish@linutronix.de>

Jason Gunthorpe (2):
  mm: reorganize internal_get_user_pages_fast()
  mm: prevent gup_fast from racing with COW during fork

 arch/x86/kernel/tboot.c    |   1 +
 drivers/firmware/efi/efi.c |   1 +
 include/linux/mm_types.h   |   8 +++
 kernel/fork.c              |   1 +
 mm/gup.c                   | 117 +++++++++++++++++++++++--------------
 mm/init-mm.c               |   1 +
 mm/memory.c                |  13 ++++-
 7 files changed, 96 insertions(+), 46 deletions(-)

Comments

Linus Torvalds Nov. 11, 2020, 6:27 p.m. UTC | #1
On Tue, Nov 10, 2020 at 3:44 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> As discussed and suggested by Linus, use a seqcount to close the small race
> between gup_fast and copy_page_range().
>
> Ahmed confirms that raw_write_seqcount_begin() is the correct API to use
> in this case and that it doesn't trigger any lockdep warnings.
>
> I was able to test it using two threads, one forking and the other using
> ibv_reg_mr() to trigger GUP fast. Modifying copy_page_range() to sleep
> made the race window large enough to hit reliably and exercise the logic.

Looks all good to me.

             Linus