| Message ID | 20240407065450.498821-1-ying.huang@intel.com (mailing list archive) |
|---|---|
| State | New |
| Series | mm,swap: add document about RCU read lock and swapoff interaction |
On 07/04/2024 07:54, Huang Ying wrote:

> During reviewing a patch to fix the race condition between
> free_swap_and_cache() and swapoff() [1], it was found that the
> document about how to prevent racing with swapoff isn't clear enough.
> Especially RCU read lock can prevent swapoff from freeing data
> structures. So, the document is added as comments.
>
> [1] https://lore.kernel.org/linux-mm/c8fe62d0-78b8-527a-5bef-ee663ccdc37a@huawei.com/
>
> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
> Cc: Ryan Roberts <ryan.roberts@arm.com>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Miaohe Lin <linmiaohe@huawei.com>
> Cc: Hugh Dickins <hughd@google.com>
> Cc: Minchan Kim <minchan@kernel.org>

LGTM!

Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
On 07.04.24 08:54, Huang Ying wrote:

> During reviewing a patch to fix the race condition between
> free_swap_and_cache() and swapoff() [1], it was found that the
> document about how to prevent racing with swapoff isn't clear enough.
> Especially RCU read lock can prevent swapoff from freeing data
> structures. So, the document is added as comments.
>
> [1] https://lore.kernel.org/linux-mm/c8fe62d0-78b8-527a-5bef-ee663ccdc37a@huawei.com/
>
> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
> Cc: Ryan Roberts <ryan.roberts@arm.com>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Miaohe Lin <linmiaohe@huawei.com>
> Cc: Hugh Dickins <hughd@google.com>
> Cc: Minchan Kim <minchan@kernel.org>

Reviewed-by: David Hildenbrand <david@redhat.com>
On 2024/4/7 14:54, Huang Ying wrote:

> During reviewing a patch to fix the race condition between
> free_swap_and_cache() and swapoff() [1], it was found that the
> document about how to prevent racing with swapoff isn't clear enough.
> Especially RCU read lock can prevent swapoff from freeing data
> structures. So, the document is added as comments.
>
> [1] https://lore.kernel.org/linux-mm/c8fe62d0-78b8-527a-5bef-ee663ccdc37a@huawei.com/
>
> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
> Cc: Ryan Roberts <ryan.roberts@arm.com>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Miaohe Lin <linmiaohe@huawei.com>
> Cc: Hugh Dickins <hughd@google.com>
> Cc: Minchan Kim <minchan@kernel.org>

Thanks for your work.

Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
```diff
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 4919423cce76..6925462406fa 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1226,16 +1226,15 @@ static unsigned char __swap_entry_free_locked(struct swap_info_struct *p,
 
 /*
  * When we get a swap entry, if there aren't some other ways to
- * prevent swapoff, such as the folio in swap cache is locked, page
- * table lock is held, etc., the swap entry may become invalid because
- * of swapoff. Then, we need to enclose all swap related functions
- * with get_swap_device() and put_swap_device(), unless the swap
- * functions call get/put_swap_device() by themselves.
+ * prevent swapoff, such as the folio in swap cache is locked, RCU
+ * reader side is locked, etc., the swap entry may become invalid
+ * because of swapoff. Then, we need to enclose all swap related
+ * functions with get_swap_device() and put_swap_device(), unless the
+ * swap functions call get/put_swap_device() by themselves.
  *
- * Note that when only holding the PTL, swapoff might succeed immediately
- * after freeing a swap entry. Therefore, immediately after
- * __swap_entry_free(), the swap info might become stale and should not
- * be touched without a prior get_swap_device().
+ * RCU reader side lock (including any spinlock) is sufficient to
+ * prevent swapoff, because synchronize_rcu() is called in swapoff()
+ * before freeing data structures.
  *
  * Check whether swap entry is valid in the swap device. If so,
  * return pointer to swap_info_struct, and keep the swap entry valid
@@ -2495,10 +2494,11 @@ SYSCALL_DEFINE1(swapoff, const char __user *, specialfile)
 
 	/*
 	 * Wait for swap operations protected by get/put_swap_device()
-	 * to complete.
-	 *
-	 * We need synchronize_rcu() here to protect the accessing to
-	 * the swap cache data structure.
+	 * to complete. Because of synchronize_rcu() here, all swap
+	 * operations protected by RCU reader side lock (including any
+	 * spinlock) will be waited too. This makes it easy to
+	 * prevent folio_test_swapcache() and the following swap cache
+	 * operations from racing with swapoff.
 	 */
 	percpu_ref_kill(&p->users);
 	synchronize_rcu();
```
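For readers unfamiliar with the usage rule the updated comment describes, here is a minimal sketch (not part of the patch) of a caller that has no other protection against swapoff and therefore brackets its swap operations with get_swap_device()/put_swap_device(). The function name and its body are hypothetical; only get_swap_device() and put_swap_device() are taken from the kernel, and the kernel-internal swap headers used by mm/swapfile.c are assumed to be available.

```c
/*
 * Hypothetical example, not from the patch: bracketing swap operations
 * with get_swap_device()/put_swap_device() when nothing else (folio
 * lock, RCU read-side section, ...) prevents a concurrent swapoff.
 */
static bool example_swap_entry_usable(swp_entry_t entry)
{
	struct swap_info_struct *si;

	si = get_swap_device(entry);
	if (!si)
		return false;	/* swapoff already ran, or the entry is bogus */

	/*
	 * Between get_swap_device() and put_swap_device() the swap
	 * device cannot finish swapoff, so si and the device's data
	 * structures may be accessed safely here.
	 */

	put_swap_device(si);
	return true;
}
```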
While reviewing a patch to fix the race condition between
free_swap_and_cache() and swapoff() [1], it was found that the
documentation about how to prevent racing with swapoff isn't clear
enough. In particular, the fact that an RCU read lock can prevent
swapoff from freeing data structures was not documented. So, add that
documentation as comments.

[1] https://lore.kernel.org/linux-mm/c8fe62d0-78b8-527a-5bef-ee663ccdc37a@huawei.com/

Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Minchan Kim <minchan@kernel.org>
---
 mm/swapfile.c | 26 +++++++++++++-------------
 1 file changed, 13 insertions(+), 13 deletions(-)
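As a complement, the RCU guarantee the patch documents (swapoff() calls synchronize_rcu() after percpu_ref_kill(&p->users) and before freeing its data structures, so a pre-existing RCU read-side critical section, including any spinlocked and thus non-preemptible section, holds the teardown off) can be pictured with the following hypothetical reader. The function name, the swap_map access, and the assumption that si and entry were obtained while the entry was known valid are illustrative and not taken from the patch.

```c
/*
 * Hypothetical illustration of the RCU-based guarantee documented by
 * the patch: swapoff()'s synchronize_rcu() cannot return while a
 * pre-existing RCU read-side critical section is running, so the swap
 * metadata is not freed underneath it.  Assumes si/entry were obtained
 * while the entry was known to be valid.
 */
static unsigned char example_peek_swap_count(struct swap_info_struct *si,
					     swp_entry_t entry)
{
	unsigned char count;

	rcu_read_lock();
	/*
	 * swapoff() must wait in synchronize_rcu() until we leave this
	 * section, so si->swap_map stays allocated here.  Whether the
	 * entry is still in use must still be re-checked by the caller.
	 */
	count = READ_ONCE(si->swap_map[swp_offset(entry)]);
	rcu_read_unlock();

	return count;
}
```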