Message ID | 20240320001442.497813-1-jaegeuk@kernel.org (mailing list archive) |
---|---|
State | New |
Series | [f2fs-dev] f2fs: avoid the deadlock case when stopping discard thread |
On 2024/3/20 8:14, Jaegeuk Kim wrote:
> f2fs_ioc_shutdown(F2FS_GOING_DOWN_NOSYNC)  issue_discard_thread
>  - mnt_want_write_file()
>    - sb_start_write(SB_FREEZE_WRITE)
>                                              - sb_start_intwrite(SB_FREEZE_FS);
>    - f2fs_stop_checkpoint(sbi, false,            : waiting
>       STOP_CP_REASON_SHUTDOWN);
>    - f2fs_stop_discard_thread(sbi);
>      - kthread_stop()
>        : waiting
>
>  - mnt_drop_write_file(filp);
>
> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

Reviewed-by: Chao Yu <chao@kernel.org>

Thanks,
On Tue, 19 Mar 2024 17:14:42 -0700 Jaegeuk Kim <jaegeuk@kernel.org>
> f2fs_ioc_shutdown(F2FS_GOING_DOWN_NOSYNC)  issue_discard_thread
>  - mnt_want_write_file()
>    - sb_start_write(SB_FREEZE_WRITE)

	__sb_start_write()
	  percpu_down_read()

>                                              - sb_start_intwrite(SB_FREEZE_FS);

	__sb_start_write()
	  percpu_down_read()

Given lock acquirers for read on both sides, wtf deadlock are you fixing?

>    - f2fs_stop_checkpoint(sbi, false,            : waiting
>       STOP_CP_REASON_SHUTDOWN);
>    - f2fs_stop_discard_thread(sbi);
>      - kthread_stop()
>        : waiting
>
>  - mnt_drop_write_file(filp);

More important, feel free to add in spin.

Reported-by: "Light Hsieh (謝明燈)" <Light.Hsieh@mediatek.com>
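Hillf's objection can be illustrated with a small user-space model. This is only a sketch: `toy_rwsem` and every `toy_*` helper are hypothetical names invented here, not kernel API, and the model ignores blocking and per-CPU details entirely. It assumes, as in mainline, that each freeze level has its own semaphore. It shows that two read-side acquisitions never exclude each other, and that only a write-side holder (the freeze path) conflicts with them.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a reader/writer semaphore; NOT percpu_rw_semaphore. */
struct toy_rwsem {
	int readers;	/* number of current read holders */
	bool writer;	/* true while a writer holds the lock */
};

/* Readers are admitted unless a writer holds the lock. */
static bool toy_down_read_trylock(struct toy_rwsem *sem)
{
	if (sem->writer)
		return false;
	sem->readers++;
	return true;
}

static void toy_up_read(struct toy_rwsem *sem)
{
	assert(sem->readers > 0);
	sem->readers--;
}

/* A writer needs the lock free of readers and of other writers. */
static bool toy_down_write_trylock(struct toy_rwsem *sem)
{
	if (sem->writer || sem->readers > 0)
		return false;
	sem->writer = true;
	return true;
}

/* Returns 0 when the model behaves as argued in the thread. */
int toy_readers_do_not_conflict(void)
{
	/* Assumed layout: [0] ~ SB_FREEZE_WRITE, [1] ~ SB_FREEZE_FS. */
	struct toy_rwsem level[2] = { { 0, false }, { 0, false } };

	/* ioctl side: sb_start_write() takes the WRITE level for read. */
	if (!toy_down_read_trylock(&level[0]))
		return 1;
	/* discard thread: sb_start_intwrite() takes the FS level for read. */
	if (!toy_down_read_trylock(&level[1]))
		return 2;
	/* Even on the same level, a second reader is still admitted. */
	if (!toy_down_read_trylock(&level[1]))
		return 3;
	/* Only a would-be writer is refused while readers are present. */
	if (toy_down_write_trylock(&level[1]))
		return 4;

	toy_up_read(&level[1]);
	toy_up_read(&level[1]);
	toy_up_read(&level[0]);
	return 0;
}
```

Calling `toy_readers_do_not_conflict()` from a `main()` returns 0 in this model, which matches the point that the two call chains in the diagram cannot block each other on their own.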
On 03/22, Hillf Danton wrote:
> On Tue, 19 Mar 2024 17:14:42 -0700 Jaegeuk Kim <jaegeuk@kernel.org>
> > f2fs_ioc_shutdown(F2FS_GOING_DOWN_NOSYNC)  issue_discard_thread
> >  - mnt_want_write_file()
> >    - sb_start_write(SB_FREEZE_WRITE)
> 	__sb_start_write()
> 	  percpu_down_read()
> >                                              - sb_start_intwrite(SB_FREEZE_FS);
> 	__sb_start_write()
> 	  percpu_down_read()
>
> Given lock acquirers for read on both sides, wtf deadlock are you fixing?

Damn. I couldn't think _write uses _read sem.

> >    - f2fs_stop_checkpoint(sbi, false,            : waiting
> >       STOP_CP_REASON_SHUTDOWN);
> >    - f2fs_stop_discard_thread(sbi);
> >      - kthread_stop()
> >        : waiting
> >
> >  - mnt_drop_write_file(filp);
>
> More important, feel free to add in spin.

I posted this patch before Light reported.

And, in the report, I didn't get this:

f2fs_ioc_shutdown() --> freeze_bdev() --> freeze_super() --> sb_wait_write(sb, SB_FREEZE_FS) --> ... ->percpu_down_write().

because f2fs_ioc_shutdown() calls f2fs_stop_discard_thread() after thaw_bdev()
like this order.

 -> freeze_bdev()
 -> thaw_bdev()
 -> f2fs_stop_discard_thread()

Am I missing something?

> Reported-by: "Light Hsieh (謝明燈)" <Light.Hsieh@mediatek.com>
On Thu, 21 Mar 2024 17:29:03 -0700 Jaegeuk Kim <jaegeuk@kernel.org>
>
> I posted this patch before Light reported.

Yeah, his report's timestamp is 2024-03-20 6:59, nearly 7 hours later, which
shows that you constructed the deadlock with nothing to do with his report.

> And, in the report, I didn't get this:
>
> f2fs_ioc_shutdown() --> freeze_bdev() --> freeze_super() --> sb_wait_write(sb, SB_FREEZE_FS) --> ... ->percpu_down_write().
>
> because f2fs_ioc_shutdown() calls f2fs_stop_discard_thread() after thaw_bdev()
> like this order.
>
>  -> freeze_bdev()
>  -> thaw_bdev()
>  -> f2fs_stop_discard_thread()
>
> Am I missing something?

Light, could you specify to help Jaegeuk understand the deadlock you reported?
On 03/22, Light Hsieh (謝明燈) wrote:
> I don't see my added log in sb_freeze_unlock(), which will invoke
> percpu_up_write() to release the write semaphore.

May I ask for more details on whether thaw_super() was called or not?

> From: Jaegeuk Kim <jaegeuk@kernel.org>
> Sent: March 22, 2024, 8:29 AM
> To: Hillf Danton <hdanton@sina.com>
> Cc: linux-kernel@vger.kernel.org; Light Hsieh (謝明燈) <Light.Hsieh@mediatek.com>; linux-f2fs-devel@lists.sourceforge.net
> Subject: Re: [PATCH] f2fs: avoid the deadlock case when stopping discard thread
On 03/22, Jaegeuk Kim wrote:
> On 03/22, Light Hsieh (謝明燈) wrote:
> > I don't see my added log in sb_freeze_unlock(), which will invoke
> > percpu_up_write() to release the write semaphore.
>
> May I ask for more details on whether thaw_super() was called or not?

Ping?
On 04/03, Light Hsieh (謝明燈) wrote:
> Our log shows that thaw_super_locked() finds that sb is read-only, so
> sb_freeze_unlock() is not invoked.
>
> static int thaw_super_locked(struct super_block *sb, enum freeze_holder who)
> {
> 	...
> 	if (sb_rdonly(sb)) {
> 		sb->s_writers.freeze_holders &= ~who;
> 		sb->s_writers.frozen = SB_UNFROZEN;
> 		wake_up_var(&sb->s_writers.frozen);
> 		goto out;
> 	}
> 	...
> 	sb_freeze_unlock(sb, SB_FREEZE_FS);
> out:
> 	deactivate_locked_super(sb);
> 	return 0;
> }

Thank you. Could you please take a look at this patch?

https://lore.kernel.org/linux-f2fs-devel/20240404195254.556896-1-jaegeuk@kernel.org/T/#u

> From: Jaegeuk Kim <jaegeuk@kernel.org>
> Sent: March 27, 2024, 12:52 AM
> To: Light Hsieh (謝明燈) <Light.Hsieh@mediatek.com>
> Cc: Hillf Danton <hdanton@sina.com>; linux-kernel@vger.kernel.org; linux-f2fs-devel@lists.sourceforge.net
> Subject: Re: Reply: [PATCH] f2fs: avoid the deadlock case when stopping discard thread
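The early exit Light points at in the thaw_super_locked() excerpt can be modeled in user space. The following is a rough sketch under stated assumptions: `toy_sb` and the `toy_*` functions are hypothetical, a single boolean stands in for the SB_FREEZE_FS semaphore being held for write, and the superblock is assumed to turn read-only between freeze and thaw (the thread does not spell out that ordering).

```c
#include <assert.h>
#include <stdbool.h>

enum toy_frozen { TOY_UNFROZEN, TOY_FROZEN };

/* Hypothetical, heavily simplified superblock state. */
struct toy_sb {
	bool rdonly;
	bool fs_sem_write_held;	/* stands in for the FS-level write lock */
	enum toy_frozen frozen;
};

void toy_freeze(struct toy_sb *sb)
{
	sb->fs_sem_write_held = true;	/* ~ percpu_down_write() */
	sb->frozen = TOY_FROZEN;
}

/* Follows the shape of the quoted thaw_super_locked() excerpt. */
void toy_thaw(struct toy_sb *sb)
{
	if (sb->rdonly) {
		sb->frozen = TOY_UNFROZEN;
		return;	/* early exit: the write lock is never released */
	}
	sb->fs_sem_write_held = false;	/* ~ sb_freeze_unlock() */
	sb->frozen = TOY_UNFROZEN;
}

/* ~ a read-side trylock: refused while the write lock is held. */
bool toy_intwrite_trylock(const struct toy_sb *sb)
{
	return !sb->fs_sem_write_held;
}

/* Returns 0 when both scenarios behave as described. */
int toy_thaw_demo(void)
{
	struct toy_sb rw = { false, false, TOY_UNFROZEN };
	struct toy_sb ro = { false, false, TOY_UNFROZEN };

	/* Normal case: freeze then thaw releases the lock. */
	toy_freeze(&rw);
	toy_thaw(&rw);
	if (!toy_intwrite_trylock(&rw))
		return 1;

	/* Assumed case: sb becomes read-only before the thaw. */
	toy_freeze(&ro);
	ro.rdonly = true;
	toy_thaw(&ro);
	if (ro.frozen != TOY_UNFROZEN)
		return 2;	/* state says "unfrozen" ... */
	if (toy_intwrite_trylock(&ro))
		return 3;	/* ... yet readers are still locked out */
	return 0;
}
```

In this model the read-only superblock ends up marked unfrozen while the lock stays write-held, so a blocking read-side acquire would wait indefinitely, which is consistent with the missing sb_freeze_unlock() log line in the report.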
On 04/12, Light Hsieh (謝明燈) wrote:
> I think 'readon' in this line may be a typo of 'reason'

Was fixed as well. Thanks.

>
> + f2fs_warn(sbi, "Stopped filesystem due to readon: %d", reason);
>
> From: Jaegeuk Kim <jaegeuk@kernel.org>
> Sent: April 5, 2024, 3:55 AM
> To: Light Hsieh (謝明燈) <Light.Hsieh@mediatek.com>
> Cc: Hillf Danton <hdanton@sina.com>; linux-kernel@vger.kernel.org; linux-f2fs-devel@lists.sourceforge.net
> Subject: Re: Reply: Reply: [PATCH] f2fs: avoid the deadlock case when stopping discard thread
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 4fd76e867e0a..088b8c48cffa 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -1923,7 +1923,9 @@ static int issue_discard_thread(void *data)
 			continue;
 		}
 
-		sb_start_intwrite(sbi->sb);
+		/* Avoid the deadlock from F2FS_GOING_DOWN_NOSYNC. */
+		if (!sb_start_intwrite_trylock(sbi->sb))
+			continue;
 
 		issued = __issue_discard_cmd(sbi, &dpolicy);
 		if (issued > 0) {
f2fs_ioc_shutdown(F2FS_GOING_DOWN_NOSYNC)  issue_discard_thread
 - mnt_want_write_file()
   - sb_start_write(SB_FREEZE_WRITE)
                                             - sb_start_intwrite(SB_FREEZE_FS);
   - f2fs_stop_checkpoint(sbi, false,            : waiting
      STOP_CP_REASON_SHUTDOWN);
   - f2fs_stop_discard_thread(sbi);
     - kthread_stop()
       : waiting

 - mnt_drop_write_file(filp);

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
---
 fs/f2fs/segment.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
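The effect of replacing sb_start_intwrite() with sb_start_intwrite_trylock() in issue_discard_thread() can be sketched with a single-threaded simulation. This is an illustration only, not the f2fs code: `toy_state`, `toy_discard_loop()` and `toy_discard_demo()` are made-up names, a blocking acquire on a stuck lock is modeled as an immediate "hung" result, and the stop request is modeled as an iteration count. It assumes the worker re-checks for a stop request on each pass of its loop.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_result { TOY_STOPPED, TOY_HUNG };

struct toy_state {
	bool sem_write_held;	/* lock is stuck held for write */
	int stop_at_iter;	/* pass at which a stop is requested */
};

/* One simulated run of a discard-style worker loop. */
enum toy_result toy_discard_loop(const struct toy_state *st,
				 bool use_trylock, int *issued)
{
	*issued = 0;
	for (int iter = 0; ; iter++) {
		if (iter >= st->stop_at_iter)	/* ~ kthread_should_stop() */
			return TOY_STOPPED;
		if (st->sem_write_held) {
			if (!use_trylock)
				return TOY_HUNG;	/* blocking acquire never returns */
			continue;	/* trylock failed: skip this pass */
		}
		(*issued)++;	/* ~ __issue_discard_cmd() under the read lock */
	}
}

/* Returns 0 when the three scenarios behave as expected. */
int toy_discard_demo(void)
{
	struct toy_state stuck = { true, 3 };
	struct toy_state fine = { false, 3 };
	int issued;

	/* Blocking acquire on a stuck lock: the stop is never observed. */
	if (toy_discard_loop(&stuck, false, &issued) != TOY_HUNG)
		return 1;
	/* Trylock on a stuck lock: no work is done, but the loop can stop. */
	if (toy_discard_loop(&stuck, true, &issued) != TOY_STOPPED || issued != 0)
		return 2;
	/* Trylock on a free lock: work proceeds until the stop request. */
	if (toy_discard_loop(&fine, true, &issued) != TOY_STOPPED || issued != 3)
		return 3;
	return 0;
}
```

The design trade-off in this sketch is that a failed trylock silently skips a round of discards; the worker gives up some work in exchange for always reaching its stop check.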