Message ID | 9b9fcdda-c347-53ee-fdbb-8a7d11cf430e@I-love.SAKURA.ne.jp (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Fri, 20 Jul 2018 19:36:23 +0900 Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> wrote: > > > > This report is stalling after mount() completed and process used remap_file_pages(). > > I think that we might need to use debug printk(). But I don't know what to examine. > > > > Andrew, can you pick up this debug printk() patch? > I guess we can get the result within one week. Sure, let's toss it in -next for a while. > >From 8f55e00b21fefffbc6abd9085ac503c52a302464 Mon Sep 17 00:00:00 2001 > From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> > Date: Fri, 20 Jul 2018 19:29:06 +0900 > Subject: [PATCH] fs/buffer.c: add debug print for __getblk_gfp() stall problem > > Among syzbot's unresolved hung task reports, 18 out of 65 reports contain > __getblk_gfp() line in the backtrace. Since there is a comment block that > says that __getblk_gfp() will lock up the machine if try_to_free_buffers() > attempt from grow_dev_page() is failing, let's start from checking whether > syzbot is hitting that case. This change will be removed after the bug is > fixed. I'm not sure that grow_dev_page() is hanging. It has often been suspected, but always is proven innocent. Lets see. > > ... > > @@ -978,6 +988,9 @@ static sector_t blkdev_max_block(struct block_device *bdev, unsigned int size) > spin_unlock(&inode->i_mapping->private_lock); > done: > ret = (block < end_block) ? 1 : -ENXIO; > +#ifdef CONFIG_DEBUG_AID_FOR_SYZBOT > + current->getblk_executed |= 0x08; > +#endif > failed: > unlock_page(page); > put_page(page); > @@ -1033,6 +1046,12 @@ static sector_t blkdev_max_block(struct block_device *bdev, unsigned int size) Something is wrong with your diff(1). That's grow_dev_page(), not blkdev_max_block(). > return NULL; > } > > +#ifdef CONFIG_DEBUG_AID_FOR_SYZBOT > + current->getblk_stamp = jiffies; AFACIT getblk_stamp didn't need to be in the task_struct - it could be a local. Doesn't matter much. > + current->getblk_executed = 0; > + current->getblk_bh_count = 0; > + current->getblk_bh_state = 0; > +#endif > for (;;) { > struct buffer_head *bh; > int ret; > @@ -1044,6 +1063,18 @@ static sector_t blkdev_max_block(struct block_device *bdev, unsigned int size) > ret = grow_buffers(bdev, block, size, gfp); > if (ret < 0) > return NULL; > + > +#ifdef CONFIG_DEBUG_AID_FOR_SYZBOT > + if (!time_after(jiffies, current->getblk_stamp + 3 * HZ)) > + continue; > + printk(KERN_ERR "%s(%u): getblk(): executed=%x bh_count=%d bh_state=%lx\n", > + current->comm, current->pid, current->getblk_executed, > + current->getblk_bh_count, current->getblk_bh_state); > + current->getblk_executed = 0; > + current->getblk_bh_count = 0; > + current->getblk_bh_state = 0; > + current->getblk_stamp = jiffies; > +#endif > } > } > > @@ -3216,6 +3247,11 @@ int sync_dirty_buffer(struct buffer_head *bh) > */ > static inline int buffer_busy(struct buffer_head *bh) > { > +#ifdef CONFIG_DEBUG_AID_FOR_SYZBOT > + current->getblk_executed |= 0x80; > + current->getblk_bh_count = atomic_read(&bh->b_count); > + current->getblk_bh_state = bh->b_state; > +#endif Some explanation of your design wouldn't have hurt. What does getblk_executed do, why were these particular fields chosen? > return atomic_read(&bh->b_count) | > (bh->b_state & ((1 << BH_Dirty) | (1 << BH_Lock))); > } > > ... >
On 2018/07/21 5:06, Andrew Morton wrote: > On Fri, 20 Jul 2018 19:36:23 +0900 Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> wrote: > >>> >>> This report is stalling after mount() completed and process used remap_file_pages(). >>> I think that we might need to use debug printk(). But I don't know what to examine. >>> >> >> Andrew, can you pick up this debug printk() patch? >> I guess we can get the result within one week. > > Sure, let's toss it in -next for a while. > >> >From 8f55e00b21fefffbc6abd9085ac503c52a302464 Mon Sep 17 00:00:00 2001 >> From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> >> Date: Fri, 20 Jul 2018 19:29:06 +0900 >> Subject: [PATCH] fs/buffer.c: add debug print for __getblk_gfp() stall problem >> >> Among syzbot's unresolved hung task reports, 18 out of 65 reports contain >> __getblk_gfp() line in the backtrace. Since there is a comment block that >> says that __getblk_gfp() will lock up the machine if try_to_free_buffers() >> attempt from grow_dev_page() is failing, let's start from checking whether >> syzbot is hitting that case. This change will be removed after the bug is >> fixed. > > I'm not sure that grow_dev_page() is hanging. It has often been > suspected, but always is proven innocent. Lets see. syzbot reproduced this problem ( https://syzkaller.appspot.com/text?tag=CrashLog&x=11f2fc44400000 ) . It says that grow_dev_page() is returning 1 but __find_get_block() is failing forever. Any idea?
On Tue 31-07-18 00:07:22, Tetsuo Handa wrote: > On 2018/07/21 5:06, Andrew Morton wrote: > > On Fri, 20 Jul 2018 19:36:23 +0900 Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> wrote: > > > >>> > >>> This report is stalling after mount() completed and process used remap_file_pages(). > >>> I think that we might need to use debug printk(). But I don't know what to examine. > >>> > >> > >> Andrew, can you pick up this debug printk() patch? > >> I guess we can get the result within one week. > > > > Sure, let's toss it in -next for a while. > > > >> >From 8f55e00b21fefffbc6abd9085ac503c52a302464 Mon Sep 17 00:00:00 2001 > >> From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> > >> Date: Fri, 20 Jul 2018 19:29:06 +0900 > >> Subject: [PATCH] fs/buffer.c: add debug print for __getblk_gfp() stall problem > >> > >> Among syzbot's unresolved hung task reports, 18 out of 65 reports contain > >> __getblk_gfp() line in the backtrace. Since there is a comment block that > >> says that __getblk_gfp() will lock up the machine if try_to_free_buffers() > >> attempt from grow_dev_page() is failing, let's start from checking whether > >> syzbot is hitting that case. This change will be removed after the bug is > >> fixed. > > > > I'm not sure that grow_dev_page() is hanging. It has often been > > suspected, but always is proven innocent. Lets see. > > syzbot reproduced this problem ( https://syzkaller.appspot.com/text?tag=CrashLog&x=11f2fc44400000 ) . > It says that grow_dev_page() is returning 1 but __find_get_block() is failing forever. Any idea? Looks like some kind of a race where device block size gets changed while getblk() runs (and creates buffers for underlying page). I don't have time to nail it down at this moment can have a look into it later unless someone beats me to it. Honza
On 2018/08/06 19:09, Jan Kara wrote: >> syzbot reproduced this problem ( https://syzkaller.appspot.com/text?tag=CrashLog&x=11f2fc44400000 ) . >> It says that grow_dev_page() is returning 1 but __find_get_block() is failing forever. Any idea? So far syzbot reproduced 7 times, and all reports are saying that grow_dev_page() is returning 1 but __find_get_block() is failing forever. INFO: task hung in __get_super https://syzkaller.appspot.com/text?tag=CrashLog&x=17347272400000 INFO: task hung in __blkdev_get (2) https://syzkaller.appspot.com/text?tag=CrashLog&x=144347fc400000 https://syzkaller.appspot.com/text?tag=CrashLog&x=12382bfc400000 INFO: task hung in __writeback_inodes_sb_nr https://syzkaller.appspot.com/text?tag=CrashLog&x=11f2fc44400000 https://syzkaller.appspot.com/text?tag=CrashLog&x=13c2449c400000 INFO: task hung in filename_create https://syzkaller.appspot.com/text?tag=CrashLog&x=174a672c400000 INFO: task hung in lookup_slow https://syzkaller.appspot.com/text?tag=CrashLog&x=16113bdc400000 > > Looks like some kind of a race where device block size gets changed while > getblk() runs (and creates buffers for underlying page). I don't have time > to nail it down at this moment can have a look into it later unless someone > beats me to it. > It seems that loop device is relevant to this problem. 144347fc400000: [ 557.484258] syz-executor0(7883): getblk(): executed=9 bh_count=0 bh_state=0 [ 560.491589] syz-executor0(7883): getblk(): executed=9 bh_count=0 bh_state=0 [ 563.500432] syz-executor0(7883): getblk(): executed=9 bh_count=0 bh_state=0 [ 566.508502] syz-executor0(7883): getblk(): executed=9 bh_count=0 bh_state=0 [ 569.516546] syz-executor0(7883): getblk(): executed=9 bh_count=0 bh_state=0 [ 572.524800] syz-executor0(7883): getblk(): executed=9 bh_count=0 bh_state=0 [ 575.532499] syz-executor0(7883): getblk(): executed=9 bh_count=0 bh_state=0 [ 575.803688] INFO: task syz-executor7:7893 blocked for more than 140 seconds. [ 575.810957] Not tainted 4.18.0-rc7-next-20180801+ #29 [ 575.816685] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 575.824663] syz-executor7 D24464 7893 4272 0x00000004 [ 575.830298] Call Trace: [ 575.832911] __schedule+0x87c/0x1ec0 [ 575.879299] schedule+0xfb/0x450 [ 575.912448] rwsem_down_read_failed+0x362/0x610 [ 575.951922] call_rwsem_down_read_failed+0x18/0x30 [ 575.956843] down_read+0xc3/0x1d0 [ 575.990222] __get_super.part.12+0x20f/0x2e0 [ 575.994654] get_super+0x2d/0x50 [ 575.998016] fsync_bdev+0x17/0xc0 [ 576.001469] invalidate_partition+0x35/0x60 [ 576.005800] drop_partitions.isra.13+0xe8/0x200 [ 576.023366] rescan_partitions+0x75/0x910 [ 576.037534] __blkdev_reread_part+0x1ad/0x230 [ 576.042023] blkdev_reread_part+0x26/0x40 [ 576.046173] loop_reread_partitions+0x163/0x190 [ 576.055795] loop_set_status+0xb95/0x1010 [ 576.059938] loop_set_status64+0xaa/0x100 [ 576.078629] lo_ioctl+0x90e/0x1d70 [ 576.095290] blkdev_ioctl+0x9cd/0x2030 [ 576.146746] block_ioctl+0xee/0x130 [ 576.154704] do_vfs_ioctl+0x1de/0x1720 [ 576.183069] ksys_ioctl+0xa9/0xd0 [ 576.186519] __x64_sys_ioctl+0x73/0xb0 [ 576.190423] do_syscall_64+0x1b9/0x820 It seems that there was loop_reread_partitions() failure before this message starts. 12382bfc400000: [ 205.467816] loop_reread_partitions: partition scan of loop0 (<E1><D3><F8>w<E5><EA><E4>S<E5>]}d<D0>^MI^A<BA><ED>!<F7><DE><92><A8>f<9B>8<CC><D6>W<DB><F5><AE>F5Eice^W<B5>^O<80>Z<E2>^H%<8D><BA>} [ 205.467816] <BE><8D> <8F>^OESC<CC><88>\<8B><A9>) failed (rc=1) [ 205.905418] loop_reread_partitions: partition scan of loop0 (<E1><D3><F8>w<E5><EA><E4>S<E5>]}d<D0>^MI^A<BA><ED>!<F7><DE><92><A8>f<9B>8<CC><D6>W<DB><F5><AE>F5Eice^W<B5>^O<80>Z<E2>^H%<8D><BA>} [ 205.905418] <BE><8D> <8F>^OESC<CC><88>\<8B><A9>) failed (rc=-16) [ 206.113592] loop_reread_partitions: partition scan of loop0 (<E1><D3><F8>w<E5><EA><E4>S<E5>]}d<D0>^MI^A<BA><ED>!<F7><DE><92><A8>f<9B>8<CC><D6>W<DB><F5><AE>F5Eice^W<B5>^O<80>Z<E2>^H%<8D><BA>} [ 206.113592] <BE><8D> <8F>^OESC<CC><88>\<8B><A9>) failed (rc=-16) [ 206.767225] loop_reread_partitions: partition scan of loop0 (<E1><D3><F8>w<E5><EA><E4>S<E5>]}d<D0>^MI^A<BA><ED>!<F7><DE><92><A8>f<9B>8<CC><D6>W<DB><F5><AE>F5Eice^W<B5>^O<80>Z<E2>^H%<8D><BA>} [ 206.767225] <BE><8D> <8F>^OESC<CC><88>\<8B><A9>) failed (rc=-16) [ 206.990060] loop_reread_partitions: partition scan of loop0 (<E1><D3><F8>w<E5><EA><E4>S<E5>]}d<D0>^MI^A<BA><ED>!<F7><DE><92><A8>f<9B>8<CC><D6>W<DB><F5><AE>F5Eice^W<B5>^O<80>Z<E2>^H%<8D><BA>} [ 206.990060] <BE><8D> <8F>^OESC<CC><88>\<8B><A9>) failed (rc=1) [ 208.630329] loop_reread_partitions: partition scan of loop0 (<E1><D3><F8>w<E5><EA><E4>S<E5>]}d<D0>^MI^A<BA><ED>!<F7><DE><92><A8>f<9B>8<CC><D6>W<DB><F5><AE>F5Eice^W<B5>^O<80>Z<E2>^H%<8D><BA>} [ 208.630329] <BE><8D> <8F>^OESC<CC><88>\<8B><A9>) failed (rc=-16) [ 228.101929] loop_reread_partitions: partition scan of loop0 (<E1><D3><F8>w<E5><EA><E4>S<E5>]}d<D0>^MI^A<BA><ED>!<F7><DE><92><A8>f<9B>8<CC><D6>W<DB><F5><AE>F5Eice^W<B5>^O<80>Z<E2>^H%<8D><BA>} [ 228.101929] <BE><8D> <8F>^OESC<CC><88>\<8B><A9>) failed (rc=-16) [ 228.605840] loop_reread_partitions: partition scan of loop0 (<E1><D3><F8>w<E5><EA><E4>S<E5>]}d<D0>^MI^A<BA><ED>!<F7><DE><92><A8>f<9B>8<CC><D6>W<DB><F5><AE>F5Eice^W<B5>^O<80>Z<E2>^H%<8D><BA>} [ 228.605840] <BE><8D> <8F>^OESC<CC><88>\<8B><A9>) failed (rc=1) [ 229.400629] loop_reread_partitions: partition scan of loop0 (<E1><D3><F8>w<E5><EA><E4>S<E5>]}d<D0>^MI^A<BA><ED>!<F7><DE><92><A8>f<9B>8<CC><D6>W<DB><F5><AE>F5Eice^W<B5>^O<80>Z<E2>^H%<8D><BA>} [ 229.400629] <BE><8D> <8F>^OESC<CC><88>\<8B><A9>) failed (rc=1) [ 229.622379] loop_reread_partitions: partition scan of loop0 (<E1><D3><F8>w<E5><EA><E4>S<E5>]}d<D0>^MI^A<BA><ED>!<F7><DE><92><A8>f<9B>8<CC><D6>W<DB><F5><AE>F5Eice^W<B5>^O<80>Z<E2>^H%<8D><BA>} [ 229.622379] <BE><8D> <8F>^OESC<CC><88>\<8B><A9>) failed (rc=1) [ 233.406342] syz-executor0(18214): getblk(): executed=9 bh_count=0 bh_state=0 [ 236.414328] syz-executor0(18214): getblk(): executed=9 bh_count=0 bh_state=0 [ 239.422327] syz-executor0(18214): getblk(): executed=9 bh_count=0 bh_state=0 [ 242.430332] syz-executor0(18214): getblk(): executed=9 bh_count=0 bh_state=0 (...snipped...) [ 416.894327] syz-executor0(18214): getblk(): executed=9 bh_count=0 bh_state=0 [ 419.902327] syz-executor0(18214): getblk(): executed=9 bh_count=0 bh_state=0 [ 422.910325] syz-executor0(18214): getblk(): executed=9 bh_count=0 bh_state=0 [ 425.918329] syz-executor0(18214): getblk(): executed=9 bh_count=0 bh_state=0 [ 431.005871] NMI backtrace for cpu 0 [ 431.005876] CPU: 0 PID: 18214 Comm: syz-executor0 Not tainted 4.18.0-rc6-next-20180725+ #18 [ 431.005881] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 [ 431.005884] RIP: 0010:write_comp_data+0x14/0x70 [ 431.005893] Code: 4c 89 ef e8 1e 93 3e 00 e9 76 fc ff ff e8 b4 9c ca ff 90 90 90 90 55 65 4c 8b 04 25 40 ee 01 00 65 8b 05 2f 46 85 7e 48 89 e5 <a9> 00 01 1f 00 75 51 41 8b 80 98 12 00 00 83 f8 03 75 45 49 8b 80 [ 431.005896] RSP: 0018:ffff880192487130 EFLAGS: 00000046 [ 431.005903] RAX: 0000000080000000 RBX: ffff880168426b28 RCX: ffffffff81d704dd [ 431.005906] RDX: 000000000003085f RSI: 000000000000000d RDI: 0000000000000006 [ 431.005910] RBP: ffff880192487130 R08: ffff8801d9696200 R09: fffff94000af6a66 [ 431.005914] R10: fffff94000af6a66 R11: ffffea00057b5337 R12: 0000000000000006 [ 431.005918] R13: dffffc0000000000 R14: 0000000000000006 R15: 000000000003085f [ 431.005923] FS: 00007f6c291f1700(0000) GS:ffff8801db000000(0000) knlGS:0000000000000000 [ 431.005926] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 431.005930] CR2: ffffffffff600400 CR3: 00000001c5ff4000 CR4: 00000000001406f0 [ 431.005935] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 431.005938] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 431.005941] Call Trace: [ 431.005944] __sanitizer_cov_trace_cmp8+0x18/0x20 [ 431.005946] __find_get_block+0x1bd/0xe60 [ 431.005969] __getblk_gfp+0x3dc/0xe00 [ 431.005996] __bread_gfp+0x2d/0x310 [ 431.005999] fat__get_entry+0x59c/0xa30 [ 431.006017] fat_get_short_entry+0x13c/0x2c0 [ 431.006020] fat_subdirs+0x13c/0x290 [ 431.006043] fat_fill_super+0x29f6/0x4430 [ 431.006068] vfat_fill_super+0x31/0x40 [ 431.006070] mount_bdev+0x314/0x3e0 [ 431.006075] vfat_mount+0x3c/0x50 [ 431.006080] legacy_get_tree+0x131/0x460 [ 431.006083] vfs_get_tree+0x1cb/0x5c0 [ 431.006088] do_mount+0x6f2/0x1e20 [ 431.006113] ksys_mount+0x12d/0x140 [ 431.006116] __x64_sys_mount+0xbe/0x150 [ 431.006118] do_syscall_64+0x1b9/0x820 [ 431.006143] entry_SYSCALL_64_after_hwframe+0x49/0xbe There were a lot of directory bread failure messages before this message starts. 17347272400000: [ 431.497237] FAT-fs (loop3): Directory bread(block 2574) failed [ 431.505561] FAT-fs (loop3): Directory bread(block 2575) failed [ 431.516048] FAT-fs (loop3): Directory bread(block 2576) failed [ 431.525740] FAT-fs (loop3): Directory bread(block 2577) failed [ 431.535899] FAT-fs (loop3): Directory bread(block 2578) failed [ 431.636001] FAT-fs (loop5): bogus number of reserved sectors [ 431.641910] FAT-fs (loop5): Can't find a valid FAT filesystem [ 434.145080] syz-executor0(7883): getblk(): executed=9 bh_count=0 bh_state=0 [ 437.152496] syz-executor0(7883): getblk(): executed=9 bh_count=0 bh_state=0 [ 440.161598] syz-executor0(7883): getblk(): executed=9 bh_count=0 bh_state=0 [ 443.169498] syz-executor0(7883): getblk(): executed=9 bh_count=0 bh_state=0 (...snipped...) [ 563.500432] syz-executor0(7883): getblk(): executed=9 bh_count=0 bh_state=0 [ 566.508502] syz-executor0(7883): getblk(): executed=9 bh_count=0 bh_state=0 [ 569.516546] syz-executor0(7883): getblk(): executed=9 bh_count=0 bh_state=0 [ 572.524800] syz-executor0(7883): getblk(): executed=9 bh_count=0 bh_state=0 [ 575.532499] syz-executor0(7883): getblk(): executed=9 bh_count=0 bh_state=0 [ 575.803688] INFO: task syz-executor7:7893 blocked for more than 140 seconds. [ 575.810957] Not tainted 4.18.0-rc7-next-20180801+ #29 [ 575.816685] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 575.824663] syz-executor7 D24464 7893 4272 0x00000004 [ 575.830298] Call Trace: [ 575.832911] __schedule+0x87c/0x1ec0 [ 575.879299] schedule+0xfb/0x450 [ 575.912448] rwsem_down_read_failed+0x362/0x610 [ 575.951922] call_rwsem_down_read_failed+0x18/0x30 [ 575.956843] down_read+0xc3/0x1d0 [ 575.990222] __get_super.part.12+0x20f/0x2e0 [ 575.994654] get_super+0x2d/0x50 [ 575.998016] fsync_bdev+0x17/0xc0 [ 576.001469] invalidate_partition+0x35/0x60 [ 576.005800] drop_partitions.isra.13+0xe8/0x200 [ 576.023366] rescan_partitions+0x75/0x910 [ 576.037534] __blkdev_reread_part+0x1ad/0x230 [ 576.042023] blkdev_reread_part+0x26/0x40 [ 576.046173] loop_reread_partitions+0x163/0x190 [ 576.055795] loop_set_status+0xb95/0x1010 [ 576.059938] loop_set_status64+0xaa/0x100 [ 576.078629] lo_ioctl+0x90e/0x1d70 [ 576.095290] blkdev_ioctl+0x9cd/0x2030 [ 576.146746] block_ioctl+0xee/0x130 [ 576.154704] do_vfs_ioctl+0x1de/0x1720 [ 576.183069] ksys_ioctl+0xa9/0xd0 [ 576.186519] __x64_sys_ioctl+0x73/0xb0 [ 576.190423] do_syscall_64+0x1b9/0x820 [ 576.246994] entry_SYSCALL_64_after_hwframe+0x49/0xbe
On 2018/08/06 20:56, Tetsuo Handa wrote: > On 2018/08/06 19:09, Jan Kara wrote: >> Looks like some kind of a race where device block size gets changed while >> getblk() runs (and creates buffers for underlying page). I don't have time >> to nail it down at this moment can have a look into it later unless someone >> beats me to it. >> > > It seems that loop device is relevant to this problem. Speak of loop device, I'm waiting for Jens for three months http://lkml.kernel.org/r/1527297408-4428-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp but Jens is too busy to come up with an alternative. Since that is a big patch, I wish we can start testing that patch before Jan starts writing a patch for fixing getblk() race problem.
On 2018/08/06 19:09, Jan Kara wrote: > On Tue 31-07-18 00:07:22, Tetsuo Handa wrote: >> On 2018/07/21 5:06, Andrew Morton wrote: >>> On Fri, 20 Jul 2018 19:36:23 +0900 Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> wrote: >>> >>>>> >>>>> This report is stalling after mount() completed and process used remap_file_pages(). >>>>> I think that we might need to use debug printk(). But I don't know what to examine. >>>>> >>>> >>>> Andrew, can you pick up this debug printk() patch? >>>> I guess we can get the result within one week. >>> >>> Sure, let's toss it in -next for a while. >>> >>>> >From 8f55e00b21fefffbc6abd9085ac503c52a302464 Mon Sep 17 00:00:00 2001 >>>> From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> >>>> Date: Fri, 20 Jul 2018 19:29:06 +0900 >>>> Subject: [PATCH] fs/buffer.c: add debug print for __getblk_gfp() stall problem >>>> >>>> Among syzbot's unresolved hung task reports, 18 out of 65 reports contain >>>> __getblk_gfp() line in the backtrace. Since there is a comment block that >>>> says that __getblk_gfp() will lock up the machine if try_to_free_buffers() >>>> attempt from grow_dev_page() is failing, let's start from checking whether >>>> syzbot is hitting that case. This change will be removed after the bug is >>>> fixed. >>> >>> I'm not sure that grow_dev_page() is hanging. It has often been >>> suspected, but always is proven innocent. Lets see. >> >> syzbot reproduced this problem ( https://syzkaller.appspot.com/text?tag=CrashLog&x=11f2fc44400000 ) . >> It says that grow_dev_page() is returning 1 but __find_get_block() is failing forever. Any idea? > > Looks like some kind of a race where device block size gets changed while > getblk() runs (and creates buffers for underlying page). I don't have time > to nail it down at this moment can have a look into it later unless someone > beats me to it. I feel that the frequency of hitting this problem was decreased by merging loop module's ioctl() serialization patches. But this problem is still there, and syzbot got a new line in https://syzkaller.appspot.com/text?tag=CrashLog&x=177f889f400000 . [ 615.881781] __loop_clr_fd: partition scan of loop5 failed (rc=-22) [ 619.059920] syz-executor4(2193): getblk(): executed=cd bh_count=0 bh_state=29 [ 622.069808] syz-executor4(2193): getblk(): executed=9 bh_count=0 bh_state=0 [ 625.080013] syz-executor4(2193): getblk(): executed=9 bh_count=0 bh_state=0 [ 628.089900] syz-executor4(2193): getblk(): executed=9 bh_count=0 bh_state=0 I guess that loop module is somehow related to this problem.
On Fri 28-12-18 22:34:13, Tetsuo Handa wrote: > On 2018/08/06 19:09, Jan Kara wrote: > > On Tue 31-07-18 00:07:22, Tetsuo Handa wrote: > >> On 2018/07/21 5:06, Andrew Morton wrote: > >>> On Fri, 20 Jul 2018 19:36:23 +0900 Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> wrote: > >>> > >>>>> > >>>>> This report is stalling after mount() completed and process used remap_file_pages(). > >>>>> I think that we might need to use debug printk(). But I don't know what to examine. > >>>>> > >>>> > >>>> Andrew, can you pick up this debug printk() patch? > >>>> I guess we can get the result within one week. > >>> > >>> Sure, let's toss it in -next for a while. > >>> > >>>> >From 8f55e00b21fefffbc6abd9085ac503c52a302464 Mon Sep 17 00:00:00 2001 > >>>> From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> > >>>> Date: Fri, 20 Jul 2018 19:29:06 +0900 > >>>> Subject: [PATCH] fs/buffer.c: add debug print for __getblk_gfp() stall problem > >>>> > >>>> Among syzbot's unresolved hung task reports, 18 out of 65 reports contain > >>>> __getblk_gfp() line in the backtrace. Since there is a comment block that > >>>> says that __getblk_gfp() will lock up the machine if try_to_free_buffers() > >>>> attempt from grow_dev_page() is failing, let's start from checking whether > >>>> syzbot is hitting that case. This change will be removed after the bug is > >>>> fixed. > >>> > >>> I'm not sure that grow_dev_page() is hanging. It has often been > >>> suspected, but always is proven innocent. Lets see. > >> > >> syzbot reproduced this problem ( https://syzkaller.appspot.com/text?tag=CrashLog&x=11f2fc44400000 ) . > >> It says that grow_dev_page() is returning 1 but __find_get_block() is failing forever. Any idea? > > > > Looks like some kind of a race where device block size gets changed while > > getblk() runs (and creates buffers for underlying page). I don't have time > > to nail it down at this moment can have a look into it later unless someone > > beats me to it. > > I feel that the frequency of hitting this problem was decreased > by merging loop module's ioctl() serialization patches. But this > problem is still there, and syzbot got a new line in > https://syzkaller.appspot.com/text?tag=CrashLog&x=177f889f400000 . > > [ 615.881781] __loop_clr_fd: partition scan of loop5 failed (rc=-22) > [ 619.059920] syz-executor4(2193): getblk(): executed=cd bh_count=0 bh_state=29 > [ 622.069808] syz-executor4(2193): getblk(): executed=9 bh_count=0 bh_state=0 > [ 625.080013] syz-executor4(2193): getblk(): executed=9 bh_count=0 bh_state=0 > [ 628.089900] syz-executor4(2193): getblk(): executed=9 bh_count=0 bh_state=0 > > I guess that loop module is somehow related to this problem. I had a look into this and the only good explanation for this I have is that sb->s_blocksize is different from (1 << sb->s_bdev->bd_inode->i_blkbits). If that would happen, we'd get exactly the behavior syzkaller observes because grow_buffers() would populate different page than __find_get_block() then looks up. However I don't see how that's possible since the filesystem has the block device open exclusively and blkdev_bszset() makes sure we also have exclusive access to the block device before changing the block device size. So changing block device block size after filesystem gets access to the device should be impossible. Anyway, could you perhaps add to your debug patch a dump of 'size' passed to __getblk_slow() and bdev->bd_inode->i_blkbits? That should tell us whether my theory is right or not. Thanks! Honza
On Wed, Jan 2, 2019 at 3:40 PM Jan Kara <jack@suse.cz> wrote: > > On Fri 28-12-18 22:34:13, Tetsuo Handa wrote: > > On 2018/08/06 19:09, Jan Kara wrote: > > > On Tue 31-07-18 00:07:22, Tetsuo Handa wrote: > > >> On 2018/07/21 5:06, Andrew Morton wrote: > > >>> On Fri, 20 Jul 2018 19:36:23 +0900 Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> wrote: > > >>> > > >>>>> > > >>>>> This report is stalling after mount() completed and process used remap_file_pages(). > > >>>>> I think that we might need to use debug printk(). But I don't know what to examine. > > >>>>> > > >>>> > > >>>> Andrew, can you pick up this debug printk() patch? > > >>>> I guess we can get the result within one week. > > >>> > > >>> Sure, let's toss it in -next for a while. > > >>> > > >>>> >From 8f55e00b21fefffbc6abd9085ac503c52a302464 Mon Sep 17 00:00:00 2001 > > >>>> From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> > > >>>> Date: Fri, 20 Jul 2018 19:29:06 +0900 > > >>>> Subject: [PATCH] fs/buffer.c: add debug print for __getblk_gfp() stall problem > > >>>> > > >>>> Among syzbot's unresolved hung task reports, 18 out of 65 reports contain > > >>>> __getblk_gfp() line in the backtrace. Since there is a comment block that > > >>>> says that __getblk_gfp() will lock up the machine if try_to_free_buffers() > > >>>> attempt from grow_dev_page() is failing, let's start from checking whether > > >>>> syzbot is hitting that case. This change will be removed after the bug is > > >>>> fixed. > > >>> > > >>> I'm not sure that grow_dev_page() is hanging. It has often been > > >>> suspected, but always is proven innocent. Lets see. > > >> > > >> syzbot reproduced this problem ( https://syzkaller.appspot.com/text?tag=CrashLog&x=11f2fc44400000 ) . > > >> It says that grow_dev_page() is returning 1 but __find_get_block() is failing forever. Any idea? > > > > > > Looks like some kind of a race where device block size gets changed while > > > getblk() runs (and creates buffers for underlying page). I don't have time > > > to nail it down at this moment can have a look into it later unless someone > > > beats me to it. > > > > I feel that the frequency of hitting this problem was decreased > > by merging loop module's ioctl() serialization patches. But this > > problem is still there, and syzbot got a new line in > > https://syzkaller.appspot.com/text?tag=CrashLog&x=177f889f400000 . > > > > [ 615.881781] __loop_clr_fd: partition scan of loop5 failed (rc=-22) > > [ 619.059920] syz-executor4(2193): getblk(): executed=cd bh_count=0 bh_state=29 > > [ 622.069808] syz-executor4(2193): getblk(): executed=9 bh_count=0 bh_state=0 > > [ 625.080013] syz-executor4(2193): getblk(): executed=9 bh_count=0 bh_state=0 > > [ 628.089900] syz-executor4(2193): getblk(): executed=9 bh_count=0 bh_state=0 > > > > I guess that loop module is somehow related to this problem. > > I had a look into this and the only good explanation for this I have is > that sb->s_blocksize is different from (1 << sb->s_bdev->bd_inode->i_blkbits). > If that would happen, we'd get exactly the behavior syzkaller observes > because grow_buffers() would populate different page than > __find_get_block() then looks up. > > However I don't see how that's possible since the filesystem has the block > device open exclusively and blkdev_bszset() makes sure we also have > exclusive access to the block device before changing the block device size. > So changing block device block size after filesystem gets access to the > device should be impossible. If this is that critical and impossible to fire, maybe it makes sense to add a corresponding debug check to some code paths? syzkaller will immediately catch any violations if they happen. > Anyway, could you perhaps add to your debug patch a dump of 'size' passed > to __getblk_slow() and bdev->bd_inode->i_blkbits? That should tell us > whether my theory is right or not. Thanks!
On 2019/01/02 23:40, Jan Kara wrote: > I had a look into this and the only good explanation for this I have is > that sb->s_blocksize is different from (1 << sb->s_bdev->bd_inode->i_blkbits). > If that would happen, we'd get exactly the behavior syzkaller observes > because grow_buffers() would populate different page than > __find_get_block() then looks up. > > However I don't see how that's possible since the filesystem has the block > device open exclusively and blkdev_bszset() makes sure we also have > exclusive access to the block device before changing the block device size. > So changing block device block size after filesystem gets access to the > device should be impossible. > > Anyway, could you perhaps add to your debug patch a dump of 'size' passed > to __getblk_slow() and bdev->bd_inode->i_blkbits? That should tell us > whether my theory is right or not. Thanks! > OK. Andrew, will you add (or fold into) this change? From e6f334380ad2c87457bfc2a4058316c47f75824a Mon Sep 17 00:00:00 2001 From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Date: Thu, 3 Jan 2019 01:03:35 +0900 Subject: [PATCH] fs/buffer.c: dump more info for __getblk_gfp() stall problem We need to dump more variables on top of "fs/buffer.c: add debug print for __getblk_gfp() stall problem". Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Jan Kara <jack@suse.cz> --- fs/buffer.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/fs/buffer.c b/fs/buffer.c index 580fda0..a50acac 100644 --- a/fs/buffer.c +++ b/fs/buffer.c @@ -1066,9 +1066,14 @@ static sector_t blkdev_max_block(struct block_device *bdev, unsigned int size) #ifdef CONFIG_DEBUG_AID_FOR_SYZBOT if (!time_after(jiffies, current->getblk_stamp + 3 * HZ)) continue; - printk(KERN_ERR "%s(%u): getblk(): executed=%x bh_count=%d bh_state=%lx\n", + printk(KERN_ERR "%s(%u): getblk(): executed=%x bh_count=%d bh_state=%lx " + "bdev_super_blocksize=%lu size=%u " + "bdev_super_blocksize_bits=%u bdev_inode_blkbits=%u\n", current->comm, current->pid, current->getblk_executed, - current->getblk_bh_count, current->getblk_bh_state); + current->getblk_bh_count, current->getblk_bh_state, + bdev->bd_super->s_blocksize, size, + bdev->bd_super->s_blocksize_bits, + bdev->bd_inode->i_blkbits); current->getblk_executed = 0; current->getblk_bh_count = 0; current->getblk_bh_state = 0;
On Thu 03-01-19 01:07:25, Tetsuo Handa wrote: > On 2019/01/02 23:40, Jan Kara wrote: > > I had a look into this and the only good explanation for this I have is > > that sb->s_blocksize is different from (1 << sb->s_bdev->bd_inode->i_blkbits). > > If that would happen, we'd get exactly the behavior syzkaller observes > > because grow_buffers() would populate different page than > > __find_get_block() then looks up. > > > > However I don't see how that's possible since the filesystem has the block > > device open exclusively and blkdev_bszset() makes sure we also have > > exclusive access to the block device before changing the block device size. > > So changing block device block size after filesystem gets access to the > > device should be impossible. > > > > Anyway, could you perhaps add to your debug patch a dump of 'size' passed > > to __getblk_slow() and bdev->bd_inode->i_blkbits? That should tell us > > whether my theory is right or not. Thanks! > > > > OK. Andrew, will you add (or fold into) this change? > > From e6f334380ad2c87457bfc2a4058316c47f75824a Mon Sep 17 00:00:00 2001 > From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> > Date: Thu, 3 Jan 2019 01:03:35 +0900 > Subject: [PATCH] fs/buffer.c: dump more info for __getblk_gfp() stall problem > > We need to dump more variables on top of > "fs/buffer.c: add debug print for __getblk_gfp() stall problem". > > Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> > Cc: Jan Kara <jack@suse.cz> > --- > fs/buffer.c | 9 +++++++-- > 1 file changed, 7 insertions(+), 2 deletions(-) > > diff --git a/fs/buffer.c b/fs/buffer.c > index 580fda0..a50acac 100644 > --- a/fs/buffer.c > +++ b/fs/buffer.c > @@ -1066,9 +1066,14 @@ static sector_t blkdev_max_block(struct block_device *bdev, unsigned int size) > #ifdef CONFIG_DEBUG_AID_FOR_SYZBOT > if (!time_after(jiffies, current->getblk_stamp + 3 * HZ)) > continue; > - printk(KERN_ERR "%s(%u): getblk(): executed=%x bh_count=%d bh_state=%lx\n", > + printk(KERN_ERR "%s(%u): getblk(): executed=%x bh_count=%d bh_state=%lx " > + "bdev_super_blocksize=%lu size=%u " > + "bdev_super_blocksize_bits=%u bdev_inode_blkbits=%u\n", > current->comm, current->pid, current->getblk_executed, > - current->getblk_bh_count, current->getblk_bh_state); > + current->getblk_bh_count, current->getblk_bh_state, > + bdev->bd_super->s_blocksize, size, > + bdev->bd_super->s_blocksize_bits, > + bdev->bd_inode->i_blkbits); Well, bd_super may be NULL if there's no filesystem mounted so it would be safer to check for this rather than blindly dereferencing it... Otherwise the change looks good to me. Honza
On 2019/01/03 2:26, Jan Kara wrote: > On Thu 03-01-19 01:07:25, Tetsuo Handa wrote: >> On 2019/01/02 23:40, Jan Kara wrote: >>> I had a look into this and the only good explanation for this I have is >>> that sb->s_blocksize is different from (1 << sb->s_bdev->bd_inode->i_blkbits). >>> If that would happen, we'd get exactly the behavior syzkaller observes >>> because grow_buffers() would populate different page than >>> __find_get_block() then looks up. >>> >>> However I don't see how that's possible since the filesystem has the block >>> device open exclusively and blkdev_bszset() makes sure we also have >>> exclusive access to the block device before changing the block device size. >>> So changing block device block size after filesystem gets access to the >>> device should be impossible. >>> >>> Anyway, could you perhaps add to your debug patch a dump of 'size' passed >>> to __getblk_slow() and bdev->bd_inode->i_blkbits? That should tell us >>> whether my theory is right or not. Thanks! >>> >> >> OK. Andrew, will you add (or fold into) this change? >> >> From e6f334380ad2c87457bfc2a4058316c47f75824a Mon Sep 17 00:00:00 2001 >> From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> >> Date: Thu, 3 Jan 2019 01:03:35 +0900 >> Subject: [PATCH] fs/buffer.c: dump more info for __getblk_gfp() stall problem >> >> We need to dump more variables on top of >> "fs/buffer.c: add debug print for __getblk_gfp() stall problem". >> >> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> >> Cc: Jan Kara <jack@suse.cz> >> --- >> fs/buffer.c | 9 +++++++-- >> 1 file changed, 7 insertions(+), 2 deletions(-) >> >> diff --git a/fs/buffer.c b/fs/buffer.c >> index 580fda0..a50acac 100644 >> --- a/fs/buffer.c >> +++ b/fs/buffer.c >> @@ -1066,9 +1066,14 @@ static sector_t blkdev_max_block(struct block_device *bdev, unsigned int size) >> #ifdef CONFIG_DEBUG_AID_FOR_SYZBOT >> if (!time_after(jiffies, current->getblk_stamp + 3 * HZ)) >> continue; >> - printk(KERN_ERR "%s(%u): getblk(): executed=%x bh_count=%d bh_state=%lx\n", >> + printk(KERN_ERR "%s(%u): getblk(): executed=%x bh_count=%d bh_state=%lx " >> + "bdev_super_blocksize=%lu size=%u " >> + "bdev_super_blocksize_bits=%u bdev_inode_blkbits=%u\n", >> current->comm, current->pid, current->getblk_executed, >> - current->getblk_bh_count, current->getblk_bh_state); >> + current->getblk_bh_count, current->getblk_bh_state, >> + bdev->bd_super->s_blocksize, size, >> + bdev->bd_super->s_blocksize_bits, >> + bdev->bd_inode->i_blkbits); > > Well, bd_super may be NULL if there's no filesystem mounted so it would be > safer to check for this rather than blindly dereferencing it... Otherwise > the change looks good to me. I see. Let's be cautious here. From 317a0d0002b3d2cadae606055ad50f2926ca62d2 Mon Sep 17 00:00:00 2001 From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Date: Thu, 3 Jan 2019 09:42:02 +0900 Subject: [PATCH v2] fs/buffer.c: dump more info for __getblk_gfp() stall problem We need to dump more variables on top of "fs/buffer.c: add debug print for __getblk_gfp() stall problem". Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Jan Kara <jack@suse.cz> --- fs/buffer.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/fs/buffer.c b/fs/buffer.c index 580fda0..784de3d 100644 --- a/fs/buffer.c +++ b/fs/buffer.c @@ -1066,9 +1066,15 @@ static sector_t blkdev_max_block(struct block_device *bdev, unsigned int size) #ifdef CONFIG_DEBUG_AID_FOR_SYZBOT if (!time_after(jiffies, current->getblk_stamp + 3 * HZ)) continue; - printk(KERN_ERR "%s(%u): getblk(): executed=%x bh_count=%d bh_state=%lx\n", + printk(KERN_ERR "%s(%u): getblk(): executed=%x bh_count=%d bh_state=%lx bdev_super_blocksize=%ld size=%u bdev_super_blocksize_bits=%d bdev_inode_blkbits=%d\n", current->comm, current->pid, current->getblk_executed, - current->getblk_bh_count, current->getblk_bh_state); + current->getblk_bh_count, current->getblk_bh_state, + IS_ERR_OR_NULL(bdev->bd_super) ? -1L : + bdev->bd_super->s_blocksize, size, + IS_ERR_OR_NULL(bdev->bd_super) ? -1 : + bdev->bd_super->s_blocksize_bits, + IS_ERR_OR_NULL(bdev->bd_inode) ? -1 : + bdev->bd_inode->i_blkbits); current->getblk_executed = 0; current->getblk_bh_count = 0; current->getblk_bh_state = 0;
On 2019/01/03 2:26, Jan Kara wrote: > On Thu 03-01-19 01:07:25, Tetsuo Handa wrote: >> On 2019/01/02 23:40, Jan Kara wrote: >>> I had a look into this and the only good explanation for this I have is >>> that sb->s_blocksize is different from (1 << sb->s_bdev->bd_inode->i_blkbits). >>> If that would happen, we'd get exactly the behavior syzkaller observes >>> because grow_buffers() would populate different page than >>> __find_get_block() then looks up. >>> >>> However I don't see how that's possible since the filesystem has the block >>> device open exclusively and blkdev_bszset() makes sure we also have >>> exclusive access to the block device before changing the block device size. >>> So changing block device block size after filesystem gets access to the >>> device should be impossible. >>> >>> Anyway, could you perhaps add to your debug patch a dump of 'size' passed >>> to __getblk_slow() and bdev->bd_inode->i_blkbits? That should tell us >>> whether my theory is right or not. Thanks! >>> Got two reports. 'size' is 512 while bdev->bd_inode->i_blkbits is 12. https://syzkaller.appspot.com/text?tag=CrashLog&x=1237c3ab400000 [ 385.723941][ T439] kworker/u4:3(439): getblk(): executed=9 bh_count=0 bh_state=0 bdev_super_blocksize=512 size=512 bdev_super_blocksize_bits=9 bdev_inode_blkbits=12 (...snipped...) [ 568.159544][ T439] kworker/u4:3(439): getblk(): executed=9 bh_count=0 bh_state=0 bdev_super_blocksize=512 size=512 bdev_super_blocksize_bits=9 bdev_inode_blkbits=12 https://syzkaller.appspot.com/text?tag=CrashLog&x=143383d7400000 [ 1355.681513][ T6893] syz-executor0(6893): getblk(): executed=9 bh_count=0 bh_state=0 bdev_super_blocksize=512 size=512 bdev_super_blocksize_bits=9 bdev_inode_blkbits=12 [ 1358.274585][T15649] kworker/u4:17(15649): getblk(): executed=9 bh_count=0 bh_state=0 bdev_super_blocksize=512 size=512 bdev_super_blocksize_bits=9 bdev_inode_blkbits=12 (...snipped...) [ 1455.341572][ T6893] syz-executor0(6893): getblk(): executed=9 bh_count=0 bh_state=0 bdev_super_blocksize=512 size=512 bdev_super_blocksize_bits=9 bdev_inode_blkbits=12 [ 1455.541457][T15649] kworker/u4:17(15649): getblk(): executed=9 bh_count=0 bh_state=0 bdev_super_blocksize=512 size=512 bdev_super_blocksize_bits=9 bdev_inode_blkbits=12 >> >> OK. Andrew, will you add (or fold into) this change? >> >> From e6f334380ad2c87457bfc2a4058316c47f75824a Mon Sep 17 00:00:00 2001 >> From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> >> Date: Thu, 3 Jan 2019 01:03:35 +0900 >> Subject: [PATCH] fs/buffer.c: dump more info for __getblk_gfp() stall problem >> >> We need to dump more variables on top of >> "fs/buffer.c: add debug print for __getblk_gfp() stall problem". >> >> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> >> Cc: Jan Kara <jack@suse.cz> >> --- >> fs/buffer.c | 9 +++++++-- >> 1 file changed, 7 insertions(+), 2 deletions(-) >> >> diff --git a/fs/buffer.c b/fs/buffer.c >> index 580fda0..a50acac 100644 >> --- a/fs/buffer.c >> +++ b/fs/buffer.c >> @@ -1066,9 +1066,14 @@ static sector_t blkdev_max_block(struct block_device *bdev, unsigned int size) >> #ifdef CONFIG_DEBUG_AID_FOR_SYZBOT >> if (!time_after(jiffies, current->getblk_stamp + 3 * HZ)) >> continue; >> - printk(KERN_ERR "%s(%u): getblk(): executed=%x bh_count=%d bh_state=%lx\n", >> + printk(KERN_ERR "%s(%u): getblk(): executed=%x bh_count=%d bh_state=%lx " >> + "bdev_super_blocksize=%lu size=%u " >> + "bdev_super_blocksize_bits=%u bdev_inode_blkbits=%u\n", >> current->comm, current->pid, current->getblk_executed, >> - current->getblk_bh_count, current->getblk_bh_state); >> + current->getblk_bh_count, current->getblk_bh_state, >> + bdev->bd_super->s_blocksize, size, >> + bdev->bd_super->s_blocksize_bits, >> + bdev->bd_inode->i_blkbits); > > Well, bd_super may be NULL if there's no filesystem mounted so it would be > safer to check for this rather than blindly dereferencing it... Otherwise > the change looks good to me. > > Honza >
On Tue 08-01-19 19:04:06, Tetsuo Handa wrote: > On 2019/01/03 2:26, Jan Kara wrote: > > On Thu 03-01-19 01:07:25, Tetsuo Handa wrote: > >> On 2019/01/02 23:40, Jan Kara wrote: > >>> I had a look into this and the only good explanation for this I have is > >>> that sb->s_blocksize is different from (1 << sb->s_bdev->bd_inode->i_blkbits). > >>> If that would happen, we'd get exactly the behavior syzkaller observes > >>> because grow_buffers() would populate different page than > >>> __find_get_block() then looks up. > >>> > >>> However I don't see how that's possible since the filesystem has the block > >>> device open exclusively and blkdev_bszset() makes sure we also have > >>> exclusive access to the block device before changing the block device size. > >>> So changing block device block size after filesystem gets access to the > >>> device should be impossible. > >>> > >>> Anyway, could you perhaps add to your debug patch a dump of 'size' passed > >>> to __getblk_slow() and bdev->bd_inode->i_blkbits? That should tell us > >>> whether my theory is right or not. Thanks! > >>> > > Got two reports. 'size' is 512 while bdev->bd_inode->i_blkbits is 12. > > https://syzkaller.appspot.com/text?tag=CrashLog&x=1237c3ab400000 > > [ 385.723941][ T439] kworker/u4:3(439): getblk(): executed=9 bh_count=0 bh_state=0 bdev_super_blocksize=512 size=512 bdev_super_blocksize_bits=9 bdev_inode_blkbits=12 > (...snipped...) > [ 568.159544][ T439] kworker/u4:3(439): getblk(): executed=9 bh_count=0 bh_state=0 bdev_super_blocksize=512 size=512 bdev_super_blocksize_bits=9 bdev_inode_blkbits=12 Right, so indeed the block size in the superblock and in the block device gets out of sync which explains why we endlessly loop in the buffer cache code. The superblock uses blocksize of 512 while the block device thinks the set block size is 4096. And after staring into the code for some time, I finally have a trivial reproducer: truncate -s 1G /tmp/image losetup /dev/loop0 /tmp/image mkfs.ext4 -b 1024 /dev/loop0 mount -t ext4 /dev/loop0 /mnt losetup -c /dev/loop0 l /mnt <hangs> And the problem is that LOOP_SET_CAPACITY ioctl ends up reseting block device block size to 4096 by calling bd_set_size(). I have to think how to best fix this... Thanks for your help with debugging this! Honza
On Tue, Jan 8, 2019 at 12:24 PM Jan Kara <jack@suse.cz> wrote: > > On Tue 08-01-19 19:04:06, Tetsuo Handa wrote: > > On 2019/01/03 2:26, Jan Kara wrote: > > > On Thu 03-01-19 01:07:25, Tetsuo Handa wrote: > > >> On 2019/01/02 23:40, Jan Kara wrote: > > >>> I had a look into this and the only good explanation for this I have is > > >>> that sb->s_blocksize is different from (1 << sb->s_bdev->bd_inode->i_blkbits). > > >>> If that would happen, we'd get exactly the behavior syzkaller observes > > >>> because grow_buffers() would populate different page than > > >>> __find_get_block() then looks up. > > >>> > > >>> However I don't see how that's possible since the filesystem has the block > > >>> device open exclusively and blkdev_bszset() makes sure we also have > > >>> exclusive access to the block device before changing the block device size. > > >>> So changing block device block size after filesystem gets access to the > > >>> device should be impossible. > > >>> > > >>> Anyway, could you perhaps add to your debug patch a dump of 'size' passed > > >>> to __getblk_slow() and bdev->bd_inode->i_blkbits? That should tell us > > >>> whether my theory is right or not. Thanks! > > >>> > > > > Got two reports. 'size' is 512 while bdev->bd_inode->i_blkbits is 12. > > > > https://syzkaller.appspot.com/text?tag=CrashLog&x=1237c3ab400000 > > > > [ 385.723941][ T439] kworker/u4:3(439): getblk(): executed=9 bh_count=0 bh_state=0 bdev_super_blocksize=512 size=512 bdev_super_blocksize_bits=9 bdev_inode_blkbits=12 > > (...snipped...) > > [ 568.159544][ T439] kworker/u4:3(439): getblk(): executed=9 bh_count=0 bh_state=0 bdev_super_blocksize=512 size=512 bdev_super_blocksize_bits=9 bdev_inode_blkbits=12 > > Right, so indeed the block size in the superblock and in the block device > gets out of sync which explains why we endlessly loop in the buffer cache > code. The superblock uses blocksize of 512 while the block device thinks > the set block size is 4096. > > And after staring into the code for some time, I finally have a trivial > reproducer: > > truncate -s 1G /tmp/image > losetup /dev/loop0 /tmp/image > mkfs.ext4 -b 1024 /dev/loop0 > mount -t ext4 /dev/loop0 /mnt > losetup -c /dev/loop0 > l /mnt > <hangs> > > And the problem is that LOOP_SET_CAPACITY ioctl ends up reseting block > device block size to 4096 by calling bd_set_size(). I have to think how to > best fix this... > > Thanks for your help with debugging this! Wow! I am very excited. We have 587 open "task hung" reports, I suspect this explains lots of them. What would be some pattern that we can use to best-effort distinguish most manifestations? Skimming through few reports I see "inode_lock", "get_super", "blkdev_put" as common indicators. Anything else?
On Tue 08-01-19 12:49:08, Dmitry Vyukov wrote: > On Tue, Jan 8, 2019 at 12:24 PM Jan Kara <jack@suse.cz> wrote: > > > > On Tue 08-01-19 19:04:06, Tetsuo Handa wrote: > > > On 2019/01/03 2:26, Jan Kara wrote: > > > > On Thu 03-01-19 01:07:25, Tetsuo Handa wrote: > > > >> On 2019/01/02 23:40, Jan Kara wrote: > > > >>> I had a look into this and the only good explanation for this I have is > > > >>> that sb->s_blocksize is different from (1 << sb->s_bdev->bd_inode->i_blkbits). > > > >>> If that would happen, we'd get exactly the behavior syzkaller observes > > > >>> because grow_buffers() would populate different page than > > > >>> __find_get_block() then looks up. > > > >>> > > > >>> However I don't see how that's possible since the filesystem has the block > > > >>> device open exclusively and blkdev_bszset() makes sure we also have > > > >>> exclusive access to the block device before changing the block device size. > > > >>> So changing block device block size after filesystem gets access to the > > > >>> device should be impossible. > > > >>> > > > >>> Anyway, could you perhaps add to your debug patch a dump of 'size' passed > > > >>> to __getblk_slow() and bdev->bd_inode->i_blkbits? That should tell us > > > >>> whether my theory is right or not. Thanks! > > > >>> > > > > > > Got two reports. 'size' is 512 while bdev->bd_inode->i_blkbits is 12. > > > > > > https://syzkaller.appspot.com/text?tag=CrashLog&x=1237c3ab400000 > > > > > > [ 385.723941][ T439] kworker/u4:3(439): getblk(): executed=9 bh_count=0 bh_state=0 bdev_super_blocksize=512 size=512 bdev_super_blocksize_bits=9 bdev_inode_blkbits=12 > > > (...snipped...) > > > [ 568.159544][ T439] kworker/u4:3(439): getblk(): executed=9 bh_count=0 bh_state=0 bdev_super_blocksize=512 size=512 bdev_super_blocksize_bits=9 bdev_inode_blkbits=12 > > > > Right, so indeed the block size in the superblock and in the block device > > gets out of sync which explains why we endlessly loop in the buffer cache > > code. The superblock uses blocksize of 512 while the block device thinks > > the set block size is 4096. > > > > And after staring into the code for some time, I finally have a trivial > > reproducer: > > > > truncate -s 1G /tmp/image > > losetup /dev/loop0 /tmp/image > > mkfs.ext4 -b 1024 /dev/loop0 > > mount -t ext4 /dev/loop0 /mnt > > losetup -c /dev/loop0 > > l /mnt > > <hangs> > > > > And the problem is that LOOP_SET_CAPACITY ioctl ends up reseting block > > device block size to 4096 by calling bd_set_size(). I have to think how to > > best fix this... > > > > Thanks for your help with debugging this! > > Wow! I am very excited. > We have 587 open "task hung" reports, I suspect this explains lots of them. > What would be some pattern that we can use to best-effort distinguish > most manifestations? Skimming through few reports I see "inode_lock", > "get_super", "blkdev_put" as common indicators. Anything else? Well, there will be always looping task with __getblk_gfp() on its stack (which should be visible in the stacktrace generated by the stall detector). Then there can be lots of other processes getting blocked due to locks and other resources held by this task... Honza
On Wed, Jan 9, 2019 at 2:30 PM Jan Kara <jack@suse.cz> wrote: > > On Tue 08-01-19 12:49:08, Dmitry Vyukov wrote: > > On Tue, Jan 8, 2019 at 12:24 PM Jan Kara <jack@suse.cz> wrote: > > > > > > On Tue 08-01-19 19:04:06, Tetsuo Handa wrote: > > > > On 2019/01/03 2:26, Jan Kara wrote: > > > > > On Thu 03-01-19 01:07:25, Tetsuo Handa wrote: > > > > >> On 2019/01/02 23:40, Jan Kara wrote: > > > > >>> I had a look into this and the only good explanation for this I have is > > > > >>> that sb->s_blocksize is different from (1 << sb->s_bdev->bd_inode->i_blkbits). > > > > >>> If that would happen, we'd get exactly the behavior syzkaller observes > > > > >>> because grow_buffers() would populate different page than > > > > >>> __find_get_block() then looks up. > > > > >>> > > > > >>> However I don't see how that's possible since the filesystem has the block > > > > >>> device open exclusively and blkdev_bszset() makes sure we also have > > > > >>> exclusive access to the block device before changing the block device size. > > > > >>> So changing block device block size after filesystem gets access to the > > > > >>> device should be impossible. > > > > >>> > > > > >>> Anyway, could you perhaps add to your debug patch a dump of 'size' passed > > > > >>> to __getblk_slow() and bdev->bd_inode->i_blkbits? That should tell us > > > > >>> whether my theory is right or not. Thanks! > > > > >>> > > > > > > > > Got two reports. 'size' is 512 while bdev->bd_inode->i_blkbits is 12. > > > > > > > > https://syzkaller.appspot.com/text?tag=CrashLog&x=1237c3ab400000 > > > > > > > > [ 385.723941][ T439] kworker/u4:3(439): getblk(): executed=9 bh_count=0 bh_state=0 bdev_super_blocksize=512 size=512 bdev_super_blocksize_bits=9 bdev_inode_blkbits=12 > > > > (...snipped...) > > > > [ 568.159544][ T439] kworker/u4:3(439): getblk(): executed=9 bh_count=0 bh_state=0 bdev_super_blocksize=512 size=512 bdev_super_blocksize_bits=9 bdev_inode_blkbits=12 > > > > > > Right, so indeed the block size in the superblock and in the block device > > > gets out of sync which explains why we endlessly loop in the buffer cache > > > code. The superblock uses blocksize of 512 while the block device thinks > > > the set block size is 4096. > > > > > > And after staring into the code for some time, I finally have a trivial > > > reproducer: > > > > > > truncate -s 1G /tmp/image > > > losetup /dev/loop0 /tmp/image > > > mkfs.ext4 -b 1024 /dev/loop0 > > > mount -t ext4 /dev/loop0 /mnt > > > losetup -c /dev/loop0 > > > l /mnt > > > <hangs> > > > > > > And the problem is that LOOP_SET_CAPACITY ioctl ends up reseting block > > > device block size to 4096 by calling bd_set_size(). I have to think how to > > > best fix this... > > > > > > Thanks for your help with debugging this! > > > > Wow! I am very excited. > > We have 587 open "task hung" reports, I suspect this explains lots of them. > > What would be some pattern that we can use to best-effort distinguish > > most manifestations? Skimming through few reports I see "inode_lock", > > "get_super", "blkdev_put" as common indicators. Anything else? > > Well, there will be always looping task with __getblk_gfp() on its stack > (which should be visible in the stacktrace generated by the stall > detector). Then there can be lots of other processes getting blocked due to > locks and other resources held by this task... Once we have a fix, I plan to do a sweep over existing open "task hung" reports and dup lots of them onto this one. Probably preferring to over-sweep rather then to under-sweep because there are too many of them and lots does not seem to be actionable otherwise. Tetsuo, do you have comments before I start?
On Mon, Jan 14, 2019 at 4:11 PM Dmitry Vyukov <dvyukov@google.com> wrote: > > On Wed, Jan 9, 2019 at 2:30 PM Jan Kara <jack@suse.cz> wrote: > > > > On Tue 08-01-19 12:49:08, Dmitry Vyukov wrote: > > > On Tue, Jan 8, 2019 at 12:24 PM Jan Kara <jack@suse.cz> wrote: > > > > > > > > On Tue 08-01-19 19:04:06, Tetsuo Handa wrote: > > > > > On 2019/01/03 2:26, Jan Kara wrote: > > > > > > On Thu 03-01-19 01:07:25, Tetsuo Handa wrote: > > > > > >> On 2019/01/02 23:40, Jan Kara wrote: > > > > > >>> I had a look into this and the only good explanation for this I have is > > > > > >>> that sb->s_blocksize is different from (1 << sb->s_bdev->bd_inode->i_blkbits). > > > > > >>> If that would happen, we'd get exactly the behavior syzkaller observes > > > > > >>> because grow_buffers() would populate different page than > > > > > >>> __find_get_block() then looks up. > > > > > >>> > > > > > >>> However I don't see how that's possible since the filesystem has the block > > > > > >>> device open exclusively and blkdev_bszset() makes sure we also have > > > > > >>> exclusive access to the block device before changing the block device size. > > > > > >>> So changing block device block size after filesystem gets access to the > > > > > >>> device should be impossible. > > > > > >>> > > > > > >>> Anyway, could you perhaps add to your debug patch a dump of 'size' passed > > > > > >>> to __getblk_slow() and bdev->bd_inode->i_blkbits? That should tell us > > > > > >>> whether my theory is right or not. Thanks! > > > > > >>> > > > > > > > > > > Got two reports. 'size' is 512 while bdev->bd_inode->i_blkbits is 12. > > > > > > > > > > https://syzkaller.appspot.com/text?tag=CrashLog&x=1237c3ab400000 > > > > > > > > > > [ 385.723941][ T439] kworker/u4:3(439): getblk(): executed=9 bh_count=0 bh_state=0 bdev_super_blocksize=512 size=512 bdev_super_blocksize_bits=9 bdev_inode_blkbits=12 > > > > > (...snipped...) > > > > > [ 568.159544][ T439] kworker/u4:3(439): getblk(): executed=9 bh_count=0 bh_state=0 bdev_super_blocksize=512 size=512 bdev_super_blocksize_bits=9 bdev_inode_blkbits=12 > > > > > > > > Right, so indeed the block size in the superblock and in the block device > > > > gets out of sync which explains why we endlessly loop in the buffer cache > > > > code. The superblock uses blocksize of 512 while the block device thinks > > > > the set block size is 4096. > > > > > > > > And after staring into the code for some time, I finally have a trivial > > > > reproducer: > > > > > > > > truncate -s 1G /tmp/image > > > > losetup /dev/loop0 /tmp/image > > > > mkfs.ext4 -b 1024 /dev/loop0 > > > > mount -t ext4 /dev/loop0 /mnt > > > > losetup -c /dev/loop0 > > > > l /mnt > > > > <hangs> > > > > > > > > And the problem is that LOOP_SET_CAPACITY ioctl ends up reseting block > > > > device block size to 4096 by calling bd_set_size(). I have to think how to > > > > best fix this... > > > > > > > > Thanks for your help with debugging this! > > > > > > Wow! I am very excited. > > > We have 587 open "task hung" reports, I suspect this explains lots of them. > > > What would be some pattern that we can use to best-effort distinguish > > > most manifestations? Skimming through few reports I see "inode_lock", > > > "get_super", "blkdev_put" as common indicators. Anything else? > > > > Well, there will be always looping task with __getblk_gfp() on its stack > > (which should be visible in the stacktrace generated by the stall > > detector). Then there can be lots of other processes getting blocked due to > > locks and other resources held by this task... > > > Once we have a fix, I plan to do a sweep over existing open "task > hung" reports and dup lots of them onto this one. Probably preferring > to over-sweep rather then to under-sweep because there are too many of > them and lots does not seem to be actionable otherwise. > Tetsuo, do you have comments before I start? Also, is it possible to add some kind of WARNING for this condition? Taking into account how much effort it too to debug, looks like a useful check. Or did I ask this already...
On Mon 14-01-19 16:13:08, Dmitry Vyukov wrote: > On Mon, Jan 14, 2019 at 4:11 PM Dmitry Vyukov <dvyukov@google.com> wrote: > > > > On Wed, Jan 9, 2019 at 2:30 PM Jan Kara <jack@suse.cz> wrote: > > > > > > On Tue 08-01-19 12:49:08, Dmitry Vyukov wrote: > > > > On Tue, Jan 8, 2019 at 12:24 PM Jan Kara <jack@suse.cz> wrote: > > > > > > > > > > On Tue 08-01-19 19:04:06, Tetsuo Handa wrote: > > > > > > On 2019/01/03 2:26, Jan Kara wrote: > > > > > > > On Thu 03-01-19 01:07:25, Tetsuo Handa wrote: > > > > > > >> On 2019/01/02 23:40, Jan Kara wrote: > > > > > > >>> I had a look into this and the only good explanation for this I have is > > > > > > >>> that sb->s_blocksize is different from (1 << sb->s_bdev->bd_inode->i_blkbits). > > > > > > >>> If that would happen, we'd get exactly the behavior syzkaller observes > > > > > > >>> because grow_buffers() would populate different page than > > > > > > >>> __find_get_block() then looks up. > > > > > > >>> > > > > > > >>> However I don't see how that's possible since the filesystem has the block > > > > > > >>> device open exclusively and blkdev_bszset() makes sure we also have > > > > > > >>> exclusive access to the block device before changing the block device size. > > > > > > >>> So changing block device block size after filesystem gets access to the > > > > > > >>> device should be impossible. > > > > > > >>> > > > > > > >>> Anyway, could you perhaps add to your debug patch a dump of 'size' passed > > > > > > >>> to __getblk_slow() and bdev->bd_inode->i_blkbits? That should tell us > > > > > > >>> whether my theory is right or not. Thanks! > > > > > > >>> > > > > > > > > > > > > Got two reports. 'size' is 512 while bdev->bd_inode->i_blkbits is 12. > > > > > > > > > > > > https://syzkaller.appspot.com/text?tag=CrashLog&x=1237c3ab400000 > > > > > > > > > > > > [ 385.723941][ T439] kworker/u4:3(439): getblk(): executed=9 bh_count=0 bh_state=0 bdev_super_blocksize=512 size=512 bdev_super_blocksize_bits=9 bdev_inode_blkbits=12 > > > > > > (...snipped...) > > > > > > [ 568.159544][ T439] kworker/u4:3(439): getblk(): executed=9 bh_count=0 bh_state=0 bdev_super_blocksize=512 size=512 bdev_super_blocksize_bits=9 bdev_inode_blkbits=12 > > > > > > > > > > Right, so indeed the block size in the superblock and in the block device > > > > > gets out of sync which explains why we endlessly loop in the buffer cache > > > > > code. The superblock uses blocksize of 512 while the block device thinks > > > > > the set block size is 4096. > > > > > > > > > > And after staring into the code for some time, I finally have a trivial > > > > > reproducer: > > > > > > > > > > truncate -s 1G /tmp/image > > > > > losetup /dev/loop0 /tmp/image > > > > > mkfs.ext4 -b 1024 /dev/loop0 > > > > > mount -t ext4 /dev/loop0 /mnt > > > > > losetup -c /dev/loop0 > > > > > l /mnt > > > > > <hangs> > > > > > > > > > > And the problem is that LOOP_SET_CAPACITY ioctl ends up reseting block > > > > > device block size to 4096 by calling bd_set_size(). I have to think how to > > > > > best fix this... > > > > > > > > > > Thanks for your help with debugging this! > > > > > > > > Wow! I am very excited. > > > > We have 587 open "task hung" reports, I suspect this explains lots of them. > > > > What would be some pattern that we can use to best-effort distinguish > > > > most manifestations? Skimming through few reports I see "inode_lock", > > > > "get_super", "blkdev_put" as common indicators. Anything else? > > > > > > Well, there will be always looping task with __getblk_gfp() on its stack > > > (which should be visible in the stacktrace generated by the stall > > > detector). Then there can be lots of other processes getting blocked due to > > > locks and other resources held by this task... > > > > > > Once we have a fix, I plan to do a sweep over existing open "task I have submitted the fix yesterday to linux-block ([PATCH 0/2] blkdev: Fix livelock when loop device updates capacity). > > hung" reports and dup lots of them onto this one. Probably preferring > > to over-sweep rather then to under-sweep because there are too many of > > them and lots does not seem to be actionable otherwise. > > Tetsuo, do you have comments before I start? > > Also, is it possible to add some kind of WARNING for this condition? > Taking into account how much effort it too to debug, looks like a > useful check. Or did I ask this already... There are two things we could do: 1) Warn if we loop in __getblk_slow() more than couple of times (looping once is normal, looping twice can happen easily due to races). 2) Warn & bail if block size passed to __getblk_slow() does not match the block device block size. I'll write the patch. Honza
diff --git a/fs/buffer.c b/fs/buffer.c index be31e28..ebf78ab 100644 --- a/fs/buffer.c +++ b/fs/buffer.c @@ -955,10 +955,20 @@ static sector_t blkdev_max_block(struct block_device *bdev, unsigned int size) end_block = init_page_buffers(page, bdev, (sector_t)index << sizebits, size); +#ifdef CONFIG_DEBUG_AID_FOR_SYZBOT + current->getblk_executed |= 0x01; +#endif goto done; } - if (!try_to_free_buffers(page)) + if (!try_to_free_buffers(page)) { +#ifdef CONFIG_DEBUG_AID_FOR_SYZBOT + current->getblk_executed |= 0x02; +#endif goto failed; + } +#ifdef CONFIG_DEBUG_AID_FOR_SYZBOT + current->getblk_executed |= 0x04; +#endif } /* @@ -978,6 +988,9 @@ static sector_t blkdev_max_block(struct block_device *bdev, unsigned int size) spin_unlock(&inode->i_mapping->private_lock); done: ret = (block < end_block) ? 1 : -ENXIO; +#ifdef CONFIG_DEBUG_AID_FOR_SYZBOT + current->getblk_executed |= 0x08; +#endif failed: unlock_page(page); put_page(page); @@ -1033,6 +1046,12 @@ static sector_t blkdev_max_block(struct block_device *bdev, unsigned int size) return NULL; } +#ifdef CONFIG_DEBUG_AID_FOR_SYZBOT + current->getblk_stamp = jiffies; + current->getblk_executed = 0; + current->getblk_bh_count = 0; + current->getblk_bh_state = 0; +#endif for (;;) { struct buffer_head *bh; int ret; @@ -1044,6 +1063,18 @@ static sector_t blkdev_max_block(struct block_device *bdev, unsigned int size) ret = grow_buffers(bdev, block, size, gfp); if (ret < 0) return NULL; + +#ifdef CONFIG_DEBUG_AID_FOR_SYZBOT + if (!time_after(jiffies, current->getblk_stamp + 3 * HZ)) + continue; + printk(KERN_ERR "%s(%u): getblk(): executed=%x bh_count=%d bh_state=%lx\n", + current->comm, current->pid, current->getblk_executed, + current->getblk_bh_count, current->getblk_bh_state); + current->getblk_executed = 0; + current->getblk_bh_count = 0; + current->getblk_bh_state = 0; + current->getblk_stamp = jiffies; +#endif } } @@ -3216,6 +3247,11 @@ int sync_dirty_buffer(struct buffer_head *bh) */ static inline int buffer_busy(struct buffer_head *bh) { +#ifdef CONFIG_DEBUG_AID_FOR_SYZBOT + current->getblk_executed |= 0x80; + current->getblk_bh_count = atomic_read(&bh->b_count); + current->getblk_bh_state = bh->b_state; +#endif return atomic_read(&bh->b_count) | (bh->b_state & ((1 << BH_Dirty) | (1 << BH_Lock))); } @@ -3254,11 +3290,18 @@ int try_to_free_buffers(struct page *page) int ret = 0; BUG_ON(!PageLocked(page)); - if (PageWriteback(page)) + if (PageWriteback(page)) { +#ifdef CONFIG_DEBUG_AID_FOR_SYZBOT + current->getblk_executed |= 0x10; +#endif return 0; + } if (mapping == NULL) { /* can this still happen? */ ret = drop_buffers(page, &buffers_to_free); +#ifdef CONFIG_DEBUG_AID_FOR_SYZBOT + current->getblk_executed |= 0x20; +#endif goto out; } @@ -3282,6 +3325,9 @@ int try_to_free_buffers(struct page *page) if (ret) cancel_dirty_page(page); spin_unlock(&mapping->private_lock); +#ifdef CONFIG_DEBUG_AID_FOR_SYZBOT + current->getblk_executed |= 0x40; +#endif out: if (buffers_to_free) { struct buffer_head *bh = buffers_to_free; diff --git a/include/linux/sched.h b/include/linux/sched.h index c4e0aba..95b143e 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -1207,6 +1207,13 @@ struct task_struct { unsigned long prev_lowest_stack; #endif +#ifdef CONFIG_DEBUG_AID_FOR_SYZBOT + unsigned long getblk_stamp; + unsigned int getblk_executed; + unsigned int getblk_bh_count; + unsigned long getblk_bh_state; +#endif + /* * New fields for task_struct should be added above here, so that * they are included in the randomized portion of task_struct. diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug index c731ff9..0747ce7 100644 --- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug @@ -2058,3 +2058,9 @@ config IO_STRICT_DEVMEM if the driver using a given range cannot be disabled. If in doubt, say Y. + +config DEBUG_AID_FOR_SYZBOT + bool "Additional debug code for syzbot" + default n + help + This option is intended for testing by syzbot.