Message ID | CA+55aFxeWe2VQaW30qGR0syiZ75jSwFwg3Ac+wS20KDtf5UKNw@mail.gmail.com (mailing list archive) |
---|---|
State | New, archived |

On 5 December 2016 at 18:55, Linus Torvalds <torvalds@linux-foundation.org> wrote:
> On Mon, Dec 5, 2016 at 9:09 AM, Vegard Nossum <vegard.nossum@gmail.com> wrote:
>>
>> The warning shows that it made it past the list_empty_careful() check
>> in finish_wait() but then bugs out on the &wait->task_list
>> dereference.
>>
>> Anything stick out?
>
> I hate that shmem waitqueue garbage. It's really subtle.
>
> I think the problem is that "wake_up_all()" in shmem_fallocate()
> doesn't necessarily wake up everything. It wakes up TASK_NORMAL -
> which does include TASK_UNINTERRUPTIBLE, but doesn't actually mean
> "everything on the list".
>
> I think that what happens is that the waiters somehow move from
> TASK_UNINTERRUPTIBLE to TASK_RUNNING early, and this means that
> wake_up_all() will ignore them, leave them on the list, and now that
> list on stack is no longer empty at the end.
>
> And the way *THAT* can happen is that the task is on some *other*
> waitqueue as well, and that other waitqueue wakes it up. That's not
> impossible, you can certainly have people on wait-queues that still
> take faults.
>
> Or somebody just uses a directed wake_up_process() or something.
>
> Since you apparently can recreate this fairly easily, how about trying
> this stupid patch?
>
> NOTE! This is entirely untested. I may have screwed this up entirely.
> You get the idea, though - just remove the wait queue head from the
> list - the list entries stay around, but nothing points to the stack
> entry (that we're going to free) any more.
>
> And add the warning to see if this actually ever triggers (and because
> I'd like to see the callchain when it does, to see if it's another
> waitqueue somewhere or what..)

------------[ cut here ]------------
WARNING: CPU: 22 PID: 14012 at mm/shmem.c:2668 shmem_fallocate+0x9a7/0xac0
Kernel panic - not syncing: panic_on_warn set ...

CPU: 22 PID: 14012 Comm: trinity-c73 Not tainted 4.9.0-rc7+ #220
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
 ffff8801e32af970 ffffffff81fb08c1 ffffffff83e74b60 ffff8801e32afa48
 ffffffff83ed7600 ffffffff847103e0 ffff8801e32afa38 ffffffff81515244
 0000000041b58ab3 ffffffff844e21da ffffffff81515061 ffffffff8151591e
Call Trace:
 [<ffffffff81fb08c1>] dump_stack+0x83/0xb2
 [<ffffffff81515244>] panic+0x1e3/0x3ad
 [<ffffffff812708bf>] __warn+0x1bf/0x1e0
 [<ffffffff81270aac>] warn_slowpath_null+0x2c/0x40
 [<ffffffff8157aef7>] shmem_fallocate+0x9a7/0xac0
 [<ffffffff8167c6c0>] vfs_fallocate+0x350/0x620
 [<ffffffff815ee5c2>] SyS_madvise+0x432/0x1290
 [<ffffffff8100524f>] do_syscall_64+0x1af/0x4d0
 [<ffffffff83c965b4>] entry_SYSCALL64_slow_path+0x25/0x25
------------[ cut here ]------------

Attached a full log.

Vegard

On 5 December 2016 at 20:11, Vegard Nossum <vegard.nossum@gmail.com> wrote:
> On 5 December 2016 at 18:55, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>> On Mon, Dec 5, 2016 at 9:09 AM, Vegard Nossum <vegard.nossum@gmail.com> wrote:
>> Since you apparently can recreate this fairly easily, how about trying
>> this stupid patch?
>>
>> NOTE! This is entirely untested. I may have screwed this up entirely.
>> You get the idea, though - just remove the wait queue head from the
>> list - the list entries stay around, but nothing points to the stack
>> entry (that we're going to free) any more.
>>
>> And add the warning to see if this actually ever triggers (and because
>> I'd like to see the callchain when it does, to see if it's another
>> waitqueue somewhere or what..)
>
> ------------[ cut here ]------------
> WARNING: CPU: 22 PID: 14012 at mm/shmem.c:2668 shmem_fallocate+0x9a7/0xac0
> Kernel panic - not syncing: panic_on_warn set ...

So I noticed that panic_on_warn just after sending the email and I've
been waiting for it to trigger again.

The warning has triggered twice more without panic_on_warn set and I
haven't seen any crash yet.

Vegard

diff --git a/mm/shmem.c b/mm/shmem.c
index 166ebf5d2bce..a80148b43476 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2665,6 +2665,8 @@ static long shmem_fallocate(struct file *file, int mode, loff_t offset,
 		spin_lock(&inode->i_lock);
 		inode->i_private = NULL;
 		wake_up_all(&shmem_falloc_waitq);
+		if (WARN_ON_ONCE(!list_empty(&shmem_falloc_waitq.task_list)))
+			list_del(&shmem_falloc_waitq.task_list);
 		spin_unlock(&inode->i_lock);
 		error = 0;
 		goto out;