Message ID | CACT4Y+Yw_05S47dEYLWbwMCJvfB4cb9Mdb3q1qOYLyKfJMK1bA@mail.gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Sat, 03 Sep 2016 14:35:15 +0200, Dmitry Vyukov wrote: > > Hello, > > The following program causes either division error or hangs kernel: > > https://gist.githubusercontent.com/dvyukov/b1f33be3bfcc15d629e10db483bee1e4/raw/1574b407e0456bf8277bee423e7731641f5f2c14/gistfile1.txt > > > divide error: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN > Dumping ftrace buffer: > (ftrace buffer empty) > Modules linked in: > CPU: 1 PID: 27320 Comm: syz-executor Not tainted 4.8.0-rc3-next-20160825+ #8 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 > task: ffff88003c9b2280 task.stack: ffff880027280000 > RIP: 0010:[<ffffffff858e1a6c>] [< inline >] ktime_divns > include/linux/ktime.h:195 > RIP: 0010:[<ffffffff858e1a6c>] [<ffffffff858e1a6c>] > snd_hrtimer_callback+0x1bc/0x3c0 sound/core/hrtimer.c:62 > RSP: 0018:ffff88003ed07dd8 EFLAGS: 00010006 > RAX: 0000000000004801 RBX: ffff88003d3a0a88 RCX: 0000000000000000 > RDX: 0000000000000000 RSI: ffff88003d3a0aa0 RDI: ffffffff886b69a8 > RBP: ffff88003ed07e30 R08: 0000010e85417665 R09: ffff88007fff7048 > R10: ffff88007fff7058 R11: ffff88007fff7050 R12: ffff88003d00f840 > R13: ffff88003d00f8f4 R14: ffff88003d00f970 R15: ffff88003d3a0a80 > FS: 0000000000000000(0000) GS:ffff88003ed00000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 00007f4152126db8 CR3: 000000006b896000 CR4: 00000000000006e0 > DR0: 000000000000001e DR1: 000000000000001e DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600 > Stack: > 0000000000000000 ffff88003d3a0aa0 ffff88003d00f8f8 0000000000004801 > ffff88003d3a0ae0 0000000000000000 ffff88003d3a0a88 0000000000000001 > ffff88003ed1b400 dffffc0000000000 0000000000000001 ffff88003ed07f28 > Call Trace: > <IRQ> > [< inline >] __run_hrtimer kernel/time/hrtimer.c:1238 > [<ffffffff81504335>] __hrtimer_run_queues+0x325/0xe70 > kernel/time/hrtimer.c:1302 > [<ffffffff81506ceb>] hrtimer_interrupt+0x18b/0x420 kernel/time/hrtimer.c:1336 > [<ffffffff8126d8df>] local_apic_timer_interrupt+0x6f/0xe0 > arch/x86/kernel/apic/apic.c:933 > [<ffffffff86e13056>] smp_apic_timer_interrupt+0x76/0xa0 > arch/x86/kernel/apic/apic.c:957 > [<ffffffff86e1210c>] apic_timer_interrupt+0x8c/0xa0 > arch/x86/entry/entry_64.S:487 > <EOI> > [< inline >] rcu_lock_release include/linux/rcupdate.h:494 > [< inline >] rcu_read_unlock include/linux/rcupdate.h:927 > [<ffffffff81839f55>] unlock_page_memcg+0x95/0x130 mm/memcontrol.c:1680 > [< inline >] page_remove_file_rmap mm/rmap.c:1343 > [<ffffffff817b742c>] page_remove_rmap+0x1cc/0x8e0 mm/rmap.c:1394 > [< inline >] zap_pte_range mm/memory.c:1170 > [< inline >] zap_pmd_range mm/memory.c:1257 > [< inline >] zap_pud_range mm/memory.c:1278 > [<ffffffff81789119>] unmap_page_range+0xf89/0x1ba0 mm/memory.c:1299 > [<ffffffff81789e31>] unmap_single_vma+0x101/0x260 mm/memory.c:1344 > [<ffffffff8178a741>] unmap_vmas+0xf1/0x1b0 > [<ffffffff817a5aab>] exit_mmap+0x22b/0x420 mm/mmap.c:2787 > [< inline >] __mmput kernel/fork.c:770 > [<ffffffff8137c576>] mmput+0xd6/0x3c0 kernel/fork.c:790 > [< inline >] exit_mm kernel/exit.c:512 > [<ffffffff813940e8>] do_exit+0x738/0x2e70 kernel/exit.c:815 > [<ffffffff81396998>] do_group_exit+0x108/0x330 kernel/exit.c:958 > [<ffffffff813ba4aa>] get_signal+0x62a/0x15d0 kernel/signal.c:2307 > [<ffffffff81202333>] do_signal+0x83/0x1f60 arch/x86/kernel/signal.c:805 > [<ffffffff81006345>] exit_to_usermode_loop+0x1a5/0x210 > arch/x86/entry/common.c:163 > [< inline >] prepare_exit_to_usermode arch/x86/entry/common.c:198 > [< inline >] syscall_return_slowpath arch/x86/entry/common.c:267 > [<ffffffff81008be0>] do_syscall_64+0x4c0/0x640 arch/x86/entry/common.c:293 > [<ffffffff86e107c3>] entry_SYSCALL64_slow_path+0x25/0x25 > Code: cb a2 ce fb 8b 0d f5 21 70 05 48 0f af 4d d0 48 85 c9 0f 88 6a > 01 00 00 48 89 4d a8 e8 ae a2 ce fb 48 8b 45 c0 48 8b 4d a8 48 99 <48> > f7 f9 48 01 45 d0 e8 98 a2 ce fb 4c 89 fa 48 b8 00 00 00 00 > RIP [< inline >] ktime_divns include/linux/ktime.h:195 > RIP [<ffffffff858e1a6c>] snd_hrtimer_callback+0x1bc/0x3c0 > sound/core/hrtimer.c:62 > RSP <ffff88003ed07dd8> > ---[ end trace e1a6d1e5a6136d50 ]--- > > > or: > > INFO: rcu_sched detected stalls on CPUs/tasks: > 0-...: (1 GPs behind) idle=203/140000000000001/0 softirq=923090/923091 fqs=6490 > (detected by 1, t=26002 jiffies, g=508022, c=508021, q=20) > Task dump for CPU 0: > a.out R running task 28848 31756 4047 0x0000000a > 0000000000000000 0000000000000282 0000000000000000 ffff880066a9b430 > ffff880066a9b430 ffff8800667cfae0 0000000000000282 0000000000000000 > 0000000000000000 0000000000000000 000003a0184674dd dffffc0000000000 > Call Trace: > [< inline >] spin_unlock_irqrestore ./include/linux/spinlock.h:362 > [<ffffffff858db3dd>] snd_timer_start1+0xdd/0x740 sound/core/timer.c:477 > [<ffffffff858dbb25>] snd_timer_continue+0x45/0x80 sound/core/timer.c:606 > [< inline >] snd_timer_user_continue sound/core/timer.c:1835 > [< inline >] __snd_timer_user_ioctl sound/core/timer.c:1902 > [<ffffffff858df299>] snd_timer_user_ioctl+0xe99/0x2470 sound/core/timer.c:1917 > [< inline >] vfs_ioctl fs/ioctl.c:43 > [<ffffffff818a1dfc>] do_vfs_ioctl+0x18c/0x1080 fs/ioctl.c:675 > [< inline >] SYSC_ioctl fs/ioctl.c:690 > [<ffffffff818a2d7f>] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:681 > [<ffffffff86e10700>] entry_SYSCALL_64_fastpath+0x23/0xc1 > arch/x86/entry/entry_64.S:208 > > > > I am on 0f98f121e1670eaa2a2fbb675e07d6ba7f0e146f of linux-next with > the following diff in sound/core/timer.c (one is a fix for another > division by 0): > > diff --git a/sound/core/timer.c b/sound/core/timer.c > index 9a6157e..3e55c6d 100644 > --- a/sound/core/timer.c > +++ b/sound/core/timer.c > @@ -813,6 +813,7 @@ int snd_timer_new(struct snd_card *card, char *id, > struct snd_timer_id *tid, > timer->tmr_subdevice = tid->subdevice; > if (id) > strlcpy(timer->id, id, sizeof(timer->id)); > + timer->sticks = 1; > INIT_LIST_HEAD(&timer->device_list); > INIT_LIST_HEAD(&timer->open_list_head); > INIT_LIST_HEAD(&timer->active_list_head); > @@ -1958,6 +1959,7 @@ static ssize_t snd_timer_user_read(struct file > *file, char __user *buffer, > tu->qused--; > spin_unlock_irq(&tu->qlock); > > + mutex_lock(&tu->ioctl_lock); > if (tu->tread) { > if (copy_to_user(buffer, &tu->tqueue[qhead], > sizeof(struct snd_timer_tread))) > @@ -1967,6 +1969,7 @@ static ssize_t snd_timer_user_read(struct file > *file, char __user *buffer, > sizeof(struct snd_timer_read))) > err = -EFAULT; > } > + mutex_unlock(&tu->ioctl_lock); > > spin_lock_irq(&tu->qlock); > if (err < 0) > I couldn't trigger the bug any longer after merging for-linus branch of sound.git tree: git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git But I thought it got fixed by sticks=1 in the above. Maybe it's a combination, as it contains yet another timer fix by Vegard. The pull request will be sent in this week for 4.8-rc5. thanks, Takashi
On Tue, Sep 6, 2016 at 3:00 PM, Takashi Iwai <tiwai@suse.de> wrote: > On Sat, 03 Sep 2016 14:35:15 +0200, > Dmitry Vyukov wrote: >> >> Hello, >> >> The following program causes either division error or hangs kernel: >> >> https://gist.githubusercontent.com/dvyukov/b1f33be3bfcc15d629e10db483bee1e4/raw/1574b407e0456bf8277bee423e7731641f5f2c14/gistfile1.txt >> >> >> divide error: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN >> Dumping ftrace buffer: >> (ftrace buffer empty) >> Modules linked in: >> CPU: 1 PID: 27320 Comm: syz-executor Not tainted 4.8.0-rc3-next-20160825+ #8 >> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 >> task: ffff88003c9b2280 task.stack: ffff880027280000 >> RIP: 0010:[<ffffffff858e1a6c>] [< inline >] ktime_divns >> include/linux/ktime.h:195 >> RIP: 0010:[<ffffffff858e1a6c>] [<ffffffff858e1a6c>] >> snd_hrtimer_callback+0x1bc/0x3c0 sound/core/hrtimer.c:62 >> RSP: 0018:ffff88003ed07dd8 EFLAGS: 00010006 >> RAX: 0000000000004801 RBX: ffff88003d3a0a88 RCX: 0000000000000000 >> RDX: 0000000000000000 RSI: ffff88003d3a0aa0 RDI: ffffffff886b69a8 >> RBP: ffff88003ed07e30 R08: 0000010e85417665 R09: ffff88007fff7048 >> R10: ffff88007fff7058 R11: ffff88007fff7050 R12: ffff88003d00f840 >> R13: ffff88003d00f8f4 R14: ffff88003d00f970 R15: ffff88003d3a0a80 >> FS: 0000000000000000(0000) GS:ffff88003ed00000(0000) knlGS:0000000000000000 >> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >> CR2: 00007f4152126db8 CR3: 000000006b896000 CR4: 00000000000006e0 >> DR0: 000000000000001e DR1: 000000000000001e DR2: 0000000000000000 >> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600 >> Stack: >> 0000000000000000 ffff88003d3a0aa0 ffff88003d00f8f8 0000000000004801 >> ffff88003d3a0ae0 0000000000000000 ffff88003d3a0a88 0000000000000001 >> ffff88003ed1b400 dffffc0000000000 0000000000000001 ffff88003ed07f28 >> Call Trace: >> <IRQ> >> [< inline >] __run_hrtimer kernel/time/hrtimer.c:1238 >> [<ffffffff81504335>] __hrtimer_run_queues+0x325/0xe70 >> kernel/time/hrtimer.c:1302 >> [<ffffffff81506ceb>] hrtimer_interrupt+0x18b/0x420 kernel/time/hrtimer.c:1336 >> [<ffffffff8126d8df>] local_apic_timer_interrupt+0x6f/0xe0 >> arch/x86/kernel/apic/apic.c:933 >> [<ffffffff86e13056>] smp_apic_timer_interrupt+0x76/0xa0 >> arch/x86/kernel/apic/apic.c:957 >> [<ffffffff86e1210c>] apic_timer_interrupt+0x8c/0xa0 >> arch/x86/entry/entry_64.S:487 >> <EOI> >> [< inline >] rcu_lock_release include/linux/rcupdate.h:494 >> [< inline >] rcu_read_unlock include/linux/rcupdate.h:927 >> [<ffffffff81839f55>] unlock_page_memcg+0x95/0x130 mm/memcontrol.c:1680 >> [< inline >] page_remove_file_rmap mm/rmap.c:1343 >> [<ffffffff817b742c>] page_remove_rmap+0x1cc/0x8e0 mm/rmap.c:1394 >> [< inline >] zap_pte_range mm/memory.c:1170 >> [< inline >] zap_pmd_range mm/memory.c:1257 >> [< inline >] zap_pud_range mm/memory.c:1278 >> [<ffffffff81789119>] unmap_page_range+0xf89/0x1ba0 mm/memory.c:1299 >> [<ffffffff81789e31>] unmap_single_vma+0x101/0x260 mm/memory.c:1344 >> [<ffffffff8178a741>] unmap_vmas+0xf1/0x1b0 >> [<ffffffff817a5aab>] exit_mmap+0x22b/0x420 mm/mmap.c:2787 >> [< inline >] __mmput kernel/fork.c:770 >> [<ffffffff8137c576>] mmput+0xd6/0x3c0 kernel/fork.c:790 >> [< inline >] exit_mm kernel/exit.c:512 >> [<ffffffff813940e8>] do_exit+0x738/0x2e70 kernel/exit.c:815 >> [<ffffffff81396998>] do_group_exit+0x108/0x330 kernel/exit.c:958 >> [<ffffffff813ba4aa>] get_signal+0x62a/0x15d0 kernel/signal.c:2307 >> [<ffffffff81202333>] do_signal+0x83/0x1f60 arch/x86/kernel/signal.c:805 >> [<ffffffff81006345>] exit_to_usermode_loop+0x1a5/0x210 >> arch/x86/entry/common.c:163 >> [< inline >] prepare_exit_to_usermode arch/x86/entry/common.c:198 >> [< inline >] syscall_return_slowpath arch/x86/entry/common.c:267 >> [<ffffffff81008be0>] do_syscall_64+0x4c0/0x640 arch/x86/entry/common.c:293 >> [<ffffffff86e107c3>] entry_SYSCALL64_slow_path+0x25/0x25 >> Code: cb a2 ce fb 8b 0d f5 21 70 05 48 0f af 4d d0 48 85 c9 0f 88 6a >> 01 00 00 48 89 4d a8 e8 ae a2 ce fb 48 8b 45 c0 48 8b 4d a8 48 99 <48> >> f7 f9 48 01 45 d0 e8 98 a2 ce fb 4c 89 fa 48 b8 00 00 00 00 >> RIP [< inline >] ktime_divns include/linux/ktime.h:195 >> RIP [<ffffffff858e1a6c>] snd_hrtimer_callback+0x1bc/0x3c0 >> sound/core/hrtimer.c:62 >> RSP <ffff88003ed07dd8> >> ---[ end trace e1a6d1e5a6136d50 ]--- >> >> >> or: >> >> INFO: rcu_sched detected stalls on CPUs/tasks: >> 0-...: (1 GPs behind) idle=203/140000000000001/0 softirq=923090/923091 fqs=6490 >> (detected by 1, t=26002 jiffies, g=508022, c=508021, q=20) >> Task dump for CPU 0: >> a.out R running task 28848 31756 4047 0x0000000a >> 0000000000000000 0000000000000282 0000000000000000 ffff880066a9b430 >> ffff880066a9b430 ffff8800667cfae0 0000000000000282 0000000000000000 >> 0000000000000000 0000000000000000 000003a0184674dd dffffc0000000000 >> Call Trace: >> [< inline >] spin_unlock_irqrestore ./include/linux/spinlock.h:362 >> [<ffffffff858db3dd>] snd_timer_start1+0xdd/0x740 sound/core/timer.c:477 >> [<ffffffff858dbb25>] snd_timer_continue+0x45/0x80 sound/core/timer.c:606 >> [< inline >] snd_timer_user_continue sound/core/timer.c:1835 >> [< inline >] __snd_timer_user_ioctl sound/core/timer.c:1902 >> [<ffffffff858df299>] snd_timer_user_ioctl+0xe99/0x2470 sound/core/timer.c:1917 >> [< inline >] vfs_ioctl fs/ioctl.c:43 >> [<ffffffff818a1dfc>] do_vfs_ioctl+0x18c/0x1080 fs/ioctl.c:675 >> [< inline >] SYSC_ioctl fs/ioctl.c:690 >> [<ffffffff818a2d7f>] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:681 >> [<ffffffff86e10700>] entry_SYSCALL_64_fastpath+0x23/0xc1 >> arch/x86/entry/entry_64.S:208 >> >> >> >> I am on 0f98f121e1670eaa2a2fbb675e07d6ba7f0e146f of linux-next with >> the following diff in sound/core/timer.c (one is a fix for another >> division by 0): >> >> diff --git a/sound/core/timer.c b/sound/core/timer.c >> index 9a6157e..3e55c6d 100644 >> --- a/sound/core/timer.c >> +++ b/sound/core/timer.c >> @@ -813,6 +813,7 @@ int snd_timer_new(struct snd_card *card, char *id, >> struct snd_timer_id *tid, >> timer->tmr_subdevice = tid->subdevice; >> if (id) >> strlcpy(timer->id, id, sizeof(timer->id)); >> + timer->sticks = 1; >> INIT_LIST_HEAD(&timer->device_list); >> INIT_LIST_HEAD(&timer->open_list_head); >> INIT_LIST_HEAD(&timer->active_list_head); >> @@ -1958,6 +1959,7 @@ static ssize_t snd_timer_user_read(struct file >> *file, char __user *buffer, >> tu->qused--; >> spin_unlock_irq(&tu->qlock); >> >> + mutex_lock(&tu->ioctl_lock); >> if (tu->tread) { >> if (copy_to_user(buffer, &tu->tqueue[qhead], >> sizeof(struct snd_timer_tread))) >> @@ -1967,6 +1969,7 @@ static ssize_t snd_timer_user_read(struct file >> *file, char __user *buffer, >> sizeof(struct snd_timer_read))) >> err = -EFAULT; >> } >> + mutex_unlock(&tu->ioctl_lock); >> >> spin_lock_irq(&tu->qlock); >> if (err < 0) >> > > I couldn't trigger the bug any longer after merging for-linus branch > of sound.git tree: > git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git > > But I thought it got fixed by sticks=1 in the above. Maybe it's a > combination, as it contains yet another timer fix by Vegard. You mean that my local changes are a combination of two patches? But I would not expect that addition of mutex_lock/unlock can affect division by 0. Or you mean that there some other fix by Vegard? What is it? > The pull request will be sent in this week for 4.8-rc5.
On Tue, 06 Sep 2016 15:09:25 +0200, Dmitry Vyukov wrote: > > On Tue, Sep 6, 2016 at 3:00 PM, Takashi Iwai <tiwai@suse.de> wrote: > > On Sat, 03 Sep 2016 14:35:15 +0200, > > Dmitry Vyukov wrote: > >> > >> Hello, > >> > >> The following program causes either division error or hangs kernel: > >> > >> https://gist.githubusercontent.com/dvyukov/b1f33be3bfcc15d629e10db483bee1e4/raw/1574b407e0456bf8277bee423e7731641f5f2c14/gistfile1.txt > >> > >> > >> divide error: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN > >> Dumping ftrace buffer: > >> (ftrace buffer empty) > >> Modules linked in: > >> CPU: 1 PID: 27320 Comm: syz-executor Not tainted 4.8.0-rc3-next-20160825+ #8 > >> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 > >> task: ffff88003c9b2280 task.stack: ffff880027280000 > >> RIP: 0010:[<ffffffff858e1a6c>] [< inline >] ktime_divns > >> include/linux/ktime.h:195 > >> RIP: 0010:[<ffffffff858e1a6c>] [<ffffffff858e1a6c>] > >> snd_hrtimer_callback+0x1bc/0x3c0 sound/core/hrtimer.c:62 > >> RSP: 0018:ffff88003ed07dd8 EFLAGS: 00010006 > >> RAX: 0000000000004801 RBX: ffff88003d3a0a88 RCX: 0000000000000000 > >> RDX: 0000000000000000 RSI: ffff88003d3a0aa0 RDI: ffffffff886b69a8 > >> RBP: ffff88003ed07e30 R08: 0000010e85417665 R09: ffff88007fff7048 > >> R10: ffff88007fff7058 R11: ffff88007fff7050 R12: ffff88003d00f840 > >> R13: ffff88003d00f8f4 R14: ffff88003d00f970 R15: ffff88003d3a0a80 > >> FS: 0000000000000000(0000) GS:ffff88003ed00000(0000) knlGS:0000000000000000 > >> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > >> CR2: 00007f4152126db8 CR3: 000000006b896000 CR4: 00000000000006e0 > >> DR0: 000000000000001e DR1: 000000000000001e DR2: 0000000000000000 > >> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600 > >> Stack: > >> 0000000000000000 ffff88003d3a0aa0 ffff88003d00f8f8 0000000000004801 > >> ffff88003d3a0ae0 0000000000000000 ffff88003d3a0a88 0000000000000001 > >> ffff88003ed1b400 dffffc0000000000 0000000000000001 ffff88003ed07f28 > >> Call Trace: > >> <IRQ> > >> [< inline >] __run_hrtimer kernel/time/hrtimer.c:1238 > >> [<ffffffff81504335>] __hrtimer_run_queues+0x325/0xe70 > >> kernel/time/hrtimer.c:1302 > >> [<ffffffff81506ceb>] hrtimer_interrupt+0x18b/0x420 kernel/time/hrtimer.c:1336 > >> [<ffffffff8126d8df>] local_apic_timer_interrupt+0x6f/0xe0 > >> arch/x86/kernel/apic/apic.c:933 > >> [<ffffffff86e13056>] smp_apic_timer_interrupt+0x76/0xa0 > >> arch/x86/kernel/apic/apic.c:957 > >> [<ffffffff86e1210c>] apic_timer_interrupt+0x8c/0xa0 > >> arch/x86/entry/entry_64.S:487 > >> <EOI> > >> [< inline >] rcu_lock_release include/linux/rcupdate.h:494 > >> [< inline >] rcu_read_unlock include/linux/rcupdate.h:927 > >> [<ffffffff81839f55>] unlock_page_memcg+0x95/0x130 mm/memcontrol.c:1680 > >> [< inline >] page_remove_file_rmap mm/rmap.c:1343 > >> [<ffffffff817b742c>] page_remove_rmap+0x1cc/0x8e0 mm/rmap.c:1394 > >> [< inline >] zap_pte_range mm/memory.c:1170 > >> [< inline >] zap_pmd_range mm/memory.c:1257 > >> [< inline >] zap_pud_range mm/memory.c:1278 > >> [<ffffffff81789119>] unmap_page_range+0xf89/0x1ba0 mm/memory.c:1299 > >> [<ffffffff81789e31>] unmap_single_vma+0x101/0x260 mm/memory.c:1344 > >> [<ffffffff8178a741>] unmap_vmas+0xf1/0x1b0 > >> [<ffffffff817a5aab>] exit_mmap+0x22b/0x420 mm/mmap.c:2787 > >> [< inline >] __mmput kernel/fork.c:770 > >> [<ffffffff8137c576>] mmput+0xd6/0x3c0 kernel/fork.c:790 > >> [< inline >] exit_mm kernel/exit.c:512 > >> [<ffffffff813940e8>] do_exit+0x738/0x2e70 kernel/exit.c:815 > >> [<ffffffff81396998>] do_group_exit+0x108/0x330 kernel/exit.c:958 > >> [<ffffffff813ba4aa>] get_signal+0x62a/0x15d0 kernel/signal.c:2307 > >> [<ffffffff81202333>] do_signal+0x83/0x1f60 arch/x86/kernel/signal.c:805 > >> [<ffffffff81006345>] exit_to_usermode_loop+0x1a5/0x210 > >> arch/x86/entry/common.c:163 > >> [< inline >] prepare_exit_to_usermode arch/x86/entry/common.c:198 > >> [< inline >] syscall_return_slowpath arch/x86/entry/common.c:267 > >> [<ffffffff81008be0>] do_syscall_64+0x4c0/0x640 arch/x86/entry/common.c:293 > >> [<ffffffff86e107c3>] entry_SYSCALL64_slow_path+0x25/0x25 > >> Code: cb a2 ce fb 8b 0d f5 21 70 05 48 0f af 4d d0 48 85 c9 0f 88 6a > >> 01 00 00 48 89 4d a8 e8 ae a2 ce fb 48 8b 45 c0 48 8b 4d a8 48 99 <48> > >> f7 f9 48 01 45 d0 e8 98 a2 ce fb 4c 89 fa 48 b8 00 00 00 00 > >> RIP [< inline >] ktime_divns include/linux/ktime.h:195 > >> RIP [<ffffffff858e1a6c>] snd_hrtimer_callback+0x1bc/0x3c0 > >> sound/core/hrtimer.c:62 > >> RSP <ffff88003ed07dd8> > >> ---[ end trace e1a6d1e5a6136d50 ]--- > >> > >> > >> or: > >> > >> INFO: rcu_sched detected stalls on CPUs/tasks: > >> 0-...: (1 GPs behind) idle=203/140000000000001/0 softirq=923090/923091 fqs=6490 > >> (detected by 1, t=26002 jiffies, g=508022, c=508021, q=20) > >> Task dump for CPU 0: > >> a.out R running task 28848 31756 4047 0x0000000a > >> 0000000000000000 0000000000000282 0000000000000000 ffff880066a9b430 > >> ffff880066a9b430 ffff8800667cfae0 0000000000000282 0000000000000000 > >> 0000000000000000 0000000000000000 000003a0184674dd dffffc0000000000 > >> Call Trace: > >> [< inline >] spin_unlock_irqrestore ./include/linux/spinlock.h:362 > >> [<ffffffff858db3dd>] snd_timer_start1+0xdd/0x740 sound/core/timer.c:477 > >> [<ffffffff858dbb25>] snd_timer_continue+0x45/0x80 sound/core/timer.c:606 > >> [< inline >] snd_timer_user_continue sound/core/timer.c:1835 > >> [< inline >] __snd_timer_user_ioctl sound/core/timer.c:1902 > >> [<ffffffff858df299>] snd_timer_user_ioctl+0xe99/0x2470 sound/core/timer.c:1917 > >> [< inline >] vfs_ioctl fs/ioctl.c:43 > >> [<ffffffff818a1dfc>] do_vfs_ioctl+0x18c/0x1080 fs/ioctl.c:675 > >> [< inline >] SYSC_ioctl fs/ioctl.c:690 > >> [<ffffffff818a2d7f>] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:681 > >> [<ffffffff86e10700>] entry_SYSCALL_64_fastpath+0x23/0xc1 > >> arch/x86/entry/entry_64.S:208 > >> > >> > >> > >> I am on 0f98f121e1670eaa2a2fbb675e07d6ba7f0e146f of linux-next with > >> the following diff in sound/core/timer.c (one is a fix for another > >> division by 0): > >> > >> diff --git a/sound/core/timer.c b/sound/core/timer.c > >> index 9a6157e..3e55c6d 100644 > >> --- a/sound/core/timer.c > >> +++ b/sound/core/timer.c > >> @@ -813,6 +813,7 @@ int snd_timer_new(struct snd_card *card, char *id, > >> struct snd_timer_id *tid, > >> timer->tmr_subdevice = tid->subdevice; > >> if (id) > >> strlcpy(timer->id, id, sizeof(timer->id)); > >> + timer->sticks = 1; > >> INIT_LIST_HEAD(&timer->device_list); > >> INIT_LIST_HEAD(&timer->open_list_head); > >> INIT_LIST_HEAD(&timer->active_list_head); > >> @@ -1958,6 +1959,7 @@ static ssize_t snd_timer_user_read(struct file > >> *file, char __user *buffer, > >> tu->qused--; > >> spin_unlock_irq(&tu->qlock); > >> > >> + mutex_lock(&tu->ioctl_lock); > >> if (tu->tread) { > >> if (copy_to_user(buffer, &tu->tqueue[qhead], > >> sizeof(struct snd_timer_tread))) > >> @@ -1967,6 +1969,7 @@ static ssize_t snd_timer_user_read(struct file > >> *file, char __user *buffer, > >> sizeof(struct snd_timer_read))) > >> err = -EFAULT; > >> } > >> + mutex_unlock(&tu->ioctl_lock); > >> > >> spin_lock_irq(&tu->qlock); > >> if (err < 0) > >> > > > > I couldn't trigger the bug any longer after merging for-linus branch > > of sound.git tree: > > git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git > > > > But I thought it got fixed by sticks=1 in the above. Maybe it's a > > combination, as it contains yet another timer fix by Vegard. > > > You mean that my local changes are a combination of two patches? But I > would not expect that addition of mutex_lock/unlock can affect > division by 0. > Or you mean that there some other fix by Vegard? What is it? There is another fix 8ddc05638ee42b18ba4fe99b5fb647fa3ad20456 ALSA: timer: fix NULL pointer dereference on memory allocation failure But it's likely irrelevant. I retested now, and could see the latter stack trace, the rcu tall, while running the program, although it doesn't lead to a complete freeze and the system is restored once after the program quits. Do you really see the zero-division error with sticks=1 fix? That's unexpected. Takashi
On Tue, Sep 6, 2016 at 3:43 PM, Takashi Iwai <tiwai@suse.de> wrote: > On Tue, 06 Sep 2016 15:09:25 +0200, > Dmitry Vyukov wrote: >> >> On Tue, Sep 6, 2016 at 3:00 PM, Takashi Iwai <tiwai@suse.de> wrote: >> > On Sat, 03 Sep 2016 14:35:15 +0200, >> > Dmitry Vyukov wrote: >> >> >> >> Hello, >> >> >> >> The following program causes either division error or hangs kernel: >> >> >> >> https://gist.githubusercontent.com/dvyukov/b1f33be3bfcc15d629e10db483bee1e4/raw/1574b407e0456bf8277bee423e7731641f5f2c14/gistfile1.txt >> >> >> >> >> >> divide error: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN >> >> Dumping ftrace buffer: >> >> (ftrace buffer empty) >> >> Modules linked in: >> >> CPU: 1 PID: 27320 Comm: syz-executor Not tainted 4.8.0-rc3-next-20160825+ #8 >> >> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 >> >> task: ffff88003c9b2280 task.stack: ffff880027280000 >> >> RIP: 0010:[<ffffffff858e1a6c>] [< inline >] ktime_divns >> >> include/linux/ktime.h:195 >> >> RIP: 0010:[<ffffffff858e1a6c>] [<ffffffff858e1a6c>] >> >> snd_hrtimer_callback+0x1bc/0x3c0 sound/core/hrtimer.c:62 >> >> RSP: 0018:ffff88003ed07dd8 EFLAGS: 00010006 >> >> RAX: 0000000000004801 RBX: ffff88003d3a0a88 RCX: 0000000000000000 >> >> RDX: 0000000000000000 RSI: ffff88003d3a0aa0 RDI: ffffffff886b69a8 >> >> RBP: ffff88003ed07e30 R08: 0000010e85417665 R09: ffff88007fff7048 >> >> R10: ffff88007fff7058 R11: ffff88007fff7050 R12: ffff88003d00f840 >> >> R13: ffff88003d00f8f4 R14: ffff88003d00f970 R15: ffff88003d3a0a80 >> >> FS: 0000000000000000(0000) GS:ffff88003ed00000(0000) knlGS:0000000000000000 >> >> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >> >> CR2: 00007f4152126db8 CR3: 000000006b896000 CR4: 00000000000006e0 >> >> DR0: 000000000000001e DR1: 000000000000001e DR2: 0000000000000000 >> >> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600 >> >> Stack: >> >> 0000000000000000 ffff88003d3a0aa0 ffff88003d00f8f8 0000000000004801 >> >> ffff88003d3a0ae0 0000000000000000 ffff88003d3a0a88 0000000000000001 >> >> ffff88003ed1b400 dffffc0000000000 0000000000000001 ffff88003ed07f28 >> >> Call Trace: >> >> <IRQ> >> >> [< inline >] __run_hrtimer kernel/time/hrtimer.c:1238 >> >> [<ffffffff81504335>] __hrtimer_run_queues+0x325/0xe70 >> >> kernel/time/hrtimer.c:1302 >> >> [<ffffffff81506ceb>] hrtimer_interrupt+0x18b/0x420 kernel/time/hrtimer.c:1336 >> >> [<ffffffff8126d8df>] local_apic_timer_interrupt+0x6f/0xe0 >> >> arch/x86/kernel/apic/apic.c:933 >> >> [<ffffffff86e13056>] smp_apic_timer_interrupt+0x76/0xa0 >> >> arch/x86/kernel/apic/apic.c:957 >> >> [<ffffffff86e1210c>] apic_timer_interrupt+0x8c/0xa0 >> >> arch/x86/entry/entry_64.S:487 >> >> <EOI> >> >> [< inline >] rcu_lock_release include/linux/rcupdate.h:494 >> >> [< inline >] rcu_read_unlock include/linux/rcupdate.h:927 >> >> [<ffffffff81839f55>] unlock_page_memcg+0x95/0x130 mm/memcontrol.c:1680 >> >> [< inline >] page_remove_file_rmap mm/rmap.c:1343 >> >> [<ffffffff817b742c>] page_remove_rmap+0x1cc/0x8e0 mm/rmap.c:1394 >> >> [< inline >] zap_pte_range mm/memory.c:1170 >> >> [< inline >] zap_pmd_range mm/memory.c:1257 >> >> [< inline >] zap_pud_range mm/memory.c:1278 >> >> [<ffffffff81789119>] unmap_page_range+0xf89/0x1ba0 mm/memory.c:1299 >> >> [<ffffffff81789e31>] unmap_single_vma+0x101/0x260 mm/memory.c:1344 >> >> [<ffffffff8178a741>] unmap_vmas+0xf1/0x1b0 >> >> [<ffffffff817a5aab>] exit_mmap+0x22b/0x420 mm/mmap.c:2787 >> >> [< inline >] __mmput kernel/fork.c:770 >> >> [<ffffffff8137c576>] mmput+0xd6/0x3c0 kernel/fork.c:790 >> >> [< inline >] exit_mm kernel/exit.c:512 >> >> [<ffffffff813940e8>] do_exit+0x738/0x2e70 kernel/exit.c:815 >> >> [<ffffffff81396998>] do_group_exit+0x108/0x330 kernel/exit.c:958 >> >> [<ffffffff813ba4aa>] get_signal+0x62a/0x15d0 kernel/signal.c:2307 >> >> [<ffffffff81202333>] do_signal+0x83/0x1f60 arch/x86/kernel/signal.c:805 >> >> [<ffffffff81006345>] exit_to_usermode_loop+0x1a5/0x210 >> >> arch/x86/entry/common.c:163 >> >> [< inline >] prepare_exit_to_usermode arch/x86/entry/common.c:198 >> >> [< inline >] syscall_return_slowpath arch/x86/entry/common.c:267 >> >> [<ffffffff81008be0>] do_syscall_64+0x4c0/0x640 arch/x86/entry/common.c:293 >> >> [<ffffffff86e107c3>] entry_SYSCALL64_slow_path+0x25/0x25 >> >> Code: cb a2 ce fb 8b 0d f5 21 70 05 48 0f af 4d d0 48 85 c9 0f 88 6a >> >> 01 00 00 48 89 4d a8 e8 ae a2 ce fb 48 8b 45 c0 48 8b 4d a8 48 99 <48> >> >> f7 f9 48 01 45 d0 e8 98 a2 ce fb 4c 89 fa 48 b8 00 00 00 00 >> >> RIP [< inline >] ktime_divns include/linux/ktime.h:195 >> >> RIP [<ffffffff858e1a6c>] snd_hrtimer_callback+0x1bc/0x3c0 >> >> sound/core/hrtimer.c:62 >> >> RSP <ffff88003ed07dd8> >> >> ---[ end trace e1a6d1e5a6136d50 ]--- >> >> >> >> >> >> or: >> >> >> >> INFO: rcu_sched detected stalls on CPUs/tasks: >> >> 0-...: (1 GPs behind) idle=203/140000000000001/0 softirq=923090/923091 fqs=6490 >> >> (detected by 1, t=26002 jiffies, g=508022, c=508021, q=20) >> >> Task dump for CPU 0: >> >> a.out R running task 28848 31756 4047 0x0000000a >> >> 0000000000000000 0000000000000282 0000000000000000 ffff880066a9b430 >> >> ffff880066a9b430 ffff8800667cfae0 0000000000000282 0000000000000000 >> >> 0000000000000000 0000000000000000 000003a0184674dd dffffc0000000000 >> >> Call Trace: >> >> [< inline >] spin_unlock_irqrestore ./include/linux/spinlock.h:362 >> >> [<ffffffff858db3dd>] snd_timer_start1+0xdd/0x740 sound/core/timer.c:477 >> >> [<ffffffff858dbb25>] snd_timer_continue+0x45/0x80 sound/core/timer.c:606 >> >> [< inline >] snd_timer_user_continue sound/core/timer.c:1835 >> >> [< inline >] __snd_timer_user_ioctl sound/core/timer.c:1902 >> >> [<ffffffff858df299>] snd_timer_user_ioctl+0xe99/0x2470 sound/core/timer.c:1917 >> >> [< inline >] vfs_ioctl fs/ioctl.c:43 >> >> [<ffffffff818a1dfc>] do_vfs_ioctl+0x18c/0x1080 fs/ioctl.c:675 >> >> [< inline >] SYSC_ioctl fs/ioctl.c:690 >> >> [<ffffffff818a2d7f>] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:681 >> >> [<ffffffff86e10700>] entry_SYSCALL_64_fastpath+0x23/0xc1 >> >> arch/x86/entry/entry_64.S:208 >> >> >> >> >> >> >> >> I am on 0f98f121e1670eaa2a2fbb675e07d6ba7f0e146f of linux-next with >> >> the following diff in sound/core/timer.c (one is a fix for another >> >> division by 0): >> >> >> >> diff --git a/sound/core/timer.c b/sound/core/timer.c >> >> index 9a6157e..3e55c6d 100644 >> >> --- a/sound/core/timer.c >> >> +++ b/sound/core/timer.c >> >> @@ -813,6 +813,7 @@ int snd_timer_new(struct snd_card *card, char *id, >> >> struct snd_timer_id *tid, >> >> timer->tmr_subdevice = tid->subdevice; >> >> if (id) >> >> strlcpy(timer->id, id, sizeof(timer->id)); >> >> + timer->sticks = 1; >> >> INIT_LIST_HEAD(&timer->device_list); >> >> INIT_LIST_HEAD(&timer->open_list_head); >> >> INIT_LIST_HEAD(&timer->active_list_head); >> >> @@ -1958,6 +1959,7 @@ static ssize_t snd_timer_user_read(struct file >> >> *file, char __user *buffer, >> >> tu->qused--; >> >> spin_unlock_irq(&tu->qlock); >> >> >> >> + mutex_lock(&tu->ioctl_lock); >> >> if (tu->tread) { >> >> if (copy_to_user(buffer, &tu->tqueue[qhead], >> >> sizeof(struct snd_timer_tread))) >> >> @@ -1967,6 +1969,7 @@ static ssize_t snd_timer_user_read(struct file >> >> *file, char __user *buffer, >> >> sizeof(struct snd_timer_read))) >> >> err = -EFAULT; >> >> } >> >> + mutex_unlock(&tu->ioctl_lock); >> >> >> >> spin_lock_irq(&tu->qlock); >> >> if (err < 0) >> >> >> > >> > I couldn't trigger the bug any longer after merging for-linus branch >> > of sound.git tree: >> > git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git >> > >> > But I thought it got fixed by sticks=1 in the above. Maybe it's a >> > combination, as it contains yet another timer fix by Vegard. >> >> >> You mean that my local changes are a combination of two patches? But I >> would not expect that addition of mutex_lock/unlock can affect >> division by 0. >> Or you mean that there some other fix by Vegard? What is it? > > There is another fix > 8ddc05638ee42b18ba4fe99b5fb647fa3ad20456 > ALSA: timer: fix NULL pointer dereference on memory allocation failure > But it's likely irrelevant. > > I retested now, and could see the latter stack trace, the rcu tall, > while running the program, although it doesn't lead to a complete > freeze and the system is restored once after the program quits. > > Do you really see the zero-division error with sticks=1 fix? That's > unexpected. Yes, just reproduced it again. I am on 0693c28cfc8e25f18c25b65a8942c026f1854a3c of linux-next now, so I already have: commit 6b760bb2c63a9e322c0e4a0b5daf335ad93d5a33 ALSA: timer: fix division by zero after SNDRV_TIMER_IOCTL_CONTINUE Compiled the program verbatim and run it using stress utility (http s://godoc.org/golang.org/x/tools/cmd/stress namely just runs the program in parallel loop) and instantly get the following crash. Attaching my config just in case. I start qemu as: qemu-system-x86_64 -net user,host=10.0.2.10,hostfwd=tcp::10022-:22 -net nic \ -no-reboot -display none -serial stdio -kernel arch/x86/boot/bzImage \ -fsdev local,id=fsdev0,path=/,security_model=none,readonly \ -device virtio-9p-pci,fsdev=fsdev0,mount_tag=/dev/root \ -enable-kvm -pidfile vm_pid -m 2G -numa node,nodeid=0,cpus=0-1 -numa node,nodeid=1,cpus=2-3 -smp sockets=2,cores=2,threads=1 \ -usb -usbdevice mouse -usbdevice tablet -soundhw all \ -append "console=ttyS0 debug oops=panic panic_on_warn=1 panic=-1 earlyprintk=serial slub_debug=UZ root=/dev/root rootfstype=9p rootflags=trans=virtio,version=9p2000.L,cache=loose init=/usr/local/google/home/dvyukov/bin/9p/init.sh" sound dir is: -bash-4.3# ls -lt /dev/snd/ total 0 crw------- 1 root 0 116, 20 Sep 6 14:03 hwC4D0 crw------- 1 root 0 116, 19 Sep 6 14:03 pcmC4D0c crw------- 1 root 0 116, 18 Sep 6 14:03 pcmC4D0p crw------- 1 root 0 116, 17 Sep 6 14:03 controlC4 crw------- 1 root 0 116, 16 Sep 6 14:03 pcmC3D0p crw------- 1 root 0 116, 15 Sep 6 14:03 controlC3 crw------- 1 root 0 116, 14 Sep 6 14:03 midiC2D3 crw------- 1 root 0 116, 13 Sep 6 14:03 midiC2D2 crw------- 1 root 0 116, 12 Sep 6 14:03 midiC2D1 crw------- 1 root 0 116, 11 Sep 6 14:03 midiC2D0 crw------- 1 root 0 116, 10 Sep 6 14:03 controlC2 crw------- 1 root 0 116, 9 Sep 6 14:03 pcmC1D1c crw------- 1 root 0 116, 8 Sep 6 14:03 pcmC1D1p crw------- 1 root 0 116, 7 Sep 6 14:03 pcmC1D0c crw------- 1 root 0 116, 6 Sep 6 14:03 pcmC1D0p crw------- 1 root 0 116, 5 Sep 6 14:03 controlC1 crw------- 1 root 0 116, 4 Sep 6 14:03 pcmC0D0c crw------- 1 root 0 116, 3 Sep 6 14:03 pcmC0D0p crw------- 1 root 0 116, 2 Sep 6 14:03 controlC0 crw------- 1 root 0 116, 1 Sep 6 14:03 seq crw------- 1 root 0 116, 33 Sep 6 14:03 timer [ 134.435009] divide error: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN [ 134.435009] Modules linked in: [ 134.435009] CPU: 3 PID: 4965 Comm: a.out Not tainted 4.8.0-rc5-next-20160905+ #14 [ 134.435009] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 [ 134.435009] task: ffff880038b6e200 task.stack: ffff880036c48000 [ 134.435009] RIP: 0010:[<ffffffff858e147c>] [<ffffffff858e147c>] snd_hrtimer_callback+0x1bc/0x3c0 [ 134.435009] RSP: 0018:ffff88006d307dd8 EFLAGS: 00010006 [ 134.435009] RAX: 00000000002b5b0c RBX: ffff88006bb7d608 RCX: 0000000000000000 [ 134.435009] RDX: 0000000000000000 RSI: ffff88006bb7d620 RDI: ffffffff886b69a8 [ 134.435009] RBP: ffff88006d307e30 R08: 0000002066910785 R09: ffff88007fff70c8 [ 134.435009] R10: ffff88007fff70d8 R11: ffff88007fff70d0 R12: ffff880038ae3d00 [ 134.435009] R13: ffff880038ae3db4 R14: ffff880038ae3e30 R15: ffff88006bb7d600 [ 134.435009] FS: 00000000009f6940(0000) GS:ffff88006d300000(0000) knlGS:0000000000000000 [ 134.435009] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 134.435009] CR2: 000000002000ffb0 CR3: 000000006a816000 CR4: 00000000000006e0 [ 134.435009] Stack: [ 134.435009] 0000000000000000 ffff88006bb7d620 ffff880038ae3db8 00000000002b5b0c [ 134.435009] ffff88006bb7d660 0000000000000000 ffff88006bb7d608 0000000000000003 [ 134.435009] ffff88006d31b400 dffffc0000000000 0000000000000003 ffff88006d307f28 [ 134.447443] Call Trace: [ 134.447443] <IRQ> [ 134.447443] [<ffffffff81504425>] __hrtimer_run_queues+0x325/0xe70 [ 134.447443] [<ffffffff858e12c0>] ? snd_hrtimer_close+0x130/0x130 [ 134.447443] [<ffffffff81504100>] ? enqueue_hrtimer+0x3d0/0x3d0 [ 134.447443] [<ffffffff81506ddb>] hrtimer_interrupt+0x18b/0x420 [ 134.447443] [<ffffffff8126ceff>] local_apic_timer_interrupt+0x6f/0xe0 [ 134.450051] [<ffffffff86e1d216>] smp_apic_timer_interrupt+0x76/0xa0 [ 134.450051] [<ffffffff86e1c2cc>] apic_timer_interrupt+0x8c/0xa0 [ 134.450051] <EOI> [ 134.450051] [<ffffffff86e1a63e>] ? _raw_spin_unlock_irqrestore+0x5e/0xc0 [ 134.450051] [<ffffffff858daded>] snd_timer_start1+0xdd/0x740 [ 134.450051] [<ffffffff858db535>] snd_timer_continue+0x45/0x80 [ 134.450051] [<ffffffff858deca9>] snd_timer_user_ioctl+0xe99/0x2470 [ 134.450051] [<ffffffff858dde10>] ? snd_timer_user_params.isra.22+0xa30/0xa30 [ 134.450051] [<ffffffff8183bca0>] ? mem_cgroup_count_precharge_pte_range+0x530/0x530 [ 134.450051] [<ffffffff8183c03f>] ? get_mem_cgroup_from_mm+0x39f/0x4a0 [ 134.450051] [<ffffffff817129b4>] ? __lru_cache_add+0x174/0x210 [ 134.450051] [<ffffffff81715f49>] ? lru_cache_add+0xd9/0x1e0 [ 134.450051] [<ffffffff81791a30>] ? handle_mm_fault+0x28a0/0x34e0 [ 134.450051] [<ffffffff86e19c92>] ? _raw_spin_unlock+0x22/0x30 [ 134.450051] [<ffffffff8178fb40>] ? handle_mm_fault+0x9b0/0x34e0 [ 134.450051] [<ffffffff8178f2ca>] ? handle_mm_fault+0x13a/0x34e0 [ 134.450051] [<ffffffff8178f190>] ? __pmd_alloc+0x3f0/0x3f0 [ 134.450051] [<ffffffff818c902d>] ? __fd_install+0x24d/0x590 [ 134.450051] [<ffffffff818c8de0>] ? get_unused_fd_flags+0xd0/0xd0 [ 134.450051] [<ffffffff858dde10>] ? snd_timer_user_params.isra.22+0xa30/0xa30 [ 134.450051] [<ffffffff818a331c>] do_vfs_ioctl+0x18c/0x1080 [ 134.450051] [<ffffffff8129d19c>] ? __do_page_fault+0x2cc/0xbb0 [ 134.450051] [<ffffffff818a3190>] ? ioctl_preallocate+0x210/0x210 [ 134.450051] [<ffffffff8129d349>] ? __do_page_fault+0x479/0xbb0 [ 134.450051] [<ffffffff8148ef0a>] ? up_read+0x1a/0x40 [ 134.450051] [<ffffffff8129d068>] ? __do_page_fault+0x198/0xbb0 [ 134.450051] [<ffffffff82ad1dc9>] ? security_file_ioctl+0x89/0xb0 [ 134.450051] [<ffffffff818a429f>] SyS_ioctl+0x8f/0xc0 [ 134.450051] [<ffffffff86e1a8c0>] entry_SYSCALL_64_fastpath+0x23/0xc1 [ 134.450051] Code: 7b ad ce fb 8b 0d e5 f1 6f 05 48 0f af 4d d0 48 85 c9 0f 88 6a 01 00 00 48 89 4d a8 e8 5e ad ce fb 48 8b 45 c0 48 8b 4d a8 48 99 <48> f7 f9 48 01 45 d0 e8 48 ad ce fb 4c 89 fa 48 b8 00 00 00 00 [ 134.450051] RIP [<ffffffff858e147c>] snd_hrtimer_callback+0x1bc/0x3c0 [ 134.450051] RSP <ffff88006d307dd8> [ 134.450051] ---[ end trace 8614aafe29f5fee0 ]--- [ 134.450051] Kernel panic - not syncing: Fatal exception in interrupt [ 134.450051] Kernel Offset: disabled
On Tue, Sep 6, 2016 at 3:43 PM, Takashi Iwai <tiwai@suse.de> wrote: > On Tue, 06 Sep 2016 15:09:25 +0200, > Dmitry Vyukov wrote: >> >> On Tue, Sep 6, 2016 at 3:00 PM, Takashi Iwai <tiwai@suse.de> wrote: >> > On Sat, 03 Sep 2016 14:35:15 +0200, >> > Dmitry Vyukov wrote: >> >> >> >> Hello, >> >> >> >> The following program causes either division error or hangs kernel: >> >> >> >> https://gist.githubusercontent.com/dvyukov/b1f33be3bfcc15d629e10db483bee1e4/raw/1574b407e0456bf8277bee423e7731641f5f2c14/gistfile1.txt >> >> >> >> >> >> divide error: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN >> >> Dumping ftrace buffer: >> >> (ftrace buffer empty) >> >> Modules linked in: >> >> CPU: 1 PID: 27320 Comm: syz-executor Not tainted 4.8.0-rc3-next-20160825+ #8 >> >> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 >> >> task: ffff88003c9b2280 task.stack: ffff880027280000 >> >> RIP: 0010:[<ffffffff858e1a6c>] [< inline >] ktime_divns >> >> include/linux/ktime.h:195 >> >> RIP: 0010:[<ffffffff858e1a6c>] [<ffffffff858e1a6c>] >> >> snd_hrtimer_callback+0x1bc/0x3c0 sound/core/hrtimer.c:62 >> >> RSP: 0018:ffff88003ed07dd8 EFLAGS: 00010006 >> >> RAX: 0000000000004801 RBX: ffff88003d3a0a88 RCX: 0000000000000000 >> >> RDX: 0000000000000000 RSI: ffff88003d3a0aa0 RDI: ffffffff886b69a8 >> >> RBP: ffff88003ed07e30 R08: 0000010e85417665 R09: ffff88007fff7048 >> >> R10: ffff88007fff7058 R11: ffff88007fff7050 R12: ffff88003d00f840 >> >> R13: ffff88003d00f8f4 R14: ffff88003d00f970 R15: ffff88003d3a0a80 >> >> FS: 0000000000000000(0000) GS:ffff88003ed00000(0000) knlGS:0000000000000000 >> >> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >> >> CR2: 00007f4152126db8 CR3: 000000006b896000 CR4: 00000000000006e0 >> >> DR0: 000000000000001e DR1: 000000000000001e DR2: 0000000000000000 >> >> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600 >> >> Stack: >> >> 0000000000000000 ffff88003d3a0aa0 ffff88003d00f8f8 0000000000004801 >> >> ffff88003d3a0ae0 0000000000000000 ffff88003d3a0a88 0000000000000001 >> >> ffff88003ed1b400 dffffc0000000000 0000000000000001 ffff88003ed07f28 >> >> Call Trace: >> >> <IRQ> >> >> [< inline >] __run_hrtimer kernel/time/hrtimer.c:1238 >> >> [<ffffffff81504335>] __hrtimer_run_queues+0x325/0xe70 >> >> kernel/time/hrtimer.c:1302 >> >> [<ffffffff81506ceb>] hrtimer_interrupt+0x18b/0x420 kernel/time/hrtimer.c:1336 >> >> [<ffffffff8126d8df>] local_apic_timer_interrupt+0x6f/0xe0 >> >> arch/x86/kernel/apic/apic.c:933 >> >> [<ffffffff86e13056>] smp_apic_timer_interrupt+0x76/0xa0 >> >> arch/x86/kernel/apic/apic.c:957 >> >> [<ffffffff86e1210c>] apic_timer_interrupt+0x8c/0xa0 >> >> arch/x86/entry/entry_64.S:487 >> >> <EOI> >> >> [< inline >] rcu_lock_release include/linux/rcupdate.h:494 >> >> [< inline >] rcu_read_unlock include/linux/rcupdate.h:927 >> >> [<ffffffff81839f55>] unlock_page_memcg+0x95/0x130 mm/memcontrol.c:1680 >> >> [< inline >] page_remove_file_rmap mm/rmap.c:1343 >> >> [<ffffffff817b742c>] page_remove_rmap+0x1cc/0x8e0 mm/rmap.c:1394 >> >> [< inline >] zap_pte_range mm/memory.c:1170 >> >> [< inline >] zap_pmd_range mm/memory.c:1257 >> >> [< inline >] zap_pud_range mm/memory.c:1278 >> >> [<ffffffff81789119>] unmap_page_range+0xf89/0x1ba0 mm/memory.c:1299 >> >> [<ffffffff81789e31>] unmap_single_vma+0x101/0x260 mm/memory.c:1344 >> >> [<ffffffff8178a741>] unmap_vmas+0xf1/0x1b0 >> >> [<ffffffff817a5aab>] exit_mmap+0x22b/0x420 mm/mmap.c:2787 >> >> [< inline >] __mmput kernel/fork.c:770 >> >> [<ffffffff8137c576>] mmput+0xd6/0x3c0 kernel/fork.c:790 >> >> [< inline >] exit_mm kernel/exit.c:512 >> >> [<ffffffff813940e8>] do_exit+0x738/0x2e70 kernel/exit.c:815 >> >> [<ffffffff81396998>] do_group_exit+0x108/0x330 kernel/exit.c:958 >> >> [<ffffffff813ba4aa>] get_signal+0x62a/0x15d0 kernel/signal.c:2307 >> >> [<ffffffff81202333>] do_signal+0x83/0x1f60 arch/x86/kernel/signal.c:805 >> >> [<ffffffff81006345>] exit_to_usermode_loop+0x1a5/0x210 >> >> arch/x86/entry/common.c:163 >> >> [< inline >] prepare_exit_to_usermode arch/x86/entry/common.c:198 >> >> [< inline >] syscall_return_slowpath arch/x86/entry/common.c:267 >> >> [<ffffffff81008be0>] do_syscall_64+0x4c0/0x640 arch/x86/entry/common.c:293 >> >> [<ffffffff86e107c3>] entry_SYSCALL64_slow_path+0x25/0x25 >> >> Code: cb a2 ce fb 8b 0d f5 21 70 05 48 0f af 4d d0 48 85 c9 0f 88 6a >> >> 01 00 00 48 89 4d a8 e8 ae a2 ce fb 48 8b 45 c0 48 8b 4d a8 48 99 <48> >> >> f7 f9 48 01 45 d0 e8 98 a2 ce fb 4c 89 fa 48 b8 00 00 00 00 >> >> RIP [< inline >] ktime_divns include/linux/ktime.h:195 >> >> RIP [<ffffffff858e1a6c>] snd_hrtimer_callback+0x1bc/0x3c0 >> >> sound/core/hrtimer.c:62 >> >> RSP <ffff88003ed07dd8> >> >> ---[ end trace e1a6d1e5a6136d50 ]--- >> >> >> >> >> >> or: >> >> >> >> INFO: rcu_sched detected stalls on CPUs/tasks: >> >> 0-...: (1 GPs behind) idle=203/140000000000001/0 softirq=923090/923091 fqs=6490 >> >> (detected by 1, t=26002 jiffies, g=508022, c=508021, q=20) >> >> Task dump for CPU 0: >> >> a.out R running task 28848 31756 4047 0x0000000a >> >> 0000000000000000 0000000000000282 0000000000000000 ffff880066a9b430 >> >> ffff880066a9b430 ffff8800667cfae0 0000000000000282 0000000000000000 >> >> 0000000000000000 0000000000000000 000003a0184674dd dffffc0000000000 >> >> Call Trace: >> >> [< inline >] spin_unlock_irqrestore ./include/linux/spinlock.h:362 >> >> [<ffffffff858db3dd>] snd_timer_start1+0xdd/0x740 sound/core/timer.c:477 >> >> [<ffffffff858dbb25>] snd_timer_continue+0x45/0x80 sound/core/timer.c:606 >> >> [< inline >] snd_timer_user_continue sound/core/timer.c:1835 >> >> [< inline >] __snd_timer_user_ioctl sound/core/timer.c:1902 >> >> [<ffffffff858df299>] snd_timer_user_ioctl+0xe99/0x2470 sound/core/timer.c:1917 >> >> [< inline >] vfs_ioctl fs/ioctl.c:43 >> >> [<ffffffff818a1dfc>] do_vfs_ioctl+0x18c/0x1080 fs/ioctl.c:675 >> >> [< inline >] SYSC_ioctl fs/ioctl.c:690 >> >> [<ffffffff818a2d7f>] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:681 >> >> [<ffffffff86e10700>] entry_SYSCALL_64_fastpath+0x23/0xc1 >> >> arch/x86/entry/entry_64.S:208 >> >> >> >> >> >> >> >> I am on 0f98f121e1670eaa2a2fbb675e07d6ba7f0e146f of linux-next with >> >> the following diff in sound/core/timer.c (one is a fix for another >> >> division by 0): >> >> >> >> diff --git a/sound/core/timer.c b/sound/core/timer.c >> >> index 9a6157e..3e55c6d 100644 >> >> --- a/sound/core/timer.c >> >> +++ b/sound/core/timer.c >> >> @@ -813,6 +813,7 @@ int snd_timer_new(struct snd_card *card, char *id, >> >> struct snd_timer_id *tid, >> >> timer->tmr_subdevice = tid->subdevice; >> >> if (id) >> >> strlcpy(timer->id, id, sizeof(timer->id)); >> >> + timer->sticks = 1; >> >> INIT_LIST_HEAD(&timer->device_list); >> >> INIT_LIST_HEAD(&timer->open_list_head); >> >> INIT_LIST_HEAD(&timer->active_list_head); >> >> @@ -1958,6 +1959,7 @@ static ssize_t snd_timer_user_read(struct file >> >> *file, char __user *buffer, >> >> tu->qused--; >> >> spin_unlock_irq(&tu->qlock); >> >> >> >> + mutex_lock(&tu->ioctl_lock); >> >> if (tu->tread) { >> >> if (copy_to_user(buffer, &tu->tqueue[qhead], >> >> sizeof(struct snd_timer_tread))) >> >> @@ -1967,6 +1969,7 @@ static ssize_t snd_timer_user_read(struct file >> >> *file, char __user *buffer, >> >> sizeof(struct snd_timer_read))) >> >> err = -EFAULT; >> >> } >> >> + mutex_unlock(&tu->ioctl_lock); >> >> >> >> spin_lock_irq(&tu->qlock); >> >> if (err < 0) >> >> >> > >> > I couldn't trigger the bug any longer after merging for-linus branch >> > of sound.git tree: >> > git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git >> > >> > But I thought it got fixed by sticks=1 in the above. Maybe it's a >> > combination, as it contains yet another timer fix by Vegard. >> >> >> You mean that my local changes are a combination of two patches? But I >> would not expect that addition of mutex_lock/unlock can affect >> division by 0. >> Or you mean that there some other fix by Vegard? What is it? > > There is another fix > 8ddc05638ee42b18ba4fe99b5fb647fa3ad20456 > ALSA: timer: fix NULL pointer dereference on memory allocation failure > But it's likely irrelevant. > > I retested now, and could see the latter stack trace, the rcu tall, > while running the program, although it doesn't lead to a complete > freeze and the system is restored once after the program quits. Are you sure it get back to normal? RCU stall is printed when a CPU does not respond to timer interrupts for 20s. Try to do 'echo -n l > /proc/sysrq-trigger'. > Do you really see the zero-division error with sticks=1 fix? That's > unexpected. > > > Takashi
On Tue, 06 Sep 2016 16:06:08 +0200, Dmitry Vyukov wrote: > > > Do you really see the zero-division error with sticks=1 fix? That's > > unexpected. > > Yes, just reproduced it again. > I am on 0693c28cfc8e25f18c25b65a8942c026f1854a3c of linux-next now, so > I already have: > commit 6b760bb2c63a9e322c0e4a0b5daf335ad93d5a33 > ALSA: timer: fix division by zero after SNDRV_TIMER_IOCTL_CONTINUE > > Compiled the program verbatim and run it using stress utility (http > s://godoc.org/golang.org/x/tools/cmd/stress namely just runs the > program in parallel loop) and instantly get the following crash. OK, I'll take a deeper look. thanks, Takashi
diff --git a/sound/core/timer.c b/sound/core/timer.c index 9a6157e..3e55c6d 100644 --- a/sound/core/timer.c +++ b/sound/core/timer.c @@ -813,6 +813,7 @@ int snd_timer_new(struct snd_card *card, char *id, struct snd_timer_id *tid, timer->tmr_subdevice = tid->subdevice; if (id) strlcpy(timer->id, id, sizeof(timer->id)); + timer->sticks = 1; INIT_LIST_HEAD(&timer->device_list); INIT_LIST_HEAD(&timer->open_list_head);