Message ID | 69b33ebdc9427b5c036b4ba09151ae88c14a30f4.1389090437.git.nicolas.ferre@atmel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Tue, Jan 07, 2014 at 11:45:08AM +0100, Nicolas Ferre wrote: > From: Marek Roszko <mark.roszko@gmail.com> > > Something asks a tasklet to be scheduled when the uart port is closed. What is that something? Shouldn't you track that down and find the real problem here? > Need to supress the kernel panic for now by checking if the port is NULL or > not. > > Signed-off-by: Marek Roszko <mark.roszko@gmail.com> > Acked-by: Leilei Zhao <leilei.zhao@atmel.com> > Cc: <stable@vger.kernel.org> # v3.12 > Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com> > --- > drivers/tty/serial/atmel_serial.c | 4 ++++ > 1 file changed, 4 insertions(+) > > diff --git a/drivers/tty/serial/atmel_serial.c b/drivers/tty/serial/atmel_serial.c > index c421d11b3d4c..6e68486c83cb 100644 > --- a/drivers/tty/serial/atmel_serial.c > +++ b/drivers/tty/serial/atmel_serial.c > @@ -1360,6 +1360,10 @@ static void atmel_tasklet_func(unsigned long data) > unsigned int status; > unsigned int status_change; > > + if(!port->state || !port->state->port.tty) > + /* uart has been closed */ > + return; Did you really test this? How about running it through checkpatch? thanks, greg k-h
A: No. Q: Should I include quotations after my reply? http://daringfireball.net/2007/07/on_top On Tue, Jan 07, 2014 at 08:52:34PM -0500, Mark Roszko wrote: > This patch I was somewhat hesitant to pushing to Atmel when I did the other two > patches. Then why did you? :) > The main issue is the use of systemd causes the serial driver to somehow get > out of sync on startups randomly. i.e. on one bootup it's fine, and on other > it'll kernel panic. > It occurs because systemd causes the startup and shutdown driver ops to be > fired in rapid succession. How does systemd cause this? Is this when the serial port is being used as a console or something else? > Every single service start message causes the _startup callback first, then the > message prints and _shutdown callbacks fires. So something is opening and closing the port quickly? That should be easy to test without systemd. Any console involved? > And a kernel panic always occurs somewhere during the service status output, > never before when it's just the kernel startup and never after once systemd > finishes and getty for example takes over. > > The stacktrace looked like this: > Unable to handle kernel NULL pointer dereference at virtual address 00000088 > pgd = c0004000 > [00000088] *pgd=00000000 > Internal error: Oops: 17 [#1] ARM > Modules linked in: autofs4 > CPU: 0 Not tainted (3.6.9-yocto-standard #1) 3.6.9 is _very_ old, loads of things have happened in the tty layer since then, can you duplicate this on 3.12 or 3.13-rc? > PC is at tty_wakeup+0x8/0x58 > LR is at atmel_tasklet_func+0x238/0x80c > pc : [<c01efd84>] lr : [<c0208fc0>] psr: a00f0013 > sp : df84ff28 ip : 00000000 fp : 00000100 > r10: c05d1ec0 r9 : 04208040 r8 : c05d1e80 > r7 : 0000000a r6 : 00000000 r5 : dedf7c00 r4 : 00000000 > r3 : dedf7c00 r2 : 00000000 r1 : 600f0013 r0 : 00000000 > Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment kernel > Control: 10c53c7d Table: 3fb0c059 DAC: 00000015 > Process ksoftirqd/0 (pid: 3, stack limit = 0xdf84e2f0) > Stack: (0xdf84ff28 to 0xdf850000) > ff20: c05e4378 dedf7c00 00000000 c0208fc0 c05aa458 df8496b0 > ff40: df849680 c05aa458 df8496b0 c00394a8 c0039420 c05a8d04 00000000 00000000 > ff60: 0000000a c05d1e80 04208040 c05d1ec0 00000100 c001fd34 00000009 00000018 > ff80: df84e000 c001f60c df84ffbc c03f8e60 00000013 00000006 00000000 c05d1e80 > ffa0: df84e000 00000000 00000013 00000000 00000000 00000000 00000000 c001f728 > ffc0: df84bf3c 00000000 c001f6c0 c0030870 00000000 00000000 00000000 00000000 > ffe0: df84ffe0 df84ffe0 df84bf3c c00307f0 c000e348 c000e348 fff73fbf 3fbeffff > [<c01efd84>] (tty_wakeup+0x8/0x58) from [<c0208fc0>] (atmel_tasklet_func+0x238/ > 0x80c) > [<c0208fc0>] (atmel_tasklet_func+0x238/0x80c) from [<c001fd34>] > (tasklet_action+0x70/0xa8) > [<c001fd34>] (tasklet_action+0x70/0xa8) from [<c001f60c>] (__do_softirq+0x90/ > 0x144) > [<c001f60c>] (__do_softirq+0x90/0x144) from [<c001f728>] (run_ksoftirqd+0x68/ > 0x104) > [<c001f728>] (run_ksoftirqd+0x68/0x104) from [<c0030870>] (kthread+0x80/0x90) > [<c0030870>] (kthread+0x80/0x90) from [<c000e348>] (kernel_thread_exit+0x0/0x8) > Code: e8bd8070 c05ac0b8 e92d4070 e1a04000 (e5903088) > ---[ end trace 15b8a9aeaf7b457f ]--- > > > Testing wise I originally used a BUG_ON statement in atmel_tasklet_func to > panic before the null deference hit. BUG_ON confirmed that tty was NULL at the > very start of the tasklet being called. And did you test that this patch actually fixed it? > The atmel_shutdown function should be killing the tasklet (after patch "Handle > shutdown more safely") and does disable interrupts so I've been at a loss at > where the race condition was occurring that a tasklet could escape beyond > shutdown. Are you sure you aren't just racing with shutdown? Need a lock somewhere? greg k-h
diff --git a/drivers/tty/serial/atmel_serial.c b/drivers/tty/serial/atmel_serial.c index c421d11b3d4c..6e68486c83cb 100644 --- a/drivers/tty/serial/atmel_serial.c +++ b/drivers/tty/serial/atmel_serial.c @@ -1360,6 +1360,10 @@ static void atmel_tasklet_func(unsigned long data) unsigned int status; unsigned int status_change; + if(!port->state || !port->state->port.tty) + /* uart has been closed */ + return; + /* The interrupt handler does not take the lock */ spin_lock(&port->lock);