Message ID | 5a7b8b7b0904061453m725eee67m4b74c73bbffeb994@mail.gmail.com (mailing list archive) |
---|---|
State | Awaiting Upstream, archived |
On Monday 06 April 2009, Hugo Vincent wrote:
>
> Here are some of the crashes I've seen. Warning - I'm new to -rt and
> to the linux-omap tree, so I'll apologize in advance if these are just
> a result of me missing something obvious.

Thanks, I'll take a look.

> Context: using USB-gadget ethernet (g_ether) over musb (configured as
> just peripheral, not OTG). Most of the time it works, but after a
> while, or when it encounters a large packet (e.g. fping -b30000) it
> crashes like so (apparently something to do with receive DMA usage).
> Although the fping -b case is kind of pathological, I've seen the same
> crash when ssh/scping files across to the board, so it does occur in
> real usage.

RX DMA is troublesome with the MUSB code, and there are some bugfixes
pending which should affect it. (Posted to linux-usb over the last
week or two.)

Do these problems show up with DMA disabled?

- Dave

--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Apr 7, 2009 at 10:20 AM, David Brownell <david-b@pacbell.net> wrote:
> On Monday 06 April 2009, Hugo Vincent wrote:
>>
>> Here are some of the crashes I've seen. Warning - I'm new to -rt and
>> to the linux-omap tree, so I'll apologize in advance if these are just
>> a result of me missing something obvious.
>
> Thanks, I'll take a look.

Excellent, thank you!

>> Context: using USB-gadget ethernet (g_ether) over musb (configured as
>> just peripheral, not OTG). Most of the time it works, but after a
>> while, or when it encounters a large packet (e.g. fping -b30000) it
>> crashes like so (apparently something to do with receive DMA usage).
>> Although the fping -b case is kind of pathological, I've seen the same
>> crash when ssh/scping files across to the board, so it does occur in
>> real usage.
>
> RX DMA is troublesome with the MUSB code, and there are some bugfixes
> pending which should affect it. (Posted to linux-usb over the last
> week or two.)
>
> Do these problems show up with DMA disabled?

I guess you mean with CONFIG_MUSB_PIO_ONLY - yes, I've tried that. I
don't observe the Rx DMA problems as described previously in that
configuration, no ........

However, I instead see hard-lockups with
no obvious cause and no messages printed to the console. (The only
repeatable one I've found so far is exiting cyclictest with ^C. Not
good). These lockups could well be and probably are unrelated, but
nonetheless, it's very troubling!

I've uploaded my defconfig here:
http://hugovincent.com/files/lkml-20090407/
along with a log of the bootup process. I've enabled a bunch of debug
options and self-tests. You can see the rather verbose output of some
spinlock tests (I think) failing early in the bootup process,
somewhere in timer initialization (I think?).

Regards,
Hugo
> > RX DMA is troublesome with the MUSB code, and there are some bugfixes
> > pending which should affect it. (Posted to linux-usb over the last
> > week or two.)
> >
> > Do these problems show up with DMA disabled?
>
> I guess you mean with CONFIG_MUSB_PIO_ONLY - yes, I've tried that. I
> don't observe the Rx DMA problems as described previously in that
> configuration, no ........

Good...

> However, I instead see hard-lockups with
> no obvious cause and no messages printed to the console. (The only
> repeatable one I've found so far is exiting cyclictest with ^C. Not
> good). These lockups could well be and probably are unrelated, but
> nonetheless, it's very troubling!

Given the bootlog excerpt you posted, I'm thinking there are
still some basic goofy things with the RT patches you're using
even outside the scope of MUSB. Basic as in timer tick and
kernel thread setup; lots of things look goofy.

> I've uploaded my defconfig here:
> http://hugovincent.com/files/lkml-20090407/
> along with a log of the bootup process. I've enabled a bunch of debug
> options and self-tests. You can see the rather verbose output of some
> spinlock tests (I think) failing early in the bootup process,
> somewhere in timer initialization (I think?).

Those look like lockdep things. Get a more complete boot log
and maybe someone will be able to sort out what's up.

- Dave

>
> Regards,
> Hugo
On Tue, Apr 7, 2009 at 2:27 PM, David Brownell <david-b@pacbell.net> wrote:
>
>> > RX DMA is troublesome with the MUSB code, and there are some bugfixes
>> > pending which should affect it. (Posted to linux-usb over the last
>> > week or two.)
>> >
>> > Do these problems show up with DMA disabled?
>>
>> I guess you mean with CONFIG_MUSB_PIO_ONLY - yes, I've tried that. I
>> don't observe the Rx DMA problems as described previously in that
>> configuration, no ........
>
> Good...

So based on what you've seen, do you think the queued patches for MUSB
you mentioned will fix these problems?

>> However, I instead see hard-lockups with
>> no obvious cause and no messages printed to the console. (The only
>> repeatable one I've found so far is exiting cyclictest with ^C. Not
>> good). These lockups could well be and probably are unrelated, but
>> nonetheless, it's very troubling!
>
> Given the bootlog excerpt you posted, I'm thinking there are
> still some basic goofy things with the RT patches you're using
> even outside the scope of MUSB. Basic as in timer tick and
> kernel thread setup; lots of things look goofy.
>

Here is a complete boot log + config:
http://hugovincent.com/files/lkml-20090407/boot2.log

Who is working on OMAP3 -rt? Does anyone have any tips for tracking
down the apparent basic problems David mentioned (timer tick, kernel
thread setup)?

Thanks,
Hugo
On Monday 06 April 2009, Hugo Vincent wrote:
> Here is a complete boot log + config:
> http://hugovincent.com/files/lkml-20090407/boot2.log

Erm, not very complete actually. Enable DEBUG_LL to see
more early messages ... like the ones starting right
after the kernel decompression messages.

Also, re those udev-induced messages:

Remounting root file system...
uncorrectable error : <3>end_request: I/O error, dev mtdblock0, sector 0
Buffer I/O error on device mtdblock0, logical block 0
uncorrectable error : <3>uncorrectable error : <3>end_request: I/O error, dev mtdblock0, sector 8
Buffer I/O error on device mtdblock0, logical block 1
end_request: I/O error, dev mtdblock0, sector 16
Buffer I/O error on device mtdblock0, logical block 2
uncorrectable error : <3>end_request: I/O error, dev mtdblock0, sector 24
Buffer I/O error on device mtdblock0, logical block 3
uncorrectable error : <3>end_request: I/O error, dev mtdblock0, sector 0
Buffer I/O error on device mtdblock0, logical block 0

You shouldn't need mtdblock unless you run JFFS2, so that's
the quick way to get rid of them: take that out of your
kernel configuration. Else, add "mtdblock*" to
the "KERNEL==... ; goto persistent_storage_end" check in

  /etc/udev/rules.d/60-persistent-storage.rules

I understand the next version of udev will fix that.

- Dave
On Tue, Apr 7, 2009 at 3:36 PM, David Brownell <david-b@pacbell.net> wrote:
> On Monday 06 April 2009, Hugo Vincent wrote:
>> Here is a complete boot log + config:
>> http://hugovincent.com/files/lkml-20090407/boot2.log
>
> Erm, not very complete actually. Enable DEBUG_LL to see
> more early messages ... like the ones starting right
> after the kernel decompression messages.

How's this?
http://hugovincent.com/files/lkml-20090407/boot3.log

(In case it isn't clear, we get messages twice because console and
debug_LL are on the same serial port. I also had to increase
CONFIG_LOG_BUF_SHIFT to get a complete log in dmesg.)

> Also, re those udev-induced messages:
>
> Remounting root file system...
> uncorrectable error : <3>end_request: I/O error, dev mtdblock0, sector 0
> Buffer I/O error on device mtdblock0, logical block 0
> uncorrectable error : <3>uncorrectable error : <3>end_request: I/O error, dev mtdblock0, sector 8
> Buffer I/O error on device mtdblock0, logical block 1
> end_request: I/O error, dev mtdblock0, sector 16
> Buffer I/O error on device mtdblock0, logical block 2
> uncorrectable error : <3>end_request: I/O error, dev mtdblock0, sector 24
> Buffer I/O error on device mtdblock0, logical block 3
> uncorrectable error : <3>end_request: I/O error, dev mtdblock0, sector 0
> Buffer I/O error on device mtdblock0, logical block 0
>
> You shouldn't need mtdblock unless you run JFFS2, so that's
> the quick way to get rid of them: take that out of your
> kernel configuration. Else, add "mtdblock*" to
> the "KERNEL==... ; goto persistent_storage_end" check in
>
>   /etc/udev/rules.d/60-persistent-storage.rules
>
> I understand the next version of udev will fix that.

Thanks for that. Unfortunately I am using JFFS2 for now - perhaps I
should switch to UBIFS.

-- Hugo
On Tue, Apr 7, 2009 at 4:19 PM, Hugo Vincent <hugo.vincent@gmail.com> wrote:
> On Tue, Apr 7, 2009 at 3:36 PM, David Brownell <david-b@pacbell.net> wrote:
>> On Monday 06 April 2009, Hugo Vincent wrote:
>>> Here is a complete boot log + config:
>>> http://hugovincent.com/files/lkml-20090407/boot2.log
>>
>> Erm, not very complete actually. Enable DEBUG_LL to see
>> more early messages ... like the ones starting right
>> after the kernel decompression messages.
>
> How's this?
> http://hugovincent.com/files/lkml-20090407/boot3.log
>
> (In case it isn't clear, we get messages twice because console and
> debug_LL are on the same serial port. I also had to increase
> CONFIG_LOG_BUF_SHIFT to get a complete log in dmesg.)

Can anyone give me any pointers on where to start for fixing the
problems shown in the above boot log? It looks like some fairly low
level locking bugs (spinlock vs raw_spinlock maybe?) in twl4030 IRQ
handling and GP timer/clock event source setup.

Hugo
diff --git a/drivers/serial/8250.c b/drivers/serial/8250.c
index 0d4f4c6..122b694 100644
--- a/drivers/serial/8250.c
+++ b/drivers/serial/8250.c
@@ -1073,6 +1073,7 @@ static void autoconfig(struct uart_8250_port *up, unsigned int probeflags)
 	unsigned char status1, scratch, scratch2, scratch3;
 	unsigned char save_lcr, save_mcr;
 	unsigned long flags;
+	DEFINE_RAW_SPINLOCK(raw_lock);
 
 	if (!up->port.iobase && !up->port.mapbase && !up->port.membase)
 		return;
@@ -1085,6 +1086,7 @@ static void autoconfig(struct uart_8250_port *up, unsigned int probeflags)
 	 * be frobbing the chips IRQ enable register to see if it exists.
 	 */
 	spin_lock_irqsave(&up->port.lock, flags);
+	spin_lock_irqsave(&raw_lock, flags);
 
 	up->capabilities = 0;
 	up->bugs = 0;
@@ -1240,6 +1242,7 @@ static void autoconfig(struct uart_8250_port *up, unsigned int probeflags)
 	serial_outp(up, UART_IER, 0);
 
  out:
+	spin_unlock_irqrestore(&raw_lock, flags);
 	spin_unlock_irqrestore(&up->port.lock, flags);
 	DEBUG_AUTOCONF("type=%s\n", uart_config[up->port.type].name);
 }