
Preempt-RT on OMAP3?

Message ID 5a7b8b7b0904061453m725eee67m4b74c73bbffeb994@mail.gmail.com (mailing list archive)
State Awaiting Upstream, archived

Commit Message

Hugo Vincent April 6, 2009, 9:53 p.m. UTC
Here are some of the crashes I've seen. Warning - I'm new to -rt and
to the linux-omap tree, so I'll apologize in advance if these are just
a result of me missing something obvious.

Context: using USB-gadget ethernet (g_ether) over musb (configured as
just peripheral, not OTG). Most of the time it works, but after a
while, or when it encounters a large packet (e.g. fping -b30000), it
crashes like so (apparently something to do with receive DMA usage).
Although the fping -b case is kind of pathological, I've seen the same
crash when ssh/scp'ing files across to the board, so it does occur in
real usage.


Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = c0004000
[00000000] *pgd=00000000
Internal error: Oops: 817 [#1] PREEMPT
Modules linked in: g_ether ftdi_sio usbserial ipv6
CPU: 0    Not tainted  (2.6.29-rt1-omap1-g7648048 #1)
PC is at dma_channel_program+0x90/0x108
LR is at rxstate+0xc8/0x1b4
pc : [<c024c7c0>]    lr : [<c024b760>]    psr: 60000013
sp : cf93be10  ip : cf93be58  fp : cf93be54
r10: 00000000  r9 : 00000000  r8 : 8fa1d002
r7 : 00000200  r6 : cf98b668  r5 : cf98b668  r4 : cf98b600
r3 : 00000000  r2 : 00000000  r1 : 00000000  r0 : cf98b668
Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
Control: 10c5387d  Table: 8fa90019  DAC: 00000017
Process IRQ-93 (pid: 191, stack limit = 0xcf93a2e8)
Stack: (0xcf93be10 to 0xcf93c000)
be00:                                     bf062078 c024b8f4 00000000 cf05f3e0
be20: cf85f8e0 cf85f8e0 00000000 cf98b600 cf85f5e0 cf83b000 00000000 00002003
be40: d80ab110 00000001 cf93be94 cf93be58 c024b760 c024c73c 00000000 c02f8d18
be60: cf83b000 60000113 cf93a000 00002003 cf85f5e0 cf83b000 00000003 cf98b668
be80: d80ab110 00000001 cf93bedc cf93be98 c024bfc0 c024b6a4 00000003 00000002
bea0: cf93a000 cf94537c cf83b19c cf83b1d8 c004d80c cf98b650 00000002 00000001
bec0: 8fa1dc02 d80ab000 cf83b000 cf98b600 cf93beec cf93bee0 c02489e4 c024bc9c
bee0: cf93bf3c cf93bef0 c024c988 c024899c 00000002 cf820000 ffffffff c040caa4
bf00: af2001b4 cf94537c d80ab060 00000004 c006b43c c03ebbc8 cf93a000 cf8c09e0
bf20: 0000005d 00000000 0000005d 00000000 cf93bf74 cf93bf40 c007cef0 c024c844
bf40: cfa29e40 00000000 cf93bf84 c03ebbc8 cf93a000 0000005d 0000005d cf8c09e0
bf60: c03ebc20 c04186a4 cf93bf9c cf93bf78 c007d33c c007ce30 c03ebbc8 cf93a000
bf80: 0000005d 00000000 60000113 c03ebc08 cf93bfd4 cf93bfa0 c007d45c c007d2dc
bfa0: 00000000 00000032 00000000 cf93a000 c03ebbc8 c007d390 00000000 00000000
bfc0: 00000000 00000000 cf93bff4 cf93bfd8 c0065b54 c007d39c 00000000 00000000
bfe0: 00000000 00000000 00000000 cf93bff8 c0053b28 c0065b04 017bee08 08f7ef39
Backtrace:
[<c024c730>] (dma_channel_program+0x0/0x108) from [<c024b760>]
(rxstate+0xc8/0x1b4)
[<c024b698>] (rxstate+0x0/0x1b4) from [<c024bfc0>] (musb_g_rx+0x330/0x3ac)
[<c024bc90>] (musb_g_rx+0x0/0x3ac) from [<c02489e4>]
(musb_dma_completion+0x54/0x58)
[<c0248990>] (musb_dma_completion+0x0/0x58) from [<c024c988>]
(dma_controller_irq+0x150/0x18c)
[<c024c838>] (dma_controller_irq+0x0/0x18c) from [<c007cef0>]
(handle_IRQ_event+0xcc/0x1d8)
[<c007ce24>] (handle_IRQ_event+0x0/0x1d8) from [<c007d33c>]
(thread_simple_irq+0x6c/0xc0)
[<c007d2d0>] (thread_simple_irq+0x0/0xc0) from [<c007d45c>] (do_irqd+0xcc/0x334)
[<c007d390>] (do_irqd+0x0/0x334) from [<c0065b54>] (kthread+0x5c/0x94)
[<c0065af8>] (kthread+0x0/0x94) from [<c0053b28>] (do_exit+0x0/0x73c)
 r6:00000000 r5:00000000 r4:00000000
Code: 13a03000 03a03001 1a000002 e3a03000 (e5833000)
---[ end trace af756bf803843539 ]---
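
A note for reading the trace above: the bracketed word on the "Code:"
line, e5833000, is the instruction that faulted. The following is a
minimal sketch, not kernel code, that decodes such a word under the
assumption that it uses the classic ARM immediate-offset LDR/STR
encoding. The struct and function names are invented for illustration.
For e5833000 it reports a store of r3 to [r3, #0]; with r3 = 00000000
from the register dump that is a write to address 0, which matches the
reported fault address.

```c
#include <stdbool.h>
#include <stdint.h>

/* Fields of an ARM (A32) immediate-offset LDR/STR instruction. */
struct arm_ldst {
    bool valid;      /* false if the word is not an immediate LDR/STR */
    bool load;       /* L bit: true = LDR, false = STR */
    bool byte;       /* B bit: true = byte access, false = word access */
    bool pre_index;  /* P bit: offset applied before the access */
    bool add;        /* U bit: true = base + offset, false = base - offset */
    bool writeback;  /* W bit: base register updated */
    unsigned rn;     /* base register number */
    unsigned rd;     /* transferred register number */
    uint32_t offset; /* 12-bit immediate offset */
};

/* Decode one instruction word; d.valid is false for anything else. */
struct arm_ldst decode_arm_ldst(uint32_t insn)
{
    struct arm_ldst d = { 0 };

    /* Condition 1111 is a separate encoding space; bits 27:25 must be 010. */
    if ((insn >> 28) == 0xFu || ((insn >> 25) & 0x7u) != 0x2u)
        return d;

    d.valid     = true;
    d.pre_index = (insn >> 24) & 1u;
    d.add       = (insn >> 23) & 1u;
    d.byte      = (insn >> 22) & 1u;
    d.writeback = (insn >> 21) & 1u;
    d.load      = (insn >> 20) & 1u;
    d.rn        = (insn >> 16) & 0xFu;
    d.rd        = (insn >> 12) & 0xFu;
    d.offset    = insn & 0xFFFu;
    return d;
}

/* Address touched by the access, given the value held in the base register. */
uint32_t arm_ldst_address(const struct arm_ldst *d, uint32_t rn_value)
{
    if (!d->pre_index)
        return rn_value; /* post-indexed: the access uses the unmodified base */
    return d->add ? rn_value + d->offset : rn_value - d->offset;
}
```

The word just before it on the "Code:" line, e3a03000, decodes as
"mov r3, #0", so the store of zero to address zero looks deliberate. It
may be a BUG()-style trap rather than a stray pointer, but nothing in
this thread confirms that.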


After the above crash has been seen, reboot/shutdown or rmmod crashes like this:


Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = cf054000
[00000000] *pgd=8f322031, *pte=00000000, *ppte=00000000
Internal error: Oops: 17 [#2] PREEMPT
Modules linked in: ftdi_sio usbserial ipv6
CPU: 0    Tainted: G      D     (2.6.29-rt1-omap1-g8dde6bd-dirty #1)
PC is at plist_add+0x5c/0xb0
LR is at task_blocks_on_rt_mutex+0x144/0x204
pc : [<c01bd098>]    lr : [<c0074690>]    psr: 60000093
sp : cfb6dd58  ip : cfb6dd70  fp : cfb6dd6c
r10: cfb6ddbc  r9 : 60000013  r8 : cfb6ddb0
r7 : cf1d0d60  r6 : cfb6ddb0  r5 : cf1d111c  r4 : cfb6ddc4
r3 : fffffffc  r2 : cfb6ddd0  r1 : cf1d111c  r0 : fffffffc
Flags: nZCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
Control: 10c5387d  Table: 8f054019  DAC: 00000015
Process reboot (pid: 1335, stack limit = 0xcfb6c2e8)
Stack: (0xcfb6dd58 to 0xcfb6e000)
dd40:                                                       cfb6c000 cf1d111c
dd60: cfb6ddac cfb6dd70 c0074690 c01bd048 00000001 c040e320 00000000 cf83b0d0
dd80: cfa144e0 cf83b0d0 60000013 cfb6c000 0000006c 00000000 00000000 00000000
dda0: cfb6de0c cfb6ddb0 c02f55ac c0074558 00000078 cf83b0d0 cf83b0d0 cf83b0d8
ddc0: cf83b0d8 00000078 cfb6ddc8 cfb6ddc8 cfb6ddd0 cfb6ddd0 cfaba620 cf83b0d0
dde0: 0000006c cf83b0d0 00000000 fee1dead 0000006c c002d0c4 cfb6c000 00000000
de00: cfb6de1c cfb6de10 c02f60f4 c02f54d4 cfb6de34 cfb6de20 c0241964 c02f60c0
de20: c03e1148 c041eaa8 cfb6de44 cfb6de38 c01ee7a0 c0241950 cfb6de5c cfb6de48
de40: c01e9f0c c01ee78c 00000000 28121969 cfb6de6c cfb6de60 c00604dc c01e9eb8
de60: cfb6de84 cfb6de70 c0060530 c00604bc 01234567 01234567 cfb6dfa4 cfb6de88
de80: c00606ec c0060524 cfb6de84 00000000 c00bfa7c cf525e30 cf525e30 c03f0c2c
dea0: 00200200 00100100 cfb6decc cfb6deb8 c00bbb7c c00a7dc4 cfb6dee4 cf525e30
dec0: cfb6dee4 cfb6ded0 c00bbbcc c00bbb30 cf525e30 cf401548 cfb6defc cfb6dee8
dee0: c00bbc3c cf80e688 cf80e620 cf4078a0 cfb6df24 cfb6df00 c00c2f48 c0074a04
df00: c007ff60 cf1aea40 00000008 cf4078a0 cf525e30 cf80e620 cfb6df54 cfb6df28
df20: c00ad63c c00c2f28 00000000 c02f5420 cfb6df5c cf1aea40 00000000 cf881b00
df40: cf1aea40 c002d0c4 cfb6df64 cfb6df58 c00ad67c c00ad480 cfb6df84 cfb6df68
df60: c00aa328 c00ad658 cfb6df84 00000003 cf881b00 cf881b24 cfb6dfa4 cfb6df88
df80: c00aa3dc c00aa2b8 00000000 00000001 00000004 00000058 00000000 cfb6dfa8
dfa0: c002cf40 c0060588 00000000 00000001 fee1dead 28121969 01234567 0000006c
dfc0: 00000000 00000001 00000004 00000058 00000001 00000001 00000000 00000001
dfe0: 400d8620 befd2cc0 00009210 400d8638 60000010 fee1dead cfb6dff4 00000000
Backtrace:
[<c01bd03c>] (plist_add+0x0/0xb0) from [<c0074690>]
(task_blocks_on_rt_mutex+0x144/0x204)
 r5:cf1d111c r4:cfb6c000
[<c007454c>] (task_blocks_on_rt_mutex+0x0/0x204) from [<c02f55ac>]
(rt_spin_lock_slowlock+0xe4/0x27c)
[<c02f54c8>] (rt_spin_lock_slowlock+0x0/0x27c) from [<c02f60f4>]
(rt_spin_lock+0x40/0x44)
[<c02f60b4>] (rt_spin_lock+0x0/0x44) from [<c0241964>] (musb_shutdown+0x20/0x68)
[<c0241944>] (musb_shutdown+0x0/0x68) from [<c01ee7a0>]
(platform_drv_shutdown+0x20/0x24)
 r5:c041eaa8 r4:c03e1148
[<c01ee780>] (platform_drv_shutdown+0x0/0x24) from [<c01e9f0c>]
(device_shutdown+0x60/0xb8)
[<c01e9eac>] (device_shutdown+0x0/0xb8) from [<c00604dc>]
(kernel_restart_prepare+0x2c/0x3c)
 r5:28121969 r4:00000000
[<c00604b0>] (kernel_restart_prepare+0x0/0x3c) from [<c0060530>]
(kernel_restart+0x18/0x4c)
[<c0060518>] (kernel_restart+0x0/0x4c) from [<c00606ec>]
(sys_reboot+0x170/0x1dc)
 r4:01234567
[<c006057c>] (sys_reboot+0x0/0x1dc) from [<c002cf40>]
(ret_fast_syscall+0x0/0x2c)
 r7:00000058 r6:00000004 r5:00000001 r4:00000000
Code: ba000007 e1a00001 0a00000c e1a03000 (e5b32004)
---[ end trace d7d34d233af69b06 ]---
note: reboot[1335] exited with preempt_count 2
Segmentation fault

Finally, there is this crash (when musb is configured as OTG). When
you try to write to several of the sysfs entries, e.g.
/sys/devices/platform/musb_hdrc/mode, the driver crashes. Note that
this one doesn't appear to be limited to just -rt.

root@overo-grc:/sys/devices/platform/musb_hdrc# echo host > mode
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = cfbe4000
[00000000] *pgd=8fbaf031, *pte=00000000, *ppte=00000000
Internal error: Oops: 0 [#1] PREEMPT
Modules linked in: ftdi_sio usbserial ipv6
CPU: 0    Not tainted  (2.6.29-rt1-omap1-g8dde6bd-dirty #1)
PC is at 0x0
LR is at musb_platform_set_mode+0x50/0x70
pc : [<00000000>]    lr : [<c0243064>]    psr: 60000013
sp : cf0e3ed8  ip : cf83b0d0  fp : cf0e3ee4
r10: cf0e3f70  r9 : cfbf9238  r8 : c03ff1e8
r7 : c03e15ac  r6 : 00000005  r5 : cf83b0d0  r4 : cfaa7000
r3 : 00000081  r2 : d80ab000  r1 : cf83b000  r0 : cf83b1c4
Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
Control: 10c5387d  Table: 8fbe4019  DAC: 00000015
Process sh (pid: 1233, stack limit = 0xcf0e22e8)
Stack: (0xcf0e3ed8 to 0xcf0e4000)
3ec0:                                                       cf0e3f04 cf0e3ee8
3ee0: c0241ce0 c0243020 cf07bc00 00000005 cf92ef08 cfbf9220 cf0e3f14 cf0e3f08
3f00: c01e9dfc c0241c68 cf0e3f44 cf0e3f18 c00ef7bc c01e9de4 00000000 cf07bc00
3f20: 4001e000 cf0e3f70 00000005 00000005 cf0e2000 00000001 cf0e3f6c cf0e3f48
3f40: c00aca90 c00ef6b8 cf07bc00 00000000 00000000 00000000 cf07bc00 4001e000
3f60: cf0e3fa4 cf0e3f70 c00acbe4 c00ac9e8 00000000 00000000 cf0e3fa4 00000000
3f80: c00b8fe8 00000005 4001e000 401ac600 00000004 c002d0c4 00000000 cf0e3fa8
3fa0: c002cf40 c00acbac 00000005 4001e000 00000001 4001e000 00000005 00000000
3fc0: 00000005 4001e000 401ac600 00000004 00000005 000a93f0 00000001 000a9008
3fe0: 00000000 bea345e8 400f578c 4014299c 60000010 00000001 00000000 00000000
Backtrace:
[<c0243014>] (musb_platform_set_mode+0x0/0x70) from [<c0241ce0>]
(musb_mode_store+0x84/0xac)
[<c0241c5c>] (musb_mode_store+0x0/0xac) from [<c01e9dfc>]
(dev_attr_store+0x24/0x28)
 r6:cfbf9220 r5:cf92ef08 r4:00000005
[<c01e9dd8>] (dev_attr_store+0x0/0x28) from [<c00ef7bc>]
(sysfs_write_file+0x110/0x144)
[<c00ef6ac>] (sysfs_write_file+0x0/0x144) from [<c00aca90>]
(vfs_write+0xb4/0x144)
[<c00ac9dc>] (vfs_write+0x0/0x144) from [<c00acbe4>] (sys_write+0x44/0x70)
 r7:4001e000 r6:cf07bc00 r5:00000000 r4:00000000
[<c00acba0>] (sys_write+0x0/0x70) from [<c002cf40>] (ret_fast_syscall+0x0/0x2c)
 r8:c002d0c4 r7:00000004 r6:401ac600 r5:4001e000 r4:00000005
Code: bad PC value.
---[ end trace d7d34d233af69b05 ]---
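
In the trace above, "PC is at 0x0" with LR inside
musb_platform_set_mode is what a call through a NULL function pointer
typically looks like: the CPU branches to address 0 and the return
address is left in LR. A common defensive pattern is to check an
optional callback before invoking it. The sketch below assumes a
hypothetical ops table; xceiv_ops, set_mode_checked and record_mode are
invented names that do not come from the musb driver, and the thread
does not confirm which pointer was NULL here.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical table of optional callbacks (not the real musb structure). */
struct xceiv_ops {
    int (*set_mode)(void *ctx, int mode);
};

/* Call the callback only if it is present; otherwise report an error. */
int set_mode_checked(const struct xceiv_ops *ops, void *ctx, int mode)
{
    if (ops == NULL || ops->set_mode == NULL)
        return -ENODEV; /* nothing registered: fail instead of jumping to 0 */
    return ops->set_mode(ctx, mode);
}

/* Example callback: records the requested mode in the int that ctx points to. */
int record_mode(void *ctx, int mode)
{
    *(int *)ctx = mode;
    return 0;
}
```

With an empty table the checked call returns -ENODEV to the caller, so
a sysfs write would fail with an error code rather than oops.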


I'm building git 90e758af52ba803cba233fabee81176d99589f09 (2.6.29
final) using the OpenEmbedded linux-omap recipe. My config is attached.
I have applied the -rt3 patch (-rt2 and -rt1 give the same errors) from
http://www.kernel.org/pub/linux/kernel/projects/rt/. I can't get the
2.6.29.1-rtX patch to apply cleanly because I can't figure out which
git revision corresponds to 2.6.29.1....



There is one other oops I see during 8250/16550 serial port
initialization with -rt, but that was easily fixed (see attached
patch). As I said, I'm new to RT so I've got no idea if the way I did
it in the patch is valid, but it seems to work, and seeing as it's
only during initialization, it seems fairly safe.

Serial: 8250/16550 driver, 8 ports, IRQ sharing enabled
irq 72: nobody cared (try booting with the "irqpoll" option)
[<c02f6634>] (dump_stack+0x0/0x14) from [<c007e1d4>]
(__report_bad_irq+0x3c/0x98)
[<c007e198>] (__report_bad_irq+0x0/0x98) from [<c007e380>]
(note_interrupt+0x150/0x1d8)
 r4:c03eb2ec
[<c007e230>] (note_interrupt+0x0/0x1d8) from [<c007d370>]
(thread_simple_irq+0xa0/0xc0)
[<c007d2d0>] (thread_simple_irq+0x0/0xc0) from [<c007d45c>] (do_irqd+0xcc/0x334)
[<c007d390>] (do_irqd+0x0/0x334) from [<c0065b54>] (kthread+0x5c/0x94)
[<c0065af8>] (kthread+0x0/0x94) from [<c0053b28>] (do_exit+0x0/0x73c)
 r6:00000000 r5:00000000 r4:00000000
handlers:
[<c0036164>] (omap_uart_interrupt+0x0/0x1c)
Disabling IRQ #72
serial8250.0: ttyS0 at MMIO 0x4806a000 (irq = 72) is a ST16654
irq 73: nobody cared (try booting with the "irqpoll" option)
[<c02f6634>] (dump_stack+0x0/0x14) from [<c007e1d4>]
(__report_bad_irq+0x3c/0x98)
[<c007e198>] (__report_bad_irq+0x0/0x98) from [<c007e380>]
(note_interrupt+0x150/0x1d8)
 r4:c03eb358
[<c007e230>] (note_interrupt+0x0/0x1d8) from [<c007d370>]
(thread_simple_irq+0xa0/0xc0)
[<c007d2d0>] (thread_simple_irq+0x0/0xc0) from [<c007d45c>] (do_irqd+0xcc/0x334)
[<c007d390>] (do_irqd+0x0/0x334) from [<c0065b54>] (kthread+0x5c/0x94)
[<c0065af8>] (kthread+0x0/0x94) from [<c0053b28>] (do_exit+0x0/0x73c)
 r6:00000000 r5:00000000 r4:00000000
handlers:
[<c0036164>] (omap_uart_interrupt+0x0/0x1c)
Disabling IRQ #73
serial8250.0: ttyS1 at MMIO 0x4806c000 (irq = 73) is a ST16654
serial8250.0: ttyS2 at MMIO 0x49020000 (irq = 74) is a ST16654
console [ttyS2] enabled


Thanks in advance,
Hugo Vincent


On Mon, Apr 6, 2009 at 10:25 PM, David Brownell <david-b@pacbell.net> wrote:
> On Sunday 05 April 2009, Hugo Vincent wrote:
>> I'm trying to get the realtime patch set to work on OMAP3 (Gumstix
>> Overo). Linux-omap 2.6.29 with the RT patches -rt1 or -rt2 work mostly
>> as is (MUSB and the usb-gadget layer have quite a few problems, and
>
> Like what?  Way back in 2.6.10 MUSB behaved relatively sanely.
> But that's a while back.  ;)
>
>
>> the realtime self-test stuff picks up a few problems to do with
>> spinlock usage etc, but most things seem to mostly work correctly for
>> my needs). Running cyclictest with various types of loads gives an
>> average latency of ~30-50 usec and worst case up around 300 usec.
>
>
>

Comments

David Brownell April 6, 2009, 10:20 p.m. UTC | #1
On Monday 06 April 2009, Hugo Vincent wrote:
> 
> Here are some of the crashes I've seen. Warning - I'm new to -rt and
> to the linux-omap tree, so I'll apologize in advance if these are just
> a result of me missing something obvious.

Thanks, I'll take a look.


> Context: using USB-gadget ethernet (g_ether) over musb (configured as
> just peripheral, not OTG). Most of the time it works, but after a
> while, or when it encounters a large packet (e.g. fping -b30000), it
> crashes like so (apparently something to do with receive DMA usage).
> Although the fping -b case is kind of pathological, I've seen the same
> crash when ssh/scp'ing files across to the board, so it does occur in
> real usage.

RX DMA is troublesome with the MUSB code, and there are some bugfixes
pending which should affect it.  (Posted to linux-usb over the last
week or two.)

Do these problems show up with DMA disabled?

- Dave
 


--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Hugo Vincent April 7, 2009, 2:12 a.m. UTC | #2
On Tue, Apr 7, 2009 at 10:20 AM, David Brownell <david-b@pacbell.net> wrote:
> On Monday 06 April 2009, Hugo Vincent wrote:
>>
>> Here are some of the crashes I've seen. Warning - I'm new to -rt and
>> to the linux-omap tree, so I'll apologize in advance if these are just
>> a result of me missing something obvious.
>
> Thanks, I'll take a look.

Excellent, thank you!

>> Context: using USB-gadget ethernet (g_ether) over musb (configured as
>> just peripheral, not OTG). Most of the time it works, but after a
>> while, or when it encounters a large packet (e.g. fping -b30000), it
>> crashes like so (apparently something to do with receive DMA usage).
>> Although the fping -b case is kind of pathological, I've seen the same
>> crash when ssh/scp'ing files across to the board, so it does occur in
>> real usage.
>
> RX DMA is troublesome with the MUSB code, and there are some bugfixes
> pending which should affect it.  (Posted to linux-usb over the last
> week or two.)
>
> Do these problems show up with DMA disabled?

I guess you mean with CONFIG_MUSB_PIO_ONLY - yes, I've tried that. I
don't observe the Rx DMA problems described previously in that
configuration. However, I instead see hard lockups with
no obvious cause and no messages printed to the console. (The only
repeatable one I've found so far is exiting cyclictest with ^C. Not
good.) These lockups could well be, and probably are, unrelated, but
nonetheless it's very troubling!

I've uploaded my defconfig here:
http://hugovincent.com/files/lkml-20090407/
along with a log of the bootup process. I've enabled a bunch of debug
options and self-tests. You can see the rather verbose output of some
spinlock tests (I think) failing early in the bootup process,
somewhere in timer initialization (I think?).

Regards,
Hugo
David Brownell April 7, 2009, 2:27 a.m. UTC | #3
> > RX DMA is troublesome with the MUSB code, and there are some bugfixes
> > pending which should affect it.  (Posted to linux-usb over the last
> > week or two.)
> >
> > Do these problems show up with DMA disabled?
> 
> I guess you mean with CONFIG_MUSB_PIO_ONLY - yes, I've tried that. I
> don't observe the Rx DMA problems as described previously in that
> configuration, no ........ 

Good...


> However, I instead see hard lockups with
> no obvious cause and no messages printed to the console. (The only
> repeatable one I've found so far is exiting cyclictest with ^C. Not
> good). These lockups could well be and probably are unrelated, but
> none-the-less, it's very troubling!

Given the bootlog excerpt you posted, I'm thinking there are
still some basic goofy things with the RT patches you're using
even outside the scope of MUSB.  Basic as in timer tick and
kernel thread setup; lots of things look goofy.


 
> I've uploaded my defconfig here:
> http://hugovincent.com/files/lkml-20090407/
> along with a log of the bootup process. I've enabled a bunch of debug
> options and self-tests. You can see the rather verbose output of some
> spinlock tests (I think) failing early in the bootup process,
> somewhere in timer initialization (I think?).

Those look like lockdep things.  Get a more complete boot log
and maybe someone will be able to sort out what's up.

- Dave



> 
> Regards,
> Hugo
> 
> 



Hugo Vincent April 7, 2009, 2:44 a.m. UTC | #4
On Tue, Apr 7, 2009 at 2:27 PM, David Brownell <david-b@pacbell.net> wrote:
>
>> > RX DMA is troublesome with the MUSB code, and there are some bugfixes
>> > pending which should affect it.  (Posted to linux-usb over the last
>> > week or two.)
>> >
>> > Do these problems show up with DMA disabled?
>>
>> I guess you mean with CONFIG_MUSB_PIO_ONLY - yes, I've tried that. I
>> don't observe the Rx DMA problems as described previously in that
>> configuration, no ........
>
> Good...

So based on what you've seen, do you think the queued patches for MUSB
you mentioned will fix these problems?

>> However, I instead see hard lockups with
>> no obvious cause and no messages printed to the console. (The only
>> repeatable one I've found so far is exiting cyclictest with ^C. Not
>> good). These lockups could well be and probably are unrelated, but
>> none-the-less, it's very troubling!
>
> Given the bootlog excerpt you posted, I'm thinking there are
> still some basic goofy things with the RT patches you're using
> even outside the scope of MUSB.  Basic as in timer tick and
> kernel thread setup; lots of things look goofy.
>

Here is a complete boot log + config:
http://hugovincent.com/files/lkml-20090407/boot2.log

Who is working on OMAP3 -rt? Does anyone have any tips for tracking
down the apparent basic problems David mentioned (timer tick, kernel
thread setup)?

Thanks,
Hugo
David Brownell April 7, 2009, 3:36 a.m. UTC | #5
On Monday 06 April 2009, Hugo Vincent wrote:
> Here is a complete boot log + config:
> http://hugovincent.com/files/lkml-20090407/boot2.log

Erm, not very complete actually.  Enable DEBUG_LL to see
more early messages ... like the ones starting right
after the kernel decompression messages.

Also, re those udev-induced messages:

Remounting root file system...
uncorrectable error : <3>end_request: I/O error, dev mtdblock0, sector 0
Buffer I/O error on device mtdblock0, logical block 0
uncorrectable error : <3>uncorrectable error : <3>end_request: I/O error, dev mtdblock0, sector 8
Buffer I/O error on device mtdblock0, logical block 1
end_request: I/O error, dev mtdblock0, sector 16
Buffer I/O error on device mtdblock0, logical block 2
uncorrectable error : <3>end_request: I/O error, dev mtdblock0, sector 24
Buffer I/O error on device mtdblock0, logical block 3
uncorrectable error : <3>end_request: I/O error, dev mtdblock0, sector 0
Buffer I/O error on device mtdblock0, logical block 0

You shouldn't need mtdblock unless you run JFFS2, so that's
the quick way to get rid of them:  take that out of your
kernel configuration.  Else, add "mtdblock*" to
the "KERNEL==... ; goto persistent_storage_end" check in

 /etc/udev/rules.d/60-persistent-storage.rules

I understand the next version of udev will fix that.

- Dave

Hugo Vincent April 7, 2009, 4:19 a.m. UTC | #6
On Tue, Apr 7, 2009 at 3:36 PM, David Brownell <david-b@pacbell.net> wrote:
> On Monday 06 April 2009, Hugo Vincent wrote:
>> Here is a complete boot log + config:
>> http://hugovincent.com/files/lkml-20090407/boot2.log
>
> Erm, not very complete actually.  Enable DEBUG_LL to see
> more early messages ... like the ones starting right
> after the kernel decompression messages.

How's this?
http://hugovincent.com/files/lkml-20090407/boot3.log

(In case it isn't clear, the messages appear twice because the console
and DEBUG_LL are on the same serial port. I also had to increase
CONFIG_LOG_BUF_SHIFT to get a complete log in dmesg.)
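
For reference, CONFIG_LOG_BUF_SHIFT is the base-2 logarithm of the
kernel log buffer size in bytes, so each increment doubles the buffer.
The helpers below are a small sketch of that arithmetic; the function
names are invented, and the values used with them are examples only,
since the message does not say which value was chosen.

```c
#include <stddef.h>

/* Size in bytes of a log buffer configured with the given shift. */
size_t log_buf_bytes(unsigned int shift)
{
    return (size_t)1 << shift;
}

/* Smallest shift whose buffer holds at least `bytes` bytes. */
unsigned int log_buf_shift_for(size_t bytes)
{
    unsigned int shift = 0;
    const unsigned int max_shift = sizeof(size_t) * 8 - 1;

    /* Double the buffer until it is large enough (bounded to avoid overflow). */
    while (shift < max_shift && ((size_t)1 << shift) < bytes)
        shift++;
    return shift;
}
```

For example, a boot log of roughly 100000 bytes needs a shift of 17,
which gives a 131072-byte (128 KiB) buffer.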

> Also, re those udev-induced messages:
>
> Remounting root file system...
> uncorrectable error : <3>end_request: I/O error, dev mtdblock0, sector 0
> Buffer I/O error on device mtdblock0, logical block 0
> uncorrectable error : <3>uncorrectable error : <3>end_request: I/O error, dev mtdblock0, sector 8
> Buffer I/O error on device mtdblock0, logical block 1
> end_request: I/O error, dev mtdblock0, sector 16
> Buffer I/O error on device mtdblock0, logical block 2
> uncorrectable error : <3>end_request: I/O error, dev mtdblock0, sector 24
> Buffer I/O error on device mtdblock0, logical block 3
> uncorrectable error : <3>end_request: I/O error, dev mtdblock0, sector 0
> Buffer I/O error on device mtdblock0, logical block 0
>
> You shouldn't need mtdblock unless you run JFFS2, so that's
> the quick way to get rid of them:  take that out of your
> kernel configuration.  Else, add "mtdblock*" to
> the "KERNEL==... ; goto persistent_storage_end" check in
>
>  /etc/udev/rules.d/60-persistent-storage.rules
>
> I understand the next version of udev will fix that.

Thanks for that. Unfortunately I am using JFFS2 for now - perhaps I
should switch to UBIFS.

-- Hugo
Hugo Vincent April 8, 2009, 1:22 a.m. UTC | #7
On Tue, Apr 7, 2009 at 4:19 PM, Hugo Vincent <hugo.vincent@gmail.com> wrote:
> On Tue, Apr 7, 2009 at 3:36 PM, David Brownell <david-b@pacbell.net> wrote:
>> On Monday 06 April 2009, Hugo Vincent wrote:
>>> Here is a complete boot log + config:
>>> http://hugovincent.com/files/lkml-20090407/boot2.log
>>
>> Erm, not very complete actually.  Enable DEBUG_LL to see
>> more early messages ... like the ones starting right
>> after the kernel decompression messages.
>
> How's this?
> http://hugovincent.com/files/lkml-20090407/boot3.log
>
> (In case it isn't clear, we get messages twice because console and
> debug_LL are on the same serial port. I also had to increase
> CONFIG_LOG_BUF_SHIFT to get a complete log in dmesg.)

Can anyone give me any pointers on where to start for fixing the
problems shown in the above boot log?

It looks like some fairly low level locking bugs (spinlock vs
raw_spinlock maybe?) in twl4030 IRQ handling and GP timer/clock event
source setup.

Hugo

Patch

diff --git a/drivers/serial/8250.c b/drivers/serial/8250.c
index 0d4f4c6..122b694 100644
--- a/drivers/serial/8250.c
+++ b/drivers/serial/8250.c
@@ -1073,6 +1073,7 @@  static void autoconfig(struct uart_8250_port *up, unsigned int probeflags)
 	unsigned char status1, scratch, scratch2, scratch3;
 	unsigned char save_lcr, save_mcr;
 	unsigned long flags;
+	DEFINE_RAW_SPINLOCK(raw_lock);
 
 	if (!up->port.iobase && !up->port.mapbase && !up->port.membase)
 		return;
@@ -1085,6 +1086,7 @@  static void autoconfig(struct uart_8250_port *up, unsigned int probeflags)
 	 * be frobbing the chips IRQ enable register to see if it exists.
 	 */
 	spin_lock_irqsave(&up->port.lock, flags);
+	spin_lock_irqsave(&raw_lock, flags);
 
 	up->capabilities = 0;
 	up->bugs = 0;
@@ -1240,6 +1242,7 @@  static void autoconfig(struct uart_8250_port *up, unsigned int probeflags)
 		serial_outp(up, UART_IER, 0);
 
  out:
+	spin_unlock_irqrestore(&raw_lock, flags);
 	spin_unlock_irqrestore(&up->port.lock, flags);
 	DEBUG_AUTOCONF("type=%s\n", uart_config[up->port.type].name);
 }
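
One detail in the patch above is worth checking: both
spin_lock_irqsave() calls store into the same "flags" variable. The
sketch below is a user-space model, not kernel code, and it assumes
both locks really save and disable interrupts, as they would on a
non-RT kernel. All names are invented. Under that assumption the inner
save overwrites the outer saved state, so interrupts stay disabled
after both restores. On the -rt kernel discussed in this thread the
outer lock is a sleeping lock and may behave differently, so this
illustrates a potential hazard rather than a confirmed bug.

```c
#include <stdbool.h>

/* Stand-in for the CPU interrupt-enable state. */
static bool irqs_enabled = true;

/* Model of an irqsave: remember the current state, then disable interrupts. */
void model_irq_save(unsigned long *flags)
{
    *flags = irqs_enabled ? 1UL : 0UL;
    irqs_enabled = false;
}

/* Model of an irqrestore: put back whatever state was saved. */
void model_irq_restore(unsigned long flags)
{
    irqs_enabled = (flags != 0);
}

/* Nested save/restore sharing one variable, as in the patch. */
bool nested_shared_flags(void)
{
    unsigned long flags;

    irqs_enabled = true;
    model_irq_save(&flags);   /* outer: flags = 1 (was enabled) */
    model_irq_save(&flags);   /* inner: flags overwritten with 0 */
    model_irq_restore(flags); /* inner restore: stays disabled */
    model_irq_restore(flags); /* outer restore: still sees 0 */
    return irqs_enabled;
}

/* Same nesting with one variable per lock. */
bool nested_separate_flags(void)
{
    unsigned long outer_flags, inner_flags;

    irqs_enabled = true;
    model_irq_save(&outer_flags);
    model_irq_save(&inner_flags);
    model_irq_restore(inner_flags);
    model_irq_restore(outer_flags); /* original state comes back */
    return irqs_enabled;
}
```

In this model nested_shared_flags() ends with interrupts disabled while
nested_separate_flags() ends with them enabled, which suggests giving
the second lock its own flags variable.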