Message ID | alpine.DEB.2.10.1404041408280.8617@vroombuntu (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Hi Neil, On Fri, Apr 04, 2014 at 02:19:44PM +0100, Neil Greatorex wrote: > Thomas, > > After managing to get the card detected on my mirabox, I have been looking > into getting the thing actually working. With the information from Willy > and my own investigations I knew there was some issue in the setting up of > the MSIs for the card. Through adding a copious amount of printk()s I was > able to determine that the initialisation for the igb driver allocates, > frees and re-allocates 3 MSIs for the card. That sequence matches the memories I have about it indeed. > I noticed that in doing this > there was a problem whereby any call to free an MSI was trying to free > MSI#0. I was able to track this down to the fact that the mapping for the > IRQ was being disposed before the MSI was actually freed. The below patch > fixes this problem. > > With this patch, I can get one port on the card working. With both ports > enabled, I still get an OOPS, so some further work is needed. > > I would appreciate it if you could test this patch and let me know if you > find any problems. I don't know how to disable one port on my card (except by fiddling directly with its eeprom but I don't want to do this anymore, there's a high risk of bricking it). So I got the same panic as before with the two ports enabled. However I tried to revert the prefetchable memory patch that makes the NIC work for me and I noticed that your fix indeed improves things. After the oops, I no longer get : ---[ end trace f6c1769398756b02 ]--- trying to free unused MSI#0 trying to free unused MSI#0 trying to free unused MSI#0 trying to free unused MSI#0 trying to free unused MSI#0 trying to free unused MSI#0 as I used to see with the revert alone. So yes I think it's a nice step forward! Note that this was 3.13.9 on the XP-GP board (quad-core XP). Feel free to add my Tested-by if you like. Cheers, Willy
Dear Neil Greatorex, On Fri, 4 Apr 2014 14:19:44 +0100 (BST), Neil Greatorex wrote: > From 50aa11018059704229dd43ca1016defdda04f90c Mon Sep 17 00:00:00 2001 > From: Neil Greatorex <neil@fatboyfat.co.uk> > Date: Fri, 4 Apr 2014 13:47:09 +0100 > Subject: [PATCH] irqchip: armada-370-xp: Fix releasing of MSIs > > This patch moves the call to irq_dispose_mapping() to after the call to > armada_370_xp_free_msi(). Without this patch, the armada_370_xp_free_msi > function would always free MSI#0, no matter what was passed to it. > > Signed-off-by: <neil@fatboyfat.co.uk> > --- > drivers/irqchip/irq-armada-370-xp.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/irqchip/irq-armada-370-xp.c b/drivers/irqchip/irq-armada-370-xp.c > index 5409564..f5e129e 100644 > --- a/drivers/irqchip/irq-armada-370-xp.c > +++ b/drivers/irqchip/irq-armada-370-xp.c > @@ -157,8 +157,8 @@ static void armada_370_xp_teardown_msi_irq(struct msi_chip *chip, > unsigned int irq) > { > struct irq_data *d = irq_get_irq_data(irq); > - irq_dispose_mapping(irq); > armada_370_xp_free_msi(d->hwirq); > + irq_dispose_mapping(irq); > } > > static struct irq_chip armada_370_xp_msi_irq_chip = { I want to give it some test, but as it is, I'd prefer to have the irq_dispose_mapping() done before, and use a local variable to store d->hwirq to that armada_370_xp_free_msi() can be called after irq_dispose_mapping(). This is to ensure the sequence is symmetrical with the MSI setup sequence. Also, can you detail how you tested with just one port? Best regards, Thomas
Thomas, On Saturday, 5 April 2014 at 6:34pm, Thomas Petazzoni wrote: > Dear Neil Greatorex, > > On Fri, 4 Apr 2014 14:19:44 +0100 (BST), Neil Greatorex wrote: > > > From 50aa11018059704229dd43ca1016defdda04f90c Mon Sep 17 00:00:00 2001 > > From: Neil Greatorex <neil@fatboyfat.co.uk (mailto:neil@fatboyfat.co.uk)> > > Date: Fri, 4 Apr 2014 13:47:09 +0100 > > Subject: [PATCH] irqchip: armada-370-xp: Fix releasing of MSIs > > > > This patch moves the call to irq_dispose_mapping() to after the call to > > armada_370_xp_free_msi(). Without this patch, the armada_370_xp_free_msi > > function would always free MSI#0, no matter what was passed to it. > > > > Signed-off-by: <neil@fatboyfat.co.uk (mailto:neil@fatboyfat.co.uk)> > > --- > > drivers/irqchip/irq-armada-370-xp.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/drivers/irqchip/irq-armada-370-xp.c b/drivers/irqchip/irq-armada-370-xp.c > > index 5409564..f5e129e 100644 > > --- a/drivers/irqchip/irq-armada-370-xp.c > > +++ b/drivers/irqchip/irq-armada-370-xp.c > > @@ -157,8 +157,8 @@ static void armada_370_xp_teardown_msi_irq(struct msi_chip *chip, > > unsigned int irq) > > { > > struct irq_data *d = irq_get_irq_data(irq); > > - irq_dispose_mapping(irq); > > armada_370_xp_free_msi(d->hwirq); > > + irq_dispose_mapping(irq); > > } > > > > static struct irq_chip armada_370_xp_msi_irq_chip = { > > I want to give it some test, but as it is, I'd prefer to have the > irq_dispose_mapping() done before, and use a local variable to store > d->hwirq to that armada_370_xp_free_msi() can be called after > irq_dispose_mapping(). This is to ensure the sequence is symmetrical > with the MSI setup sequence. I will redo the patch with a local variable tomorrow and resend it. Cheers, Neil
Hi Neil, On Fri, Apr 04, 2014 at 02:19:44PM +0100, Neil Greatorex wrote: > With this patch, I can get one port on the card working. With both ports > enabled, I still get an OOPS, so some further work is needed. Concerning this point, here's an update on my side. I found where the crash happens, but I don't exactly understand why, I suspect that an area is not correctly mapped : root@xpgp:~# insmod /tmp/igb.ko igb: Intel(R) Gigabit Ethernet Network Driver - version 5.0.5-k igb: Copyright (c) 2007-2013 Intel Corporation. PCI: enabling device 0000:00:09.0 (0140 -> 0143) PCI: enabling device 0000:02:00.0 (0140 -> 0142) request_region(pdev=edb21000, 00000049) hw_addr = pci_iomap(pdev=edb21000, 0, 0) = f0300000 mem_start=e0000000 mem_end=e007ffff hw_addr = f0300000 hw_addr=f0300000 igb 0000:02:00.0: added PHC on eth4 igb 0000:02:00.0: Intel(R) Gigabit Ethernet Network Connection igb 0000:02:00.0: eth4: (PCIe:5.0Gb/s:Width x1) 00:30:18:a6:6c:6a igb 0000:02:00.0: eth4: PBA No: FFFFFF-0FF igb 0000:02:00.0: Using MSI-X interrupts. 4 rx queue(s), 4 tx queue(s) PCI: enabling device 0000:02:00.1 (0140 -> 0142) request_region(pdev=ed99d800, 00000049) hw_addr = pci_iomap(pdev=ed99d800, 0, 0) = f0400000 mem_start=e0100000 mem_end=e017ffff hw_addr = f0400000 hw_addr=f0400000 Unhandled fault: external abort on non-linefetch (0x1008) at 0xf0400018 Internal error: : 1008 [#1] SMP THUMB2 in e1000_82575.o:igb_get_invariants_82575() : 123c: f884 33b4 strb.w r3, [r4, #948] 1240: f884 33b9 strb.w r3, [r4, #953] 1244: f8c4 6300 str.w r6, [r4, #768] 1248: 6863 ldr r3, [r4, #4] => 124a: f8d3 8018 ldr.w r8, [r3, #24] in e1000_82575.c:igb_get_invariants_82575() : hw->phy.media_type = e1000_media_type_copper; dev_spec->sgmii_active = false; dev_spec->module_plugged = false; here=> ctrl_ext = rd32(E1000_CTRL_EXT); according to e1000_hw.h: #define E1000_CTRL_EXT 0x00018 /* Extended Device Control - RW */ according to e1000_regs.h: #define rd32(reg) (readl(hw->hw_addr + reg)) ===> ctrl_ext = readl((hw->hw_addr = 0xf0400000) + (reg = 0x18)) As you can see, the sequence is exactly the same for both ports. The first one has no problem performing the readl(), but the second one cannot. Both of them got the memory address returned by a call to pci_iomap(dev, 0, 0). I could verify that the pci_resource_len() for both is 524288 (0x80000). The last "hwaddr=f0400000" is printed just before calling rd32() and shows that it was still OK there. Since the resource flags are 0x40200, we only have IORESOURCE_MEM so pci_iomap() calls ioremap_nocache(). Thus I suspect something is not behaving correctly in the code which configures the emulated bridge and/or the memory areas, resulting in the second port not being correctly mapped, thus causing the crash. But that's the deepest I can go unfortunately, I got lost after that. Regards, Willy
Dear Willy Tarreau, On Sun, 6 Apr 2014 20:58:33 +0200, Willy Tarreau wrote: > On Fri, Apr 04, 2014 at 02:19:44PM +0100, Neil Greatorex wrote: > > With this patch, I can get one port on the card working. With both ports > > enabled, I still get an OOPS, so some further work is needed. > > Concerning this point, here's an update on my side. I found where the crash > happens, but I don't exactly understand why, I suspect that an area is not > correctly mapped : > > root@xpgp:~# insmod /tmp/igb.ko > igb: Intel(R) Gigabit Ethernet Network Driver - version 5.0.5-k > igb: Copyright (c) 2007-2013 Intel Corporation. > PCI: enabling device 0000:00:09.0 (0140 -> 0143) > PCI: enabling device 0000:02:00.0 (0140 -> 0142) > request_region(pdev=edb21000, 00000049) > hw_addr = pci_iomap(pdev=edb21000, 0, 0) = f0300000 > mem_start=e0000000 mem_end=e007ffff > hw_addr = f0300000 > hw_addr=f0300000 > igb 0000:02:00.0: added PHC on eth4 > igb 0000:02:00.0: Intel(R) Gigabit Ethernet Network Connection > igb 0000:02:00.0: eth4: (PCIe:5.0Gb/s:Width x1) 00:30:18:a6:6c:6a > igb 0000:02:00.0: eth4: PBA No: FFFFFF-0FF > igb 0000:02:00.0: Using MSI-X interrupts. 4 rx queue(s), 4 tx queue(s) > PCI: enabling device 0000:02:00.1 (0140 -> 0142) > request_region(pdev=ed99d800, 00000049) > hw_addr = pci_iomap(pdev=ed99d800, 0, 0) = f0400000 > mem_start=e0100000 mem_end=e017ffff > hw_addr = f0400000 > hw_addr=f0400000 > Unhandled fault: external abort on non-linefetch (0x1008) at 0xf0400018 > Internal error: : 1008 [#1] SMP THUMB2 > > in e1000_82575.o:igb_get_invariants_82575() : > 123c: f884 33b4 strb.w r3, [r4, #948] > 1240: f884 33b9 strb.w r3, [r4, #953] > 1244: f8c4 6300 str.w r6, [r4, #768] > 1248: 6863 ldr r3, [r4, #4] > => 124a: f8d3 8018 ldr.w r8, [r3, #24] > > in e1000_82575.c:igb_get_invariants_82575() : > hw->phy.media_type = e1000_media_type_copper; > dev_spec->sgmii_active = false; > dev_spec->module_plugged = false; > > here=> ctrl_ext = rd32(E1000_CTRL_EXT); > > according to e1000_hw.h: > > #define E1000_CTRL_EXT 0x00018 /* Extended Device Control - RW */ > > according to e1000_regs.h: > > #define rd32(reg) (readl(hw->hw_addr + reg)) > > ===> ctrl_ext = readl((hw->hw_addr = 0xf0400000) + (reg = 0x18)) > > As you can see, the sequence is exactly the same for both ports. The > first one has no problem performing the readl(), but the second one > cannot. Both of them got the memory address returned by a call to > pci_iomap(dev, 0, 0). I could verify that the pci_resource_len() for > both is 524288 (0x80000). > > The last "hwaddr=f0400000" is printed just before calling rd32() and > shows that it was still OK there. Since the resource flags are 0x40200, > we only have IORESOURCE_MEM so pci_iomap() calls ioremap_nocache(). > > Thus I suspect something is not behaving correctly in the code which > configures the emulated bridge and/or the memory areas, resulting in > the second port not being correctly mapped, thus causing the crash. > > But that's the deepest I can go unfortunately, I got lost after that. Thanks a lot again for all this investigation. I'm hoping to be able to look at this PCIe issue this week. Thomas
Willy, On Sun, 6 Apr 2014, Willy Tarreau wrote: > As you can see, the sequence is exactly the same for both ports. The > first one has no problem performing the readl(), but the second one > cannot. Both of them got the memory address returned by a call to > pci_iomap(dev, 0, 0). I could verify that the pci_resource_len() for > both is 524288 (0x80000). > > The last "hwaddr=f0400000" is printed just before calling rd32() and > shows that it was still OK there. Since the resource flags are 0x40200, > we only have IORESOURCE_MEM so pci_iomap() calls ioremap_nocache(). > Good work - I've been doing similar things myself! I can confirm that I see exactly the same thing with similar printks: First port: [ 1809.452878] igb 0000:01:00.0: enabling bus mastering [ 1809.453098] igb 0000:01:00.0 (unregistered net_device): hw_addr is f1000000, start=e0000000, len=80000, flags=40200 [ 1809.453109] igb 0000:01:00.0 (unregistered net_device): About to read from offset 18 [ 1809.453120] igb 0000:01:00.0 (unregistered net_device): Read from 18 returned 1400c0 Second port: [ 1809.459445] igb 0000:01:00.1: enabling bus mastering [ 1809.459563] igb 0000:01:00.1 (unregistered net_device): hw_addr is f1100000, start=e0100000, len=80000, flags=40200 [ 1809.459573] igb 0000:01:00.1 (unregistered net_device): About to read from offset 18 [ 1809.459581] Unhandled fault: external abort on non-linefetch (0x1008) at 0xf1100018 In the output above, the start= part shows the physical address and hw_addr shows the mapped address. The physical addresses match those given in the lspci -vvv output (see https://gist.github.com/ngreatorex/9772195). I don't know enough about PCIe, the SoC *or* the Intel card to know if these addresses look correct or even sane! I did wonder if there was some issue due to the fact that the resources for 01:00.0 and 01:00.1 overlap, but I would guess(!?) that it's common in hardware that presents multiple devices. It is perhaps noteworthy that this is the first access to the hardware for the 2nd port - i.e. there are no successful accesses, other than to enable the hardware, which AFAICT is simply accessing registers on the PCIe controller. > Thus I suspect something is not behaving correctly in the code which > configures the emulated bridge and/or the memory areas, resulting in > the second port not being correctly mapped, thus causing the crash. > > But that's the deepest I can go unfortunately, I got lost after that. > Same here. I've tried playing around with a few things but not discovered anything even close to useful. Hopefully Thomas will be able to debug further when he gets the time. > Regards, > Willy > > Cheers, Neil
On Sun, Apr 06, 2014 at 10:57:40PM +0100, Neil Greatorex wrote: > Good work - I've been doing similar things myself! I can confirm that I > see exactly the same thing with similar printks: Great! A reproduceable bug is always half-resolved :-) > The physical addresses match those given in the lspci -vvv output same here. > (see > https://gist.github.com/ngreatorex/9772195). I don't know enough about > PCIe, the SoC *or* the Intel card to know if these addresses look correct > or even sane! I did wonder if there was some issue due to the fact that > the resources for 01:00.0 and 01:00.1 overlap, but I would guess(!?) that > it's common in hardware that presents multiple devices. My understanding and old memories tell me that's OK. > It is perhaps noteworthy that this is the first access to the hardware for > the 2nd port - i.e. there are no successful accesses, other than to enable > the hardware, which AFAICT is simply accessing registers on the PCIe > controller. Indeed, I forgot to mention that but you're right, enabling only the second function leads to an immediate panic as well. > I've tried playing around with a few things but not discovered > anything even close to useful. Hopefully Thomas will be able to debug > further when he gets the time. Yeah, let's hope we have not opened a can of worms! Willy
Dear Neil Greatorex, On Sun, 6 Apr 2014 22:57:40 +0100 (BST), Neil Greatorex wrote: > Second port: > [ 1809.459445] igb 0000:01:00.1: enabling bus mastering > [ 1809.459563] igb 0000:01:00.1 (unregistered net_device): hw_addr is > f1100000, start=e0100000, len=80000, flags=40200 > [ 1809.459573] igb 0000:01:00.1 (unregistered net_device): About to read > from offset 18 > [ 1809.459581] Unhandled fault: external abort on non-linefetch (0x1008) > at 0xf1100018 This address is a virtual address, so there's not much we can do with it, as it is. > The physical addresses match those given in the lspci -vvv output (see > https://gist.github.com/ngreatorex/9772195). I don't know enough about > PCIe, the SoC *or* the Intel card to know if these addresses look correct > or even sane! I did wonder if there was some issue due to the fact that > the resources for 01:00.0 and 01:00.1 overlap, but I would guess(!?) that > it's common in hardware that presents multiple devices. > > It is perhaps noteworthy that this is the first access to the hardware for > the 2nd port - i.e. there are no successful accesses, other than to enable > the hardware, which AFAICT is simply accessing registers on the PCIe > controller. It really looks like the MBus window has not been sized correctly, or there is a missing MBus window. Probably a deficiency in the PCI bridge emulation layer in pci-mvebu. Tomorrow, I have a bit of work on AHCI on Armada 38x, but I'm hoping to get to this after that. Thanks again for all the investigation. All your research is clearly going to make the debugging a *lot* easier. Best regards, Thomas
Thomas, On Sun, Apr 6, 2014 at 11:16 PM, Thomas Petazzoni <thomas.petazzoni@free-electrons.com> wrote: > Dear Neil Greatorex, > > On Sun, 6 Apr 2014 22:57:40 +0100 (BST), Neil Greatorex wrote: > >> Second port: >> [ 1809.459445] igb 0000:01:00.1: enabling bus mastering >> [ 1809.459563] igb 0000:01:00.1 (unregistered net_device): hw_addr is >> f1100000, start=e0100000, len=80000, flags=40200 >> [ 1809.459573] igb 0000:01:00.1 (unregistered net_device): About to read >> from offset 18 >> [ 1809.459581] Unhandled fault: external abort on non-linefetch (0x1008) >> at 0xf1100018 > > This address is a virtual address, so there's not much we can do with > it, as it is. > The address in the fault message is, but the physical address can be worked out from the 2 lines above. "start=e0100000" and "offset 18". Adding those together we get the physical address 0xe100018. >> The physical addresses match those given in the lspci -vvv output (see >> https://gist.github.com/ngreatorex/9772195). I don't know enough about >> PCIe, the SoC *or* the Intel card to know if these addresses look correct >> or even sane! I did wonder if there was some issue due to the fact that >> the resources for 01:00.0 and 01:00.1 overlap, but I would guess(!?) that >> it's common in hardware that presents multiple devices. >> >> It is perhaps noteworthy that this is the first access to the hardware for >> the 2nd port - i.e. there are no successful accesses, other than to enable >> the hardware, which AFAICT is simply accessing registers on the PCIe >> controller. > > It really looks like the MBus window has not been sized correctly, or > there is a missing MBus window. Probably a deficiency in the PCI bridge > emulation layer in pci-mvebu. > OK well that gives me another direction to point my debugging attempts. If I discover anything interesting I'll let you know. > Tomorrow, I have a bit of work on AHCI on Armada 38x, but I'm hoping to > get to this after that. Excellent news. > > Thanks again for all the investigation. All your research is clearly > going to make the debugging a *lot* easier. No problem. I'm always interested in trying to learn more about the hardware I have and how it all works. Of course it would be much easier if Marvell would publicly release the datasheet but we can't have everything! Cheers, Neil
diff --git a/drivers/irqchip/irq-armada-370-xp.c b/drivers/irqchip/irq-armada-370-xp.c index 5409564..f5e129e 100644 --- a/drivers/irqchip/irq-armada-370-xp.c +++ b/drivers/irqchip/irq-armada-370-xp.c @@ -157,8 +157,8 @@ static void armada_370_xp_teardown_msi_irq(struct msi_chip *chip, unsigned int irq) { struct irq_data *d = irq_get_irq_data(irq); - irq_dispose_mapping(irq); armada_370_xp_free_msi(d->hwirq); + irq_dispose_mapping(irq); } static struct irq_chip armada_370_xp_msi_irq_chip = {