Message ID | 4973EF64.2050404@jp.fujitsu.com (mailing list archive) |
---|---|
State | Not Applicable, archived |
Headers | show |
James Bottomley wrote: > On Mon, 2009-01-19 at 12:11 +0900, Kenji Kaneshige wrote: >> I'm sending the debug patch against 2.6.29-rc1 below. I'm also >> sending it as an attachment. It also contains the code to prevent >> the kernel panic from you. Please note that you will see two >> WARN_ON(1) messages with the patch. Those are shown by my debug >> patch. >> >> Could you try it and send me the whole dmsg output? > > Actually, the machine is now running -rc2, so the patch doesn't quite > apply: > > jejb@sparkweed> patch -p1 < ~/tmp.diff > patching file include/linux/pci-acpi.h > patching file drivers/pci/hotplug/acpiphp_glue.c > Hunk #1 FAILED at 266. > Hunk #2 succeeded at 467 (offset 2 lines). > Hunk #3 succeeded at 489 (offset 2 lines). > Hunk #4 succeeded at 518 (offset 2 lines). > Hunk #5 succeeded at 556 (offset 2 lines). > > Looks like the rejection is in the NULL check piece, so I've already > actually sent you the information it would produce. > > The dmesg output is pretty big, below. > Thank you very much, James. Thanks to you, I found the root cause of the problem. The acpi_pci_get_bridge_handle() function assumes pci_bus->self is NULL on the root bus. But it is not true and pci_bus->self can have a non-NULL value on some platfroms (like yours). So it must check pci_bus->parent instead. I found some other code that has the same wrong assumption. I'll make a fix for them and send it soon. Thanks, Kenji Kaneshige -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Kenji Kaneshige wrote: > James Bottomley wrote: >> On Mon, 2009-01-19 at 12:11 +0900, Kenji Kaneshige wrote: >>> I'm sending the debug patch against 2.6.29-rc1 below. I'm also >>> sending it as an attachment. It also contains the code to prevent >>> the kernel panic from you. Please note that you will see two >>> WARN_ON(1) messages with the patch. Those are shown by my debug >>> patch. >>> >>> Could you try it and send me the whole dmsg output? >> Actually, the machine is now running -rc2, so the patch doesn't quite >> apply: >> >> jejb@sparkweed> patch -p1 < ~/tmp.diff >> patching file include/linux/pci-acpi.h >> patching file drivers/pci/hotplug/acpiphp_glue.c >> Hunk #1 FAILED at 266. >> Hunk #2 succeeded at 467 (offset 2 lines). >> Hunk #3 succeeded at 489 (offset 2 lines). >> Hunk #4 succeeded at 518 (offset 2 lines). >> Hunk #5 succeeded at 556 (offset 2 lines). >> >> Looks like the rejection is in the NULL check piece, so I've already >> actually sent you the information it would produce. >> >> The dmesg output is pretty big, below. >> > > Thank you very much, James. > > Thanks to you, I found the root cause of the > problem. The acpi_pci_get_bridge_handle() function assumes > pci_bus->self is NULL on the root bus. But it is not true > and pci_bus->self can have a non-NULL value on some > platfroms (like yours). So it must check pci_bus->parent > instead. > > I found some other code that has the same wrong assumption. > I'll make a fix for them and send it soon. > I made several patches to fix the wrong assumption described above. It is very difficult for me to check all the code that refers pci_bus->self. So I checked include/linux/pci-acpi.h and the code under drivers/pci/ only. And I made patches for the code like below: - The code that clearly chooses host bridge operation or PCI-to-PCI bridge operation based on pci_bus->self. - The code that might cause endless loop if pci_bus->self is not NULL on the PCI root bus. The patches are - [PATCH 1/8] PCI/ACPI: fix wrong assumption in acpi_pci_get_bridge_handle - [PATCH 2/8] PCI/ACPI: fix wrong assumption in acpi_find_root_bridge_handle - [PATCH 3/8] PCI hotplug: fix wrong assumption in acpi_get_hp_params_from_firmware - [PATCH 4/8] PCI hotplug: fix wrong assumption in acpi_get_hp_hw_control_from_firmware - [PATCH 5/8] PCI: fix wrong assumption in pci_find_upstream_pcie_bridge - [PATCH 6/8] PCI: fix wrong assumption in pci_read_bridge_bases - [PATCH 7/8] PCI: fix wrong assumption in pci_get_interrupt_pin - [PATCH 8/8] PCI: fix wrong assumption in pci_common_swizzle Thanks, Kenji Kaneshige -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Kenji Kaneshige wrote: > Kenji Kaneshige wrote: >> James Bottomley wrote: >>> On Mon, 2009-01-19 at 12:11 +0900, Kenji Kaneshige wrote: >>>> I'm sending the debug patch against 2.6.29-rc1 below. I'm also >>>> sending it as an attachment. It also contains the code to prevent >>>> the kernel panic from you. Please note that you will see two >>>> WARN_ON(1) messages with the patch. Those are shown by my debug >>>> patch. >>>> >>>> Could you try it and send me the whole dmsg output? >>> Actually, the machine is now running -rc2, so the patch doesn't quite >>> apply: >>> >>> jejb@sparkweed> patch -p1 < ~/tmp.diff >>> patching file include/linux/pci-acpi.h >>> patching file drivers/pci/hotplug/acpiphp_glue.c >>> Hunk #1 FAILED at 266. >>> Hunk #2 succeeded at 467 (offset 2 lines). >>> Hunk #3 succeeded at 489 (offset 2 lines). >>> Hunk #4 succeeded at 518 (offset 2 lines). >>> Hunk #5 succeeded at 556 (offset 2 lines). >>> >>> Looks like the rejection is in the NULL check piece, so I've already >>> actually sent you the information it would produce. >>> >>> The dmesg output is pretty big, below. >>> >> Thank you very much, James. >> >> Thanks to you, I found the root cause of the >> problem. The acpi_pci_get_bridge_handle() function assumes >> pci_bus->self is NULL on the root bus. But it is not true >> and pci_bus->self can have a non-NULL value on some >> platfroms (like yours). So it must check pci_bus->parent >> instead. >> >> I found some other code that has the same wrong assumption. >> I'll make a fix for them and send it soon. >> > > I made several patches to fix the wrong assumption described > above. It is very difficult for me to check all the code that > refers pci_bus->self. So I checked include/linux/pci-acpi.h and > the code under drivers/pci/ only. And I made patches for the code > like below: > > - The code that clearly chooses host bridge operation or > PCI-to-PCI bridge operation based on pci_bus->self. > > - The code that might cause endless loop if pci_bus->self is > not NULL on the PCI root bus. > > The patches are > > - [PATCH 1/8] PCI/ACPI: fix wrong assumption in acpi_pci_get_bridge_handle > - [PATCH 2/8] PCI/ACPI: fix wrong assumption in acpi_find_root_bridge_handle > - [PATCH 3/8] PCI hotplug: fix wrong assumption in acpi_get_hp_params_from_firmware > - [PATCH 4/8] PCI hotplug: fix wrong assumption in acpi_get_hp_hw_control_from_firmware > - [PATCH 5/8] PCI: fix wrong assumption in pci_find_upstream_pcie_bridge > - [PATCH 6/8] PCI: fix wrong assumption in pci_read_bridge_bases > - [PATCH 7/8] PCI: fix wrong assumption in pci_get_interrupt_pin > - [PATCH 8/8] PCI: fix wrong assumption in pci_common_swizzle > Those patches are against 2.6.29-rc2. Thanks, Kenji Kaneshige -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wednesday, January 28, 2009 3:47 am Kenji Kaneshige wrote: > I made several patches to fix the wrong assumption described > above. It is very difficult for me to check all the code that > refers pci_bus->self. So I checked include/linux/pci-acpi.h and > the code under drivers/pci/ only. And I made patches for the code > like below: > > - The code that clearly chooses host bridge operation or > PCI-to-PCI bridge operation based on pci_bus->self. > > - The code that might cause endless loop if pci_bus->self is > not NULL on the PCI root bus. Kenji-san, can you re-send these to my private mail so I can apply them? (Outlook/Exchange can't let plain text through without molesting it for some reason.) Thanks,
Index: linux-2.6.29-rc1/include/linux/pci-acpi.h =================================================================== --- linux-2.6.29-rc1.orig/include/linux/pci-acpi.h 2008-12-01 22:59:52.000000000 +0900 +++ linux-2.6.29-rc1/include/linux/pci-acpi.h 2009-01-19 10:27:13.000000000 +0900 @@ -65,9 +65,28 @@ { int seg = pci_domain_nr(pbus), busnr = pbus->number; struct pci_dev *bridge = pbus->self; - if (bridge) - return DEVICE_ACPI_HANDLE(&(bridge->dev)); - return acpi_get_pci_rootbridge_handle(seg, busnr); + acpi_handle handle; + char objname[64] = "<NULL>"; + struct acpi_buffer buffer = { .length = sizeof(objname), + .pointer = objname }; + + printk(KERN_INFO "%s: bridge(pbus->self) = %p, pbus->parent = %p\n", + __func__, bridge, pbus->parent); + + if (bridge) { + printk(KERN_INFO "%s: handle the bridge as PtoP.\n", __func__); + handle = DEVICE_ACPI_HANDLE(&(bridge->dev)); + } else { + printk(KERN_INFO "%s: handle the bridge as root.\n", __func__); + handle = acpi_get_pci_rootbridge_handle(seg, busnr); + } + + if (handle) + acpi_get_name(handle, ACPI_FULL_PATHNAME, &buffer); + printk(KERN_INFO "%s: ACPI handle of the bridge for %04x:%02x: %p %s\n", + __func__, seg, busnr, handle, objname); + + return handle; } #else #if !defined(AE_ERROR) Index: linux-2.6.29-rc1/drivers/pci/hotplug/acpiphp_glue.c =================================================================== --- linux-2.6.29-rc1.orig/drivers/pci/hotplug/acpiphp_glue.c 2008-12-01 22:59:46.000000000 +0900 +++ linux-2.6.29-rc1/drivers/pci/hotplug/acpiphp_glue.c 2009-01-19 10:32:05.000000000 +0900 @@ -266,6 +266,12 @@ int found = acpi_pci_detect_ejectable(pbus); if (!found) { acpi_handle bridge_handle = acpi_pci_get_bridge_handle(pbus); + if (!bridge_handle) { + printk(KERN_INFO + "%s: NULL handle returned!!!\n", __func__); + WARN_ON(1); + return 0; + } acpi_walk_namespace(ACPI_TYPE_DEVICE, bridge_handle, (u32)1, is_pci_dock_device, (void *)&found, NULL); } @@ -459,6 +465,9 @@ int device, function; struct pci_dev *dev; struct pci_bus *pci_bus = context; + char objname[64]; + struct acpi_buffer buffer = { .length = sizeof(objname), + .pointer = objname }; status = acpi_get_handle(handle, "_ADR", &dummy_handle); if (ACPI_FAILURE(status)) @@ -478,6 +487,9 @@ if (!dev || !dev->subordinate) goto out; + acpi_get_name(handle, ACPI_FULL_PATHNAME, &buffer); + printk(KERN_INFO "%s: Detecting slots under PtoP bridge (%s)\n", + __func__, objname); /* check if this bridge has ejectable slots */ if ((detect_ejectable_slots(dev->subordinate) > 0)) { dbg("found PCI-to-PCI bridge at PCI %s\n", pci_name(dev)); @@ -504,6 +516,9 @@ int seg, bus; acpi_handle dummy_handle; struct pci_bus *pci_bus; + char objname[64]; + struct acpi_buffer buffer = { .length = sizeof(objname), + .pointer = objname }; /* if the bridge doesn't have _STA, we assume it is always there */ status = acpi_get_handle(handle, "_STA", &dummy_handle); @@ -539,6 +554,9 @@ return 0; } + acpi_get_name(handle, ACPI_FULL_PATHNAME, &buffer); + printk(KERN_INFO "%s: Detecting slots under root bridge (%s)\n", + __func__, objname); /* check if this bridge has ejectable slots */ if (detect_ejectable_slots(pci_bus) > 0) { dbg("found PCI host-bus bridge with hot-pluggable slots\n");