Message ID | 49C9BBD7.4040705@jp.fujitsu.com (mailing list archive) |
---|---|
State | Superseded, archived |
* Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>:
> Alex Chiang wrote:
> > * Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>:
> >> I still have the following kernel error messages in testing with your
> >> latest set of patches (Jesse's linux-next). The test case is removing
> >> e1000e device or its parent bridge by "echo 1 > /sys/bus/pci/devices/
> >> .../remove".
> >>
> >> [ 537.379995] =============================================
> >> [ 537.380124] [ INFO: possible recursive locking detected ]
> >> [ 537.380128] 2.6.29-rc8-kk #1
> >> [ 537.380128] ---------------------------------------------
> >> [ 537.380128] events/4/56 is trying to acquire lock:
> >> [ 537.380128] (events){--..}, at: [<ffffffff80257fc0>] flush_workqueue+0x0/0xa0
> >> [ 537.380128]
> >> [ 537.380128] but task is already holding lock:
> >> [ 537.380128] (events){--..}, at: [<ffffffff80257648>] run_workqueue+0x108/0x230
> >> [ 537.380128]
> >> [ 537.380128] other info that might help us debug this:
> >> [ 537.380128] 3 locks held by events/4/56:
> >> [ 537.380128] #0: (events){--..}, at: [<ffffffff80257648>] run_workqueue+0x108/0x230
> >> [ 537.380128] #1: (&ss->work){--..}, at: [<ffffffff80257648>] run_workqueue+0x108/0x230
> >> [ 537.380128] #2: (pci_remove_rescan_mutex){--..}, at: [<ffffffff803c10d1>] remove_callback+0x21/0x40
> >
> > I still cannot reproduce this lockdep issue, even using your
> > .config with an e1000e device on an x86_64 kernel. :(
> >
> > I tried removing the endpoint, an intermediate bridge device, and
> > the parent bus. I don't know what I'm doing wrong...
> >
>
> I don't know either...
> The reproducibility is 100% on my environment. The steps are
> just boot the system and remove the device.
>
> > Can you please try this patch though, and see if it fixes the
> > warning? It applies on top of my other sysfs patch that
> > introduces a mutex in sysfs_schedule_callback.
>
> Anyway, I confirmed the kernel error messages were gone with
> the patch against sysfs. Note that I used the following patch
> I made for testing instead since your patch could not be
> applied to Jesse's linux-next.

Great, thank you for testing Kenji-san.

/ac

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Alex Chiang wrote:
> * Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>:
>> Alex Chiang wrote:
>>> * Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>:
>>>> I still have the following kernel error messages in testing with your
>>>> latest set of patches (Jesse's linux-next). The test case is removing
>>>> e1000e device or its parent bridge by "echo 1 > /sys/bus/pci/devices/
>>>> .../remove".
>>>>
>>>> [ 537.379995] =============================================
>>>> [ 537.380124] [ INFO: possible recursive locking detected ]
>>>> [ 537.380128] 2.6.29-rc8-kk #1
>>>> [ 537.380128] ---------------------------------------------
>>>> [ 537.380128] events/4/56 is trying to acquire lock:
>>>> [ 537.380128] (events){--..}, at: [<ffffffff80257fc0>] flush_workqueue+0x0/0xa0
>>>> [ 537.380128]
>>>> [ 537.380128] but task is already holding lock:
>>>> [ 537.380128] (events){--..}, at: [<ffffffff80257648>] run_workqueue+0x108/0x230
>>>> [ 537.380128]
>>>> [ 537.380128] other info that might help us debug this:
>>>> [ 537.380128] 3 locks held by events/4/56:
>>>> [ 537.380128] #0: (events){--..}, at: [<ffffffff80257648>] run_workqueue+0x108/0x230
>>>> [ 537.380128] #1: (&ss->work){--..}, at: [<ffffffff80257648>] run_workqueue+0x108/0x230
>>>> [ 537.380128] #2: (pci_remove_rescan_mutex){--..}, at: [<ffffffff803c10d1>] remove_callback+0x21/0x40
>>> I still cannot reproduce this lockdep issue, even using your
>>> .config with an e1000e device on an x86_64 kernel. :(
>>>
>>> I tried removing the endpoint, an intermediate bridge device, and
>>> the parent bus. I don't know what I'm doing wrong...
>>>
>> I don't know either...
>> The reproducibility is 100% on my environment. The steps are
>> just boot the system and remove the device.
>>
>>> Can you please try this patch though, and see if it fixes the
>>> warning? It applies on top of my other sysfs patch that
>>> introduces a mutex in sysfs_schedule_callback.
>> Anyway, I confirmed the kernel error messages were gone with
>> the patch against sysfs. Note that I used the following patch
>> I made for testing instead since your patch could not be
>> applied to Jesse's linux-next.
>
> Great, thank you for testing Kenji-san.
>

You're welcome.
Just in case, my patch is just for testing, and it is very
buggy (no destroy operation, lack of module_put() in error
code path, and so on). Please consider it as just for testing.

Thanks,
Kenji Kaneshige
Index: linux-next-20090323/fs/sysfs/file.c
===================================================================
--- linux-next-20090323.orig/fs/sysfs/file.c	2009-03-25 12:09:37.000000000 +0900
+++ linux-next-20090323/fs/sysfs/file.c	2009-03-25 13:40:10.000000000 +0900
@@ -677,6 +677,7 @@
 	kfree(ss);
 }
 
+static struct workqueue_struct *sysfsd_wq;
 /**
  * sysfs_schedule_callback - helper to schedule a callback for a kobject
  * @kobj: object we're acting for.
@@ -704,6 +705,17 @@
 
 	if (!try_module_get(owner))
 		return -ENODEV;
+
+	if (!sysfsd_wq) {
+		sysfsd_wq = create_workqueue("sysfsd");
+		if (!sysfsd_wq) {
+			printk(KERN_ERR
+			       "%s: Could not create workqueue\n", __func__);
+			WARN_ON(1);
+			return -ENOMEM;
+		}
+	}
+
 	ss = kmalloc(sizeof(*ss), GFP_KERNEL);
 	if (!ss) {
 		module_put(owner);
@@ -715,7 +727,7 @@
 	ss->data = data;
 	ss->owner = owner;
 	INIT_WORK(&ss->work, sysfs_schedule_callback_work);
-	schedule_work(&ss->work);
+	queue_work(sysfsd_wq, &ss->work);
 	return 0;
 }
 EXPORT_SYMBOL_GPL(sysfs_schedule_callback);
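
The two defects Kenji names in the test patch (no destroy operation, and no module_put() when workqueue creation fails) can be illustrated with a small userspace model. This is only a sketch, not kernel code and not a proposed fix: every identifier ending in _model is a hypothetical stand-in invented for illustration, and the reference counting is reduced to a plain integer.

```c
/* Userspace model of the error handling in the test patch.
 * All *_model names are hypothetical stand-ins, not kernel APIs. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct module_model { int refcount; };
struct workqueue_model { const char *name; };

static struct workqueue_model *sysfsd_wq_model;
static bool fail_wq_creation;	/* knob to force the failure path */

/* Stand-in for try_module_get(): takes one reference, always succeeds. */
static bool try_module_get_model(struct module_model *m)
{
	m->refcount++;
	return true;
}

/* Stand-in for module_put(): drops one reference. */
static void module_put_model(struct module_model *m)
{
	m->refcount--;
}

/* Stand-in for create_workqueue(): returns NULL when told to fail. */
static struct workqueue_model *create_workqueue_model(const char *name)
{
	struct workqueue_model *wq;

	if (fail_wq_creation)
		return NULL;
	wq = malloc(sizeof(*wq));
	if (wq)
		wq->name = name;
	return wq;
}

/* The "destroy operation" that the test patch does not have. */
static void destroy_workqueue_model(void)
{
	free(sysfsd_wq_model);
	sysfsd_wq_model = NULL;
}

/* Follows the control flow of the patched function, with the
 * reference dropped again when the workqueue cannot be created. */
static int schedule_callback_model(struct module_model *owner)
{
	if (!try_module_get_model(owner))
		return -ENODEV;

	if (!sysfsd_wq_model) {
		sysfsd_wq_model = create_workqueue_model("sysfsd");
		if (!sysfsd_wq_model) {
			module_put_model(owner);	/* omitted in the test patch */
			return -ENOMEM;
		}
	}

	/* Work would be allocated and queued here; the reference
	 * stays held until the callback has run. */
	return 0;
}
```

With fail_wq_creation set, schedule_callback_model() returns -ENOMEM and leaves the owner's refcount where it started; without the added module_put_model() call the count would stay elevated by one on every failed attempt.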