From patchwork Thu Feb 28 11:50:44 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yijing Wang
X-Patchwork-Id: 2197151
X-Patchwork-Delegate: bhelgaas@google.com
Return-Path:
X-Original-To: patchwork-linux-pci@patchwork.kernel.org
Delivered-To: patchwork-process-083081@patchwork1.kernel.org
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by patchwork1.kernel.org (Postfix) with ESMTP id 3A28C3FCF6
	for ; Thu, 28 Feb 2013 11:51:25 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752742Ab3B1LvY (ORCPT ); Thu, 28 Feb 2013 06:51:24 -0500
Received: from szxga01-in.huawei.com ([119.145.14.64]:61336 "EHLO
	szxga01-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1752723Ab3B1LvX (ORCPT ); Thu, 28 Feb 2013 06:51:23 -0500
Received: from 172.24.2.119 (EHLO szxeml207-edg.china.huawei.com)
	([172.24.2.119]) by szxrg01-dlp.huawei.com (MOS 4.3.4-GA FastPath queued)
	with ESMTP id AYG82752; Thu, 28 Feb 2013 19:50:55 +0800 (CST)
Received: from SZXEML458-HUB.china.huawei.com (10.82.67.201) by
	szxeml207-edg.china.huawei.com (172.24.2.56) with Microsoft SMTP Server
	(TLS) id 14.1.323.7; Thu, 28 Feb 2013 19:50:51 +0800
Received: from [127.0.0.1] (10.135.76.69) by SZXEML458-HUB.china.huawei.com
	(10.82.67.201) with Microsoft SMTP Server id 14.1.323.7;
	Thu, 28 Feb 2013 19:50:48 +0800
Message-ID: <512F4494.5050301@huawei.com>
Date: Thu, 28 Feb 2013 19:50:44 +0800
From: Yijing Wang
User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:17.0) Gecko/20130107
	Thunderbird/17.0.2
MIME-Version: 1.0
To: Gu Zheng
CC: Yinghai Lu, Myron Stowe, Bjorn Helgaas, Joe Lawrence,
	Matthew Garrett, Myron Stowe, David Bulkow
Subject: Re: [PATCH 1/2] PCI: ASPM exit link state code could skip devices
References: <51122BED.8090308@cn.fujitsu.com> <5122F276.80807@cn.fujitsu.com>
	<512AFDB7.5030105@cn.fujitsu.com> <512DAAC9.8030400@cn.fujitsu.com>
	<512F35D3.6080009@cn.fujitsu.com>
In-Reply-To: <512F35D3.6080009@cn.fujitsu.com>
X-Originating-IP: [10.135.76.69]
X-CFilter-Loop: Reflected
Sender: linux-pci-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-pci@vger.kernel.org

On 2013/2/28 18:47, Gu Zheng wrote:
> On 02/27/2013 02:47 PM, Yinghai Lu wrote:
>
>> On Tue, Feb 26, 2013 at 10:42 PM, Gu Zheng wrote:
>>> I just agree with Bjorn's analysis. And I have test Yinghai's patch on kernel 3.8,
>>> but it seems does not work. More infos, please refer to bugzilla:
>>> https://bugzilla.kernel.org/show_bug.cgi?id=54411
>>
>> you need to test that on linus's tree of 2013-02-26.
>> or v3.9-rc1
>
> Hi Yinghai,
>     I test your patch on linus' tree of 2-26
> commit d895cb1af15c04c522a25c79cc429076987c089b
> But it still does not work~

I found another problem when removing a device through
/sys/..../$device/remove together with ACPI hotplug.

remove_callback() is called from a workqueue, so the device it holds may
already have been removed through another interface (acpiphp/pciehp, or
removal of an upstream device) by the time it runs. When remove_callback()
then tries to remove the already-removed device again, the system may panic.
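
To make the ordering concrete, here is a minimal userspace sketch of the
sequence. It is only a model, not kernel code: fake_pci_dev, fake_remove(),
fake_hotplug_remove() and the two fake_remove_callback_*() helpers are
hypothetical names made up for this illustration, and the two removal paths
run one after the other rather than concurrently. That should be enough to
show why the deferred removal needs to look at is_added first:

```c
/* Userspace model of the double-remove ordering; NOT kernel code.
 * Every name below is hypothetical and exists only for illustration. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct fake_pci_dev {
	bool is_added;     /* models pdev->is_added */
	int remove_count;  /* how many times the teardown actually ran */
};

/* Models pci_stop_and_remove_bus_device(): tears the device down. */
void fake_remove(struct fake_pci_dev *pdev)
{
	assert(pdev != NULL);
	pdev->remove_count++;
	pdev->is_added = false;
}

/* Models a hotplug path (acpiphp/pciehp) that removes the device first. */
void fake_hotplug_remove(struct fake_pci_dev *pdev)
{
	fake_remove(pdev);
}

/* Models the deferred callback without any check: it removes the
 * device even if another path already did so. */
void fake_remove_callback_unchecked(struct fake_pci_dev *pdev)
{
	fake_remove(pdev);
}

/* Models the deferred callback with the is_added check: it skips a
 * device that is no longer added. */
void fake_remove_callback_checked(struct fake_pci_dev *pdev)
{
	if (pdev->is_added)
		fake_remove(pdev);
}
```

In this model, calling fake_hotplug_remove() and then
fake_remove_callback_unchecked() on the same device runs the teardown twice,
while the checked variant leaves it at one.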

Panic info from my machine:

kworker/u:3[273]: Oops 11003706212352 [1]
Modules linked in: raw snd_pcm_oss snd_mixer_oss snd_seq snd_seq_device nfsv3
 nfs_acl iptable_filter ip_tables x_tables nfs fscache dns_resolver lockd
 sunrpc cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq
 binfmt_misc fuse nls_iso8859_1 loop ipmi_si ipmi_devintf ipmi_msghandler
 dm_mod snd_hda_codec_hdmi snd_hda_intel igb snd_hda_codec snd_hwdep snd_pcm
 snd_timer iTCO_wdt iTCO_vendor_support snd ppdev soundcore serio_raw lpc_ich
 mfd_core snd_page_alloc sg ehci_pci mptctl ptp pps_core i2c_i801 parport_pc
 i2c_core hid_generic parport container button usbhid hid uhci_hcd ehci_hcd
 usbcore usb_common sd_mod crc_t10dif ext3 mbcache jbd fan processor
 ide_pci_generic ide_core mptsas mptscsih mptbase scsi_transport_sas ata_piix
 libata scsi_mod thermal thermal_sys hwmon

Pid: 273, CPU 29, comm: kworker/u:3
psr : 0000121008526038 ifs : 8000000000000307 ip : [] Tainted: G B (3.8.0-rc2-pci-bind)
ip is at pci_destroy_dev+0x61/0x160
unat: 0000000000000000 pfs : 0000000000000307 rsc : 0000000000000003
rnat: 0000000000000000 bsps: 0000000000000000 pr : 0000018000019585
ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c9e70433f
csd : 0000000000000000 ssd : 0000000000000000
b0 : a0000001004d3df0 b6 : a0000001004c92a0 b7 : a00000010000b4e0
f6 : 000000000000000000000 f7 : 1003e00000018ac0017c7
f8 : 1003e0044b82fa09b5a53 f9 : 1003e00002779e56ddcba
f10 : 1003e17b2cb67d049962e f11 : 1003e0000000000000c56
r1 : a0000001015ae780 r2 : 0000000000100100 r3 : 0000000000100108
r8 : a0000001013af748 r9 : 0000000000000000 r10 : 0000000000200201
r11 : 000000000000d5a4 r12 : e0000007059afdd0 r13 : e0000007059a0000
r14 : 0000000000200200 r15 : 0000000000200200 r16 : 0000000000100100
r17 : e00000170353da88 r18 : e000001f03503e80 r19 : e00000170353da90
r20 : 0000000000000000 r21 : 0000000000000000 r22 : a0000001013cc608
r23 : 0000000000000063 r24 : 000000000000006b r25 : 000000000000006c
r26 : 000000000000006f r27 : a000000101a82cc0 r28 : 0000000000000000
r29 : 0000000000000000 r30 : 000000000000d5a2 r31 : 000000000000d5a2

Call Trace:
 [] show_stack+0x80/0xa0
        sp=e0000007059af990 bsp=e0000007059a1400
 [] show_regs+0x640/0x920
        sp=e0000007059afb60 bsp=e0000007059a13a0
 [] die+0x190/0x2c0
        sp=e0000007059afb70 bsp=e0000007059a1360
 [] ia64_do_page_fault+0xbd0/0xc00
        sp=e0000007059afb70 bsp=e0000007059a12d0
 [] ia64_native_leave_kernel+0x0/0x270
        sp=e0000007059afc00 bsp=e0000007059a12d0
 [] pci_destroy_dev+0x60/0x160
        sp=e0000007059afdd0 bsp=e0000007059a1298
 [] pci_remove_bus_device+0xc0/0xe0
        sp=e0000007059afdd0 bsp=e0000007059a1258
 [] pci_stop_and_remove_bus_device+0x30/0x60
        sp=e0000007059afdd0 bsp=e0000007059a1238
 [] remove_callback+0xf0/0x1c0
        sp=e0000007059afdd0 bsp=e0000007059a1208
 [] sysfs_schedule_callback_work+0x50/0x120
        sp=e0000007059afdd0 bsp=e0000007059a11d0
 [] process_one_work+0x520/0xa80
        sp=e0000007059afdd0 bsp=e0000007059a1140
 [] worker_thread+0x330/0xde0
        sp=e0000007059afdd0 bsp=e0000007059a1070
 [] kthread+0x150/0x180
        sp=e0000007059afdd0 bsp=e0000007059a1038
 [] call_payload+0x50/0x80
        sp=e0000007059afe30 bsp=e0000007059a1020
Unable to handle kernel NULL pointer dereference (address 0000000000000038)
kworker/u:3[273]: Oops 8813272891392 [2]
Modules linked in: raw snd_pcm_oss snd_mixer_oss snd_seq snd_seq_device nfsv3
 nfs_acl iptable_filter ip_tables x_tables nfs fscache dns_resolver lockd
 sunrpc cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq
 binfmt_misc fuse nls_iso8859_1 loop ipmi_si ipmi_devintf ipmi_msghandler
 dm_mod snd_hda_codec_hdmi snd_hda_intel igb snd_hda_codec snd_hwdep snd_pcm
 snd_timer iTCO_wdt iTCO_vendor_support snd ppdev soundcore serio_raw lpc_ich
 mfd_core snd_page_alloc sg ehci_pci mptctl ptp pps_core i2c_i801 parport_pc
 i2c_core hid_generic parport container button usbhid hid uhci_hcd ehci_hcd
 usbcore usb_common sd_mod crc_t10dif ext3 mbcache jbd fan processor
 ide_pci_generic ide_core mptsas mptscsih mptbase scsi_transport_sas ata_piix
 libata scsi_mod thermal thermal_sys hwmon

Pid: 273, CPU 29, comm: kworker/u:3
psr : 0000101008022038 ifs : 8000000000000309 ip : [] Tainted: G B D (3.8.0-rc2-pci-bind)
ip is at wq_worker_sleeping+0x30/0x180
unat: 0000000000000000 pfs : 0000000000000309 rsc : 0000000000000003
rnat: 000000000000040e bsps: 0000000000000003 pr : 000565501552a5d5
ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c8a70033f
csd : 0000000000000000 ssd : 0000000000000000
b0 : a0000001000c21a0 b6 : a0000001000fdc80 b7 : a0000001000ffbe0
f6 : 0ffefaec33e1f63409a90 f7 : 0fff1ed2d4e22a0000000
f8 : 10017a916000000000000 f9 : 1000ebb80000000000000
f10 : 10007e6dbd1941e705b2d f11 : 1003e00000000000001cd
r1 : a0000001015ae780 r2 : 0000000000000000 r3 : 0000000000000038
r8 : 0000000000000000 r9 : 0000000000000000 r10 : e000001800206280
r11 : e0000018002063a0 r12 : e0000007059afb60 r13 : e0000007059a0000
r14 : ffffffffffffffd8 r15 : e0000018002062f4 r16 : 0000315801ec75e5
r17 : e000001800206bd0 r18 : e0000018002063a0 r19 : 000000000315801e
r20 : e000001800206360 r21 : a0000001014fb630 r22 : e0000018002062e0
r23 : a000000101b2cb88 r24 : e0000007059a0070 r25 : e000001800206b40
r26 : 00000000000001cc r27 : 000000000000bb80 r28 : 000000000000bb7f
r29 : 000000000420806c r30 : e0000007059a0014 r31 : 000000000000b9dd

Call Trace:
 [] show_stack+0x80/0xa0
        sp=e0000007059af720 bsp=e0000007059a1740
 [] show_regs+0x640/0x920
        sp=e0000007059af8f0 bsp=e0000007059a16e8
 [] die+0x190/0x2c0
        sp=e0000007059af900 bsp=e0000007059a16a8
 [] ia64_do_page_fault+0x9b0/0xc00
        sp=e0000007059af900 bsp=e0000007059a1618
 [] ia64_native_leave_kernel+0x0/0x270
        sp=e0000007059af990 bsp=e0000007059a1618
 [] wq_worker_sleeping+0x30/0x180
        sp=e0000007059afb60 bsp=e0000007059a15c8
 [] __schedule+0x14f0/0x16c0
        sp=e0000007059afb60 bsp=e0000007059a1458
 [] schedule+0x60/0x140
        sp=e0000007059afb70 bsp=e0000007059a1400
 [] do_exit+0x6d0/0xc20
        sp=e0000007059afb70 bsp=e0000007059a13a0
 [] die+0x260/0x2c0
        sp=e0000007059afb70 bsp=e0000007059a1360
 [] ia64_do_page_fault+0xbd0/0xc00
        sp=e0000007059afb70 bsp=e0000007059a12d0
 [] ia64_native_leave_kernel+0x0/0x270
        sp=e0000007059afc00 bsp=e0000007059a12d0
 [] pci_destroy_dev+0x60/0x160
        sp=e0000007059afdd0 bsp=e0000007059a1298
 [] pci_remove_bus_device+0xc0/0xe0
        sp=e0000007059afdd0 bsp=e0000007059a1258
 [] pci_stop_and_remove_bus_device+0x30/0x60
        sp=e0000007059afdd0 bsp=e0000007059a1238
 [] remove_callback+0xf0/0x1c0
        sp=e0000007059afdd0 bsp=e0000007059a1208
 [] sysfs_schedule_callback_work+0x50/0x120
        sp=e0000007059afdd0 bsp=e0000007059a11d0
 [] process_one_work+0x520/0xa80
        sp=e0000007059afdd0 bsp=e0000007059a1140
 [] worker_thread+0x330/0xde0
        sp=e0000007059afdd0 bsp=e0000007059a1070
 [] kthread+0x150/0x180
        sp=e0000007059afdd0 bsp=e0000007059a1038
 [] call_payload+0x50/0x80
        sp=e0000007059afe30 bsp=e0000007059a1020
Fixing recursive fault but reboot is needed!

I hope this patch can fix your problem too.

>
> Thanks
> Gu
>
>>
>> Thanks
>>
>> Yinghai
>>
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>

From ba405b9ea86d8ebd4fd9754aef67d986b0835f9a Mon Sep 17 00:00:00 2001
From: Yijing Wang
Date: Thu, 28 Feb 2013 19:51:40 +0800
Subject: [PATCH] PCI: check device is_added flag in remove_callback()

Currently, remove_store() uses the device_schedule_callback() mechanism
to remove a device: it queues remove_callback() onto sysfs_workqueue.
If the device is removed through another interface such as acpiphp/pciehp
between the device_schedule_callback() call and the execution of
remove_callback(), remove_callback() removes an already-removed device
and the system may panic.

This patch adds an is_added check to remove_callback() so that a removed
device is not removed again.

+-07.0-[0000:05]--+-00.0  nVidia Corporation GT218 [GeForce G210]
|                 \-00.1  nVidia Corporation High Definition Audio Controller

#echo 1 > /sys/bus/pci/devices/0000:05:00.0/remove
#echo 0 > /sys/bus/pci/slots/0/power
(address: 0000:05:00, slot attached to 0000:00:07.0)

Signed-off-by: Yijing Wang
---
 drivers/pci/pci-sysfs.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 9c6e9bb..6b77133 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -331,7 +331,8 @@ static void remove_callback(struct device *dev)
 	struct pci_dev *pdev = to_pci_dev(dev);
 
 	mutex_lock(&pci_remove_rescan_mutex);
-	pci_stop_and_remove_bus_device(pdev);
+	if (pdev->is_added)
+		pci_stop_and_remove_bus_device(pdev);
 	mutex_unlock(&pci_remove_rescan_mutex);
 }
 
-- 
1.7.1