diff mbox

PCI/ASPM: Don't remove pcie_link_state until we stop the last device

Message ID 1438229360-370-1-git-send-email-wangyijing@huawei.com (mailing list archive)
State New, archived
Delegated to: Bjorn Helgaas
Headers show

Commit Message

Yijing Wang July 30, 2015, 4:09 a.m. UTC
Now we stop the pci_bus->devices in reverse order, but in
pcie_aspm_exit_link_state(), we only would do something when
the device is the last one.

void pcie_aspm_exit_link_state(struct pci_dev *pdev)
{
	...
	if (!list_is_last(&pdev->bus_list, &parent->subordinate->devices))
		goto out;
	...
}

So if we have the following pcie tree, system may crash.

[b7-bd]--+-02.0-[bb-bd]--+-00.0-[bc-bd]----01.0-[bd]----00.0  PLX Technology, Inc. Device 0002
                         +-00.1  PLX Technology, Inc. Device 0002
                         +-00.2  PLX Technology, Inc. Device 0002
                         +-00.3  PLX Technology, Inc. Device 0002
                         \-00.4  PLX Technology, Inc. Device 0002

In this case, we would stop bb:00.4 before bb:00.0, so when we touch bb:00.4,
we would call pcie_aspm_exit_link_state(), and free the pcie_link_state.
So when we want to stop bd:00.0 and free related pcie_link_state,
it would try to access the parent pcie_link_state which has been freed.

Part crash call trace:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
CPU 16 Pid: 33262, comm: IVS_PowerOn
RIP: 0010:[<ffffffffa0d7c14f>]  [<ffffffffa0d7c14f>] pcie_config_aspm_link+0x3f/0x100
RSP: 0018:ffff8801bc577790  EFLAGS: 00010282
RAX: 0000000000000000 RBX: 0000000000000001 RCX: 000000000000e7e6
RDX: 000000000000e6e6 RSI: 00000000ffffc5ec RDI: 0000000000000246
RBP: ffff8801bc5777d0 R08: ffff88007b001000 R09: 00000000003fffff
...
Call Trace:
 [<ffffffff8124a542>] pcie_config_aspm_path+0x32/0x60
 [<ffffffffa0d7cc00>] pcie_aspm_exit_link_state+0x160/0x560
 [<ffffffffa0d7c0bc>] pci_stop_bus_device+0x8c/0xe0
 [<ffffffffa0d7c068>] pci_stop_bus_device+0x38/0xe0
 [<ffffffffa0d7c068>] pci_stop_bus_device+0x38/0xe0
 [<ffffffffa0d7c068>] pci_stop_bus_device+0x38/0xe0
 [<ffffffffa0d7c068>] pci_stop_bus_device+0x38/0xe0
 [<ffffffff8123eca1>] pci_stop_and_remove_bus_device+0x11/0x20
...

Signed-off-by: Yijing Wang <wangyijing@huawei.com>
CC: stable@vger.kernel.org #3.4+
---
 drivers/pci/pcie/aspm.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

Comments

Bjorn Helgaas Aug. 29, 2015, 12:20 p.m. UTC | #1
Hi Yijing,

On Thu, Jul 30, 2015 at 12:09:20PM +0800, Yijing Wang wrote:
> Now we stop the pci_bus->devices in reverse order, but in
> pcie_aspm_exit_link_state(), we only would do something when
> the device is the last one.
> 
> void pcie_aspm_exit_link_state(struct pci_dev *pdev)
> {
> 	...
> 	if (!list_is_last(&pdev->bus_list, &parent->subordinate->devices))

Ugh.  This was caused by a confusion between two different meanings of
"last":

  1) the element at the end of the list, and
  2) the only remaining element in the list

3419c75e15f8 ("PCI: properly clean up ASPM link state on device remove"),
which added this line, clearly intended the second, but list_is_last()
implements the first.

But that's a trivial problem.  I think the real problem is that the way we
manage ASPM link_state is a complete disaster.  I want to make steps toward
cleaning that up rather than apply band-aids to a broken design.

I struggled to understand this, so I'm going to ramble a bit to see if I
understand the problem correctly.  Your hierarchy is this:

  b7:02.0 bridge to [bus bb-bd]  Downstream Port; ASPM on Link to bus bb
  bb:00.0 bridge to [bus bc-bd]  Switch Upstream Port; no ASPM
  bb:00.1 endpoint
  bb:00.2 endpoint
  bb:00.3 endpoint
  bb:00.4 endpoint
  bc:01.0 bridge to [bus bd]     Switch Downstream Port; ASPM on Link to bus bd
  bd:00.0 endpoint

There are only two Links in this picture:

  1) from b7:02.0 to bb:00.0
  2) from bc:01.0 to bd:00.0

Those are the two Links where ASPM is important.  Bus bc is the switch's
internal bus, so the connection from bb:00.0 to bc:01.0 is not a Link and
ASPM is not applicable.

Both ends of the Link participate in ASPM, but we allocate ASPM link_state
only for the component on the *upstream* end of a Link.  We do the
allocation during enumeration, like this:

  pcie_aspm_init_link_state(pdev=b7:02.0)
    alloc_pcie_link_state(pdev=b7:02.0)
      link = kzalloc(...)
      link->pdev = pdev                       # b7:02.0
      pdev->link_state = link                 # alloc link_state for link #1

  pcie_aspm_init_link_state(pdev=bc:01.0)
    alloc_pcie_link_state(pdev=bc:01.0)
      link = kzalloc(...)
      link->pdev = pdev                       # bc:01.0
      link->parent = pdev->bus->parent->self->link_state      # b7:02.0 link_state
      pdev->link_state = link                 # alloc link_state for link #2

The allocation path makes sense, at least in the sense that we allocate
link_state for device X when we enumerate device X.  Now we remove the tree
rooted at b7:02.0:

  pci_stop_bus_device(pdev=b7:02.0)
    pci_stop_bus_device(pdev=bb:00.4)         # iterate in reverse
      pci_stop_dev(pdev=bb:00.4)
        pcie_aspm_exit_link_state(pdev=bb:00.4)
          parent = pdev->bus->self            # parent=b7:02.0
          link = parent->link_state
          free_link_state(link)               # b7:02.0 link_state
            link->pdev->link_state = NULL
  A         kfree(link)                       # free link_state for #1
    pci_stop_bus_device(pdev=bb:00.3)
      pci_stop_dev(pdev=bb:00.3)
        pcie_aspm_exit_link_state(pdev=bb:00.3)
          parent = pdev->bus->self            # parent=b7:02.0
          return                              # parent->link_state == NULL
    ...
    pci_stop_bus_device(pdev=bb:00.0)
      pci_stop_bus_device(pdev=bc:01.0)
        pci_stop_bus_device(pdev=bd:00.0)
          pci_stop_dev(pdev=bd:00.0)
            pcie_aspm_exit_link_state(pdev=bd:00.0)
              parent = pdev->bus->self        # parent=bc:01.0
              link = parent->link_state       # bc:01.0 link_state
              parent_link = link->parent      # b7:02.0 link_state
              free_link_state(link)           # bc:01.0 link_state
  B             kfree(link)                   # free link_state for #2
  C           pcie_config_aspm_path(b7:02.0 link_state)   # use link_state for #1

At "C", we try to use the b7:02.0 link_state, which we've already
deallocated at "A", so this is a "use-after-free" problem.

What seems wrong to me is that when we're removing device X, we free the
link_state for a *parent* of X.  I think the code would be much simpler and
easier to get right if we freed the link_state for X when we remove X.

Can you look at fixing the problem that way?

> 		goto out;
> 	...
> }
> 
> So if we have the following pcie tree, system may crash.
> 
> [b7-bd]--+-02.0-[bb-bd]--+-00.0-[bc-bd]----01.0-[bd]----00.0  PLX Technology, Inc. Device 0002
>                          +-00.1  PLX Technology, Inc. Device 0002
>                          +-00.2  PLX Technology, Inc. Device 0002
>                          +-00.3  PLX Technology, Inc. Device 0002
>                          \-00.4  PLX Technology, Inc. Device 0002
> 
> In this case, we would stop bb:00.4 before bb:00.0, so when we touch bb:00.4,
> we would call pcie_aspm_exit_link_state(), and free the pcie_link_state.
> So when we want to stop bd:00.0 and free related pcie_link_state,
> it would try to access the parent pcie_link_state which has been freed.
> 
> Part crash call trace:
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
> CPU 16 Pid: 33262, comm: IVS_PowerOn
> RIP: 0010:[<ffffffffa0d7c14f>]  [<ffffffffa0d7c14f>] pcie_config_aspm_link+0x3f/0x100
> RSP: 0018:ffff8801bc577790  EFLAGS: 00010282
> RAX: 0000000000000000 RBX: 0000000000000001 RCX: 000000000000e7e6
> RDX: 000000000000e6e6 RSI: 00000000ffffc5ec RDI: 0000000000000246
> RBP: ffff8801bc5777d0 R08: ffff88007b001000 R09: 00000000003fffff
> ...
> Call Trace:
>  [<ffffffff8124a542>] pcie_config_aspm_path+0x32/0x60
>  [<ffffffffa0d7cc00>] pcie_aspm_exit_link_state+0x160/0x560
>  [<ffffffffa0d7c0bc>] pci_stop_bus_device+0x8c/0xe0
>  [<ffffffffa0d7c068>] pci_stop_bus_device+0x38/0xe0
>  [<ffffffffa0d7c068>] pci_stop_bus_device+0x38/0xe0
>  [<ffffffffa0d7c068>] pci_stop_bus_device+0x38/0xe0
>  [<ffffffffa0d7c068>] pci_stop_bus_device+0x38/0xe0
>  [<ffffffff8123eca1>] pci_stop_and_remove_bus_device+0x11/0x20
> ...
> 
> Signed-off-by: Yijing Wang <wangyijing@huawei.com>
> CC: stable@vger.kernel.org #3.4+

I need a clue about why you picked v3.4 here.  Is it because ac205b7bb72f
("PCI: make sriov work with hotplug remove") appeared in v3.4?

Bjorn

> ---
>  drivers/pci/pcie/aspm.c |    3 ++-
>  1 files changed, 2 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> index 317e355..c81f549 100644
> --- a/drivers/pci/pcie/aspm.c
> +++ b/drivers/pci/pcie/aspm.c
> @@ -648,7 +648,8 @@ void pcie_aspm_exit_link_state(struct pci_dev *pdev)
>  	 * All PCIe functions are in one slot, remove one function will remove
>  	 * the whole slot, so just wait until we are the last function left.
>  	 */
> -	if (!list_is_last(&pdev->bus_list, &parent->subordinate->devices))
> +	if (!(pdev == list_first_entry(&parent->subordinate->devices,
> +					struct pci_dev, bus_list)))
>  		goto out;
>  
>  	link = parent->link_state;
> -- 
> 1.7.1
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yijing Wang Aug. 31, 2015, 1:24 a.m. UTC | #2
? 2015/8/29 20:20, Bjorn Helgaas ??:
> Hi Yijing,
> 
> On Thu, Jul 30, 2015 at 12:09:20PM +0800, Yijing Wang wrote:
>> Now we stop the pci_bus->devices in reverse order, but in
>> pcie_aspm_exit_link_state(), we only would do something when
>> the device is the last one.
>>
>> void pcie_aspm_exit_link_state(struct pci_dev *pdev)
>> {
>> 	...
>> 	if (!list_is_last(&pdev->bus_list, &parent->subordinate->devices))
> 
> Ugh.  This was caused by a confusion between two different meanings of
> "last":
> 
>   1) the element at the end of the list, and
>   2) the only remaining element in the list
> 
> 3419c75e15f8 ("PCI: properly clean up ASPM link state on device remove"),
> which added this line, clearly intended the second, but list_is_last()
> implements the first.
> 
> But that's a trivial problem.  I think the real problem is that the way we
> manage ASPM link_state is a complete disaster.  I want to make steps toward
> cleaning that up rather than apply band-aids to a broken design.
> 
> I struggled to understand this, so I'm going to ramble a bit to see if I
> understand the problem correctly.  Your hierarchy is this:
> 
>   b7:02.0 bridge to [bus bb-bd]  Downstream Port; ASPM on Link to bus bb
>   bb:00.0 bridge to [bus bc-bd]  Switch Upstream Port; no ASPM
>   bb:00.1 endpoint
>   bb:00.2 endpoint
>   bb:00.3 endpoint
>   bb:00.4 endpoint
>   bc:01.0 bridge to [bus bd]     Switch Downstream Port; ASPM on Link to bus bd
>   bd:00.0 endpoint
> 
> There are only two Links in this picture:
> 
>   1) from b7:02.0 to bb:00.0
>   2) from bc:01.0 to bd:00.0
> 
> Those are the two Links where ASPM is important.  Bus bc is the switch's
> internal bus, so the connection from bb:00.0 to bc:01.0 is not a Link and
> ASPM is not applicable.
> 
> Both ends of the Link participate in ASPM, but we allocate ASPM link_state
> only for the component on the *upstream* end of a Link.  We do the
> allocation during enumeration, like this:
> 
>   pcie_aspm_init_link_state(pdev=b7:02.0)
>     alloc_pcie_link_state(pdev=b7:02.0)
>       link = kzalloc(...)
>       link->pdev = pdev                       # b7:02.0
>       pdev->link_state = link                 # alloc link_state for link #1
> 
>   pcie_aspm_init_link_state(pdev=bc:01.0)
>     alloc_pcie_link_state(pdev=bc:01.0)
>       link = kzalloc(...)
>       link->pdev = pdev                       # bc:01.0
>       link->parent = pdev->bus->parent->self->link_state      # b7:02.0 link_state
>       pdev->link_state = link                 # alloc link_state for link #2
> 
> The allocation path makes sense, at least in the sense that we allocate
> link_state for device X when we enumerate device X.  Now we remove the tree
> rooted at b7:02.0:
> 
>   pci_stop_bus_device(pdev=b7:02.0)
>     pci_stop_bus_device(pdev=bb:00.4)         # iterate in reverse
>       pci_stop_dev(pdev=bb:00.4)
>         pcie_aspm_exit_link_state(pdev=bb:00.4)
>           parent = pdev->bus->self            # parent=b7:02.0
>           link = parent->link_state
>           free_link_state(link)               # b7:02.0 link_state
>             link->pdev->link_state = NULL
>   A         kfree(link)                       # free link_state for #1
>     pci_stop_bus_device(pdev=bb:00.3)
>       pci_stop_dev(pdev=bb:00.3)
>         pcie_aspm_exit_link_state(pdev=bb:00.3)
>           parent = pdev->bus->self            # parent=b7:02.0
>           return                              # parent->link_state == NULL
>     ...
>     pci_stop_bus_device(pdev=bb:00.0)
>       pci_stop_bus_device(pdev=bc:01.0)
>         pci_stop_bus_device(pdev=bd:00.0)
>           pci_stop_dev(pdev=bd:00.0)
>             pcie_aspm_exit_link_state(pdev=bd:00.0)
>               parent = pdev->bus->self        # parent=bc:01.0
>               link = parent->link_state       # bc:01.0 link_state
>               parent_link = link->parent      # b7:02.0 link_state
>               free_link_state(link)           # bc:01.0 link_state
>   B             kfree(link)                   # free link_state for #2
>   C           pcie_config_aspm_path(b7:02.0 link_state)   # use link_state for #1
> 
> At "C", we try to use the b7:02.0 link_state, which we've already
> deallocated at "A", so this is a "use-after-free" problem.

Yes, I agree with you.


> 
> What seems wrong to me is that when we're removing device X, we free the
> link_state for a *parent* of X.  I think the code would be much simpler and

What I am worried about here is if we hot remove a endpoint device here, and leave
the parent device, so we don't call pci_aspm_exit_link_state() for this link anymore ?

                  pcie link
downstream port ---------- endpoint device


> easier to get right if we freed the link_state for X when we remove X.
> 
> Can you look at fixing the problem that way?

I'm sorry, I don't have the platform now, this issue was found in product department, and they moved
the platform away.

> 
>> 		goto out;
>> 	...
>> }
>>
>> So if we have the following pcie tree, system may crash.
>>
>> [b7-bd]--+-02.0-[bb-bd]--+-00.0-[bc-bd]----01.0-[bd]----00.0  PLX Technology, Inc. Device 0002
>>                          +-00.1  PLX Technology, Inc. Device 0002
>>                          +-00.2  PLX Technology, Inc. Device 0002
>>                          +-00.3  PLX Technology, Inc. Device 0002
...
>> Signed-off-by: Yijing Wang <wangyijing@huawei.com>
>> CC: stable@vger.kernel.org #3.4+
> 
> I need a clue about why you picked v3.4 here.  Is it because ac205b7bb72f
> ("PCI: make sriov work with hotplug remove") appeared in v3.4?

Actually, this issue was found at v3.4 stable kernel, which was introduced in
3419c75e15f8 ("PCI: properly clean up ASPM link state on device remove") I think.

Thanks!
Yijing.


> 
> Bjorn
> 
>> ---
>>  drivers/pci/pcie/aspm.c |    3 ++-
>>  1 files changed, 2 insertions(+), 1 deletions(-)
>>
>> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
>> index 317e355..c81f549 100644
>> --- a/drivers/pci/pcie/aspm.c
>> +++ b/drivers/pci/pcie/aspm.c
>> @@ -648,7 +648,8 @@ void pcie_aspm_exit_link_state(struct pci_dev *pdev)
>>  	 * All PCIe functions are in one slot, remove one function will remove
>>  	 * the whole slot, so just wait until we are the last function left.
>>  	 */
>> -	if (!list_is_last(&pdev->bus_list, &parent->subordinate->devices))
>> +	if (!(pdev == list_first_entry(&parent->subordinate->devices,
>> +					struct pci_dev, bus_list)))
>>  		goto out;
>>  
>>  	link = parent->link_state;
>> -- 
>> 1.7.1
>>
> 
> .
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas Aug. 31, 2015, 1:56 p.m. UTC | #3
On Mon, Aug 31, 2015 at 09:24:56AM +0800, wangyijing wrote:
> ? 2015/8/29 20:20, Bjorn Helgaas ??:
> > On Thu, Jul 30, 2015 at 12:09:20PM +0800, Yijing Wang wrote:
> >> Now we stop the pci_bus->devices in reverse order, but in
> >> pcie_aspm_exit_link_state(), we only would do something when
> >> the device is the last one.
> >>
> >> void pcie_aspm_exit_link_state(struct pci_dev *pdev)
> >> {
> >> 	...
> >> 	if (!list_is_last(&pdev->bus_list, &parent->subordinate->devices))
> ...

> > What seems wrong to me is that when we're removing device X, we free the
> > link_state for a *parent* of X.  I think the code would be much simpler and
> 
> What I am worried about here is if we hot remove a endpoint device here, and leave
> the parent device, so we don't call pci_aspm_exit_link_state() for this link anymore ?
> 
>                   pcie link
> downstream port ---------- endpoint device

If we remove the endpoint and leave the parent, we do call
pcie_aspm_exit_link_state() for the endpoint.  I'm proposing to change
that path so that when the endpoint is removed, we do whatever is
necessary for changing the ASPM configuration on the link, but leave
the ASPM structure allocated, since the parent still exists.

Today we allocate link_state only when a port has a downstream link
*and* there's a device on the other end of the link.  I'm proposing
that we allocate link_state even if there's no downstream device.

Then the alloc/free path is always tied directly to the parent.  We
can still change ASPM configuration as needed when downstream devices
are added or removed.  It's not a trivial change, but it seems
possible.

> > easier to get right if we freed the link_state for X when we remove X.
> > 
> > Can you look at fixing the problem that way?
> 
> I'm sorry, I don't have the platform now, this issue was found in product department, and they moved
> the platform away.

OK.  I'll drop this patch for now.  I definitely agree this is a
problem, but the fact that you don't have the platform any more
doesn't mean we need to throw in a band-aid instead of designing a
better solution.

This problem should be pretty easy to reproduce on any platform,
possibly with a little bit of scaffolding to tweak the topology.

> >> 		goto out;
> >> 	...
> >> }
> >>
> >> So if we have the following pcie tree, system may crash.
> >>
> >> [b7-bd]--+-02.0-[bb-bd]--+-00.0-[bc-bd]----01.0-[bd]----00.0  PLX Technology, Inc. Device 0002
> >>                          +-00.1  PLX Technology, Inc. Device 0002
> >>                          +-00.2  PLX Technology, Inc. Device 0002
> >>                          +-00.3  PLX Technology, Inc. Device 0002
> ...
> >> Signed-off-by: Yijing Wang <wangyijing@huawei.com>
> >> CC: stable@vger.kernel.org #3.4+
> > 
> > I need a clue about why you picked v3.4 here.  Is it because ac205b7bb72f
> > ("PCI: make sriov work with hotplug remove") appeared in v3.4?
> 
> Actually, this issue was found at v3.4 stable kernel, which was introduced in
> 3419c75e15f8 ("PCI: properly clean up ASPM link state on device remove") I think.

Did you bisect to this or figure it out by analysis?  It's good to
mention the actual commit that broke it so people backporting can
figure out if they need the fix.

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yijing Wang Sept. 1, 2015, 12:59 a.m. UTC | #4
? 2015/8/31 21:56, Bjorn Helgaas ??:
> On Mon, Aug 31, 2015 at 09:24:56AM +0800, wangyijing wrote:
>> ? 2015/8/29 20:20, Bjorn Helgaas ??:
>>> On Thu, Jul 30, 2015 at 12:09:20PM +0800, Yijing Wang wrote:
>>>> Now we stop the pci_bus->devices in reverse order, but in
>>>> pcie_aspm_exit_link_state(), we only would do something when
>>>> the device is the last one.
>>>>
>>>> void pcie_aspm_exit_link_state(struct pci_dev *pdev)
>>>> {
>>>> 	...
>>>> 	if (!list_is_last(&pdev->bus_list, &parent->subordinate->devices))
>> ...
> 
>>> What seems wrong to me is that when we're removing device X, we free the
>>> link_state for a *parent* of X.  I think the code would be much simpler and
>>
>> What I am worried about here is if we hot remove a endpoint device here, and leave
>> the parent device, so we don't call pci_aspm_exit_link_state() for this link anymore ?
>>
>>                   pcie link
>> downstream port ---------- endpoint device
> 
> If we remove the endpoint and leave the parent, we do call
> pcie_aspm_exit_link_state() for the endpoint.  I'm proposing to change
> that path so that when the endpoint is removed, we do whatever is
> necessary for changing the ASPM configuration on the link, but leave
> the ASPM structure allocated, since the parent still exists.
> 
> Today we allocate link_state only when a port has a downstream link
> *and* there's a device on the other end of the link.  I'm proposing
> that we allocate link_state even if there's no downstream device.
> 
> Then the alloc/free path is always tied directly to the parent.  We
> can still change ASPM configuration as needed when downstream devices
> are added or removed.  It's not a trivial change, but it seems
> possible.
> 

Basically, I agree with you, but it's not a trival change, and now
we have no platform in hand, so I think it's better to leave it until we have
platform to reproduce it.


>>> easier to get right if we freed the link_state for X when we remove X.
>>>
>>> Can you look at fixing the problem that way?
>>
>> I'm sorry, I don't have the platform now, this issue was found in product department, and they moved
>> the platform away.
> 
> OK.  I'll drop this patch for now.  I definitely agree this is a
> problem, but the fact that you don't have the platform any more
> doesn't mean we need to throw in a band-aid instead of designing a
> better solution.

OK.

> 
> This problem should be pretty easy to reproduce on any platform,
> possibly with a little bit of scaffolding to tweak the topology.
> 
>>>> 		goto out;
>>>> 	...
>>>> }
>>>>
>>>> So if we have the following pcie tree, system may crash.
>>>>
>>>> [b7-bd]--+-02.0-[bb-bd]--+-00.0-[bc-bd]----01.0-[bd]----00.0  PLX Technology, Inc. Device 0002
>>>>                          +-00.1  PLX Technology, Inc. Device 0002
>>>>                          +-00.2  PLX Technology, Inc. Device 0002
>>>>                          +-00.3  PLX Technology, Inc. Device 0002
>> ...
>>>> Signed-off-by: Yijing Wang <wangyijing@huawei.com>
>>>> CC: stable@vger.kernel.org #3.4+
>>>
>>> I need a clue about why you picked v3.4 here.  Is it because ac205b7bb72f
>>> ("PCI: make sriov work with hotplug remove") appeared in v3.4?
>>
>> Actually, this issue was found at v3.4 stable kernel, which was introduced in
>> 3419c75e15f8 ("PCI: properly clean up ASPM link state on device remove") I think.
> 
> Did you bisect to this or figure it out by analysis?  It's good to
> mention the actual commit that broke it so people backporting can
> figure out if they need the fix.

By analysis, if I have the platform agagin, I would try to bisect it.

Thanks!
Yijing.


> 
> Bjorn
> 
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index 317e355..c81f549 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -648,7 +648,8 @@  void pcie_aspm_exit_link_state(struct pci_dev *pdev)
 	 * All PCIe functions are in one slot, remove one function will remove
 	 * the whole slot, so just wait until we are the last function left.
 	 */
-	if (!list_is_last(&pdev->bus_list, &parent->subordinate->devices))
+	if (!(pdev == list_first_entry(&parent->subordinate->devices,
+					struct pci_dev, bus_list)))
 		goto out;
 
 	link = parent->link_state;