diff mbox series

PCI: take the rescan lock when adding devices during host probe

Message ID 20240926130924.36409-1-brgl@bgdev.pl (mailing list archive)
State Superseded
Headers show
Series PCI: take the rescan lock when adding devices during host probe | expand

Commit Message

Bartosz Golaszewski Sept. 26, 2024, 1:09 p.m. UTC
From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>

Since adding the PCI power control code, we may end up with a race
between the pwrctl platform device rescanning the bus and the host
controller probe function. The latter needs to take the rescan lock when
adding devices or may crash.

Reported-by: Konrad Dybcio <konradybcio@kernel.org>
Fixes: 4565d2652a37 ("PCI/pwrctl: Add PCI power control core code")
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
---
 drivers/pci/probe.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Konrad Dybcio Sept. 30, 2024, 4:09 p.m. UTC | #1
On 26.09.2024 3:09 PM, Bartosz Golaszewski wrote:
> From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> 
> Since adding the PCI power control code, we may end up with a race
> between the pwrctl platform device rescanning the bus and the host
> controller probe function. The latter needs to take the rescan lock when
> adding devices or may crash.
> 
> Reported-by: Konrad Dybcio <konradybcio@kernel.org>
> Fixes: 4565d2652a37 ("PCI/pwrctl: Add PCI power control core code")
> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> ---

Tested-by: Konrad Dybcio <konradybcio@kernel.org>

Konrad
Bjorn Helgaas Oct. 1, 2024, 9:11 p.m. UTC | #2
On Thu, Sep 26, 2024 at 03:09:23PM +0200, Bartosz Golaszewski wrote:
> From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> 
> Since adding the PCI power control code, we may end up with a race
> between the pwrctl platform device rescanning the bus and the host
> controller probe function. The latter needs to take the rescan lock when
> adding devices or may crash.
> 
> Reported-by: Konrad Dybcio <konradybcio@kernel.org>
> Fixes: 4565d2652a37 ("PCI/pwrctl: Add PCI power control core code")
> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> ---
>  drivers/pci/probe.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index 4f68414c3086..f1615805f5b0 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -3105,7 +3105,9 @@ int pci_host_probe(struct pci_host_bridge *bridge)
>  	list_for_each_entry(child, &bus->children, node)
>  		pcie_bus_configure_settings(child);
>  
> +	pci_lock_rescan_remove();
>  	pci_bus_add_devices(bus);
> +	pci_unlock_rescan_remove();

Seems like we do need locking here, but don't we need a more
comprehensive change?  There are many other callers of
pci_bus_add_devices(), and most of them look similarly unprotected.

>  	return 0;
>  }
>  EXPORT_SYMBOL_GPL(pci_host_probe);
> -- 
> 2.30.2
>
Bartosz Golaszewski Oct. 2, 2024, 8:36 a.m. UTC | #3
On Tue, Oct 1, 2024 at 11:11 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Thu, Sep 26, 2024 at 03:09:23PM +0200, Bartosz Golaszewski wrote:
> > From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> >
> > Since adding the PCI power control code, we may end up with a race
> > between the pwrctl platform device rescanning the bus and the host
> > controller probe function. The latter needs to take the rescan lock when
> > adding devices or may crash.
> >
> > Reported-by: Konrad Dybcio <konradybcio@kernel.org>
> > Fixes: 4565d2652a37 ("PCI/pwrctl: Add PCI power control core code")
> > Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> > ---
> >  drivers/pci/probe.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> > index 4f68414c3086..f1615805f5b0 100644
> > --- a/drivers/pci/probe.c
> > +++ b/drivers/pci/probe.c
> > @@ -3105,7 +3105,9 @@ int pci_host_probe(struct pci_host_bridge *bridge)
> >       list_for_each_entry(child, &bus->children, node)
> >               pcie_bus_configure_settings(child);
> >
> > +     pci_lock_rescan_remove();
> >       pci_bus_add_devices(bus);
> > +     pci_unlock_rescan_remove();
>
> Seems like we do need locking here, but don't we need a more
> comprehensive change?  There are many other callers of
> pci_bus_add_devices(), and most of them look similarly unprotected.
>

From a quick glance it looks like the majority of users are specific
drivers (controller, hotplug, etc.). The calls inside pci_rescan_bus()
and pci_rescan_bus_bridge_resize() are already protected from what I
can tell. I'm not saying that the driver calls shouldn't be fixed but
there's no immediate danger. This however fixes an issue we hit with
PCI core so I'd send it upstream now and then we can think about the
other use-cases.

Bart
Konrad Dybcio Oct. 2, 2024, 8:31 p.m. UTC | #4
On 2.10.2024 10:36 AM, Bartosz Golaszewski wrote:
> On Tue, Oct 1, 2024 at 11:11 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
>>
>> On Thu, Sep 26, 2024 at 03:09:23PM +0200, Bartosz Golaszewski wrote:
>>> From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
>>>
>>> Since adding the PCI power control code, we may end up with a race
>>> between the pwrctl platform device rescanning the bus and the host
>>> controller probe function. The latter needs to take the rescan lock when
>>> adding devices or may crash.
>>>
>>> Reported-by: Konrad Dybcio <konradybcio@kernel.org>
>>> Fixes: 4565d2652a37 ("PCI/pwrctl: Add PCI power control core code")
>>> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
>>> ---
>>>  drivers/pci/probe.c | 2 ++
>>>  1 file changed, 2 insertions(+)
>>>
>>> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
>>> index 4f68414c3086..f1615805f5b0 100644
>>> --- a/drivers/pci/probe.c
>>> +++ b/drivers/pci/probe.c
>>> @@ -3105,7 +3105,9 @@ int pci_host_probe(struct pci_host_bridge *bridge)
>>>       list_for_each_entry(child, &bus->children, node)
>>>               pcie_bus_configure_settings(child);
>>>
>>> +     pci_lock_rescan_remove();
>>>       pci_bus_add_devices(bus);
>>> +     pci_unlock_rescan_remove();
>>
>> Seems like we do need locking here, but don't we need a more
>> comprehensive change?  There are many other callers of
>> pci_bus_add_devices(), and most of them look similarly unprotected.
>>
> 
> From a quick glance it looks like the majority of users are specific
> drivers (controller, hotplug, etc.). The calls inside pci_rescan_bus()
> and pci_rescan_bus_bridge_resize() are already protected from what I
> can tell. I'm not saying that the driver calls shouldn't be fixed but
> there's no immediate danger. This however fixes an issue we hit with
> PCI core so I'd send it upstream now and then we can think about the
> other use-cases.

Probably worth showing an example of how this can manifest:

removed a device through sysfs and called bus rescan:

[   63.851901] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[   63.861048] Mem abort info:
[   63.863940]   ESR = 0x0000000096000004
[   63.867822]   EC = 0x25: DABT (current EL), IL = 32 bits
[   63.873291]   SET = 0, FnV = 0
[   63.876440]   EA = 0, S1PTW = 0
[   63.879688]   FSC = 0x04: level 0 translation fault
[   63.884711] Data abort info:
[   63.887697]   ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
[   63.893344]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[   63.898549]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[   63.904009] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000116c36000
[   63.910644] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000
[   63.917683] Internal error: Oops: 0000000096000004 [#1] SMP
[   63.923413] Modules linked in:
[   63.926561] CPU: 1 UID: 0 PID: 530 Comm: bash Tainted: G        W          <snip> #10176
[   63.938971] Tainted: [W]=WARN
[   63.942037] Hardware name: <snip>
[   63.951239] pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   63.958398] pc : __pi_strlen+0x14/0x150
[   63.962350] lr : kernfs_name_hash+0x24/0x88
[   63.966668] sp : ffff800083c43af0
[   63.970089] x29: ffff800083c43af0 x28: ffff519888b83500 x27: 0000000000000000
[   63.977420] x26: 0000000000000000 x25: 0000000000000000 x24: ffff519888b83500
[   63.984751] x23: 0000000000000011 x22: ffff519886bd6040 x21: ffff519886bd5b00
[   63.992081] x20: 0000000000000000 x19: 0000000000000000 x18: ffff80008a0bd0a8
[   63.999410] x17: 0000000000000001 x16: ffff519888b83de8 x15: ffffa08dea3f5bf0
[   64.006741] x14: ffff519888b83de8 x13: ffffffffffffffff x12: 0000000000000028
[   64.014076] x11: ffffa08dea3f5bf0 x10: 0000000000000012 x9 : 0000000000000001
[   64.021403] x8 : 0101010101010101 x7 : ffffa08de7a5e844 x6 : 0000000000000000
[   64.028730] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000000
[   64.036062] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000000
[   64.043390] Call trace:
[   64.045918]  __pi_strlen+0x14/0x150
[   64.049508]  kernfs_find_ns+0x80/0x13c
[   64.053375]  kernfs_remove_by_name_ns+0x54/0xf0
[   64.058036]  sysfs_remove_bin_file+0x24/0x34
[   64.062436]  pci_remove_resource_files+0x3c/0x84
[   64.067190]  pci_remove_sysfs_dev_files+0x28/0x38
[   64.072025]  pci_stop_bus_device+0x8c/0xd8
[   64.076241]  pci_stop_bus_device+0x40/0xd8
[   64.080463]  pci_stop_and_remove_bus_device_locked+0x28/0x48
[   64.086277]  remove_store+0x70/0xb0
[   64.089878]  dev_attr_store+0x20/0x38
[   64.093649]  sysfs_kf_write+0x58/0x78
[   64.097426]  kernfs_fop_write_iter+0xe8/0x184
[   64.101905]  vfs_write+0x2dc/0x308
[   64.105413]  ksys_write+0x7c/0xec
[   64.108834]  __arm64_sys_write+0x24/0x34
[   64.112880]  invoke_syscall+0x48/0x100
[   64.116744]  el0_svc_common+0xb4/0xe8
[   64.120522]  do_el0_svc+0x24/0x34
[   64.123938]  el0_svc+0x58/0xb4
[   64.127085]  el0t_64_sync_handler+0x50/0x12c
[   64.131474]  el0t_64_sync+0x198/0x19c
[   64.135244] Code: 92402c04 b200c3e8 f13fc09f 5400088c (a9400c02) 
[   64.141508] ---[ end trace 0000000000000000 ]---

Konrad
Bjorn Helgaas Oct. 2, 2024, 8:53 p.m. UTC | #5
On Wed, Oct 02, 2024 at 10:31:46PM +0200, Konrad Dybcio wrote:
> On 2.10.2024 10:36 AM, Bartosz Golaszewski wrote:
> > On Tue, Oct 1, 2024 at 11:11 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
> >>
> >> On Thu, Sep 26, 2024 at 03:09:23PM +0200, Bartosz Golaszewski wrote:
> >>> From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> >>>
> >>> Since adding the PCI power control code, we may end up with a race
> >>> between the pwrctl platform device rescanning the bus and the host
> >>> controller probe function. The latter needs to take the rescan lock when
> >>> adding devices or may crash.
> >>>
> >>> Reported-by: Konrad Dybcio <konradybcio@kernel.org>
> >>> Fixes: 4565d2652a37 ("PCI/pwrctl: Add PCI power control core code")
> >>> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> >>> ---
> >>>  drivers/pci/probe.c | 2 ++
> >>>  1 file changed, 2 insertions(+)
> >>>
> >>> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> >>> index 4f68414c3086..f1615805f5b0 100644
> >>> --- a/drivers/pci/probe.c
> >>> +++ b/drivers/pci/probe.c
> >>> @@ -3105,7 +3105,9 @@ int pci_host_probe(struct pci_host_bridge *bridge)
> >>>       list_for_each_entry(child, &bus->children, node)
> >>>               pcie_bus_configure_settings(child);
> >>>
> >>> +     pci_lock_rescan_remove();
> >>>       pci_bus_add_devices(bus);
> >>> +     pci_unlock_rescan_remove();
> >>
> >> Seems like we do need locking here, but don't we need a more
> >> comprehensive change?  There are many other callers of
> >> pci_bus_add_devices(), and most of them look similarly unprotected.
> > 
> > From a quick glance it looks like the majority of users are specific
> > drivers (controller, hotplug, etc.). The calls inside pci_rescan_bus()
> > and pci_rescan_bus_bridge_resize() are already protected from what I
> > can tell. I'm not saying that the driver calls shouldn't be fixed but
> > there's no immediate danger. This however fixes an issue we hit with
> > PCI core so I'd send it upstream now and then we can think about the
> > other use-cases.

Agreed that all current callers of pci_rescan_bus() and
pci_rescan_bus_bridge_resize() already do their own locking.  Most of
the hotplug drivers that use pci_bus_add_devices() do their own
locking as well.

pci_host_probe() is used by many controller drivers, but my guess is
that a dozen or so controller drivers that use pci_bus_add_devices()
directly without locking are still at risk.  It's sort of an
unfinished project to convert drivers like this over to using
pci_host_probe().

In the meantime, I wish we had a safer interface that could enforce
the locking internally.

> Probably worth showing an example of how this can manifest:
> 
> removed a device through sysfs and called bus rescan:

Thanks for this; I was about to ask for it!  I don't think we need
*all* the details, but something like the following might help people
recognize if we trip over another instance:

> Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
> Internal error: Oops: 0000000096000004 [#1] SMP
> Call trace:
>  __pi_strlen+0x14/0x150
>  kernfs_find_ns+0x80/0x13c
>  kernfs_remove_by_name_ns+0x54/0xf0
>  sysfs_remove_bin_file+0x24/0x34
>  pci_remove_resource_files+0x3c/0x84
>  pci_remove_sysfs_dev_files+0x28/0x38
>  pci_stop_bus_device+0x8c/0xd8
>  pci_stop_bus_device+0x40/0xd8
>  pci_stop_and_remove_bus_device_locked+0x28/0x48
>  remove_store+0x70/0xb0
>  dev_attr_store+0x20/0x38
>  sysfs_kf_write+0x58/0x78
>  kernfs_fop_write_iter+0xe8/0x184
>  vfs_write+0x2dc/0x308

Bjorn
diff mbox series

Patch

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 4f68414c3086..f1615805f5b0 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -3105,7 +3105,9 @@  int pci_host_probe(struct pci_host_bridge *bridge)
 	list_for_each_entry(child, &bus->children, node)
 		pcie_bus_configure_settings(child);
 
+	pci_lock_rescan_remove();
 	pci_bus_add_devices(bus);
+	pci_unlock_rescan_remove();
 	return 0;
 }
 EXPORT_SYMBOL_GPL(pci_host_probe);