diff mbox series

[net,v2,1/1] mlxbf_gige: Fix kernel panic at shutdown

Message ID 20230607140335.1512-1-asmaa@nvidia.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series [net,v2,1/1] mlxbf_gige: Fix kernel panic at shutdown | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 8 this patch: 8
netdev/cc_maintainers fail 1 blamed authors not CCed: limings@nvidia.com; 1 maintainers not CCed: limings@nvidia.com
netdev/build_clang success Errors and warnings before: 8 this patch: 8
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 8 this patch: 8
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 20 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Asmaa Mnebhi June 7, 2023, 2:03 p.m. UTC
There is a race condition happening during shutdown due to pending napi transactions.
Since mlxbf_gige_poll is still running, it tries to access a NULL pointer and as a
result causes a kernel panic.
To fix this during shutdown, invoke mlxbf_gige_remove to disable and dequeue napi.

Fixes: f92e1869d74e ("Add Mellanox BlueField Gigabit Ethernet driver")
Signed-off-by: Asmaa Mnebhi <asmaa@nvidia.com>
---
 .../net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c    | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

Comments

Samudrala, Sridhar June 8, 2023, 11:25 p.m. UTC | #1
On 6/7/2023 7:03 AM, Asmaa Mnebhi wrote:
> There is a race condition happening during shutdown due to pending napi transactions.
> Since mlxbf_gige_poll is still running, it tries to access a NULL pointer and as a
> result causes a kernel panic.
> To fix this during shutdown, invoke mlxbf_gige_remove to disable and dequeue napi.
> 
> Fixes: f92e1869d74e ("Add Mellanox BlueField Gigabit Ethernet driver")
> Signed-off-by: Asmaa Mnebhi <asmaa@nvidia.com>
> ---
>   .../net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c    | 8 ++++----
>   1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c b/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c
> index 694de9513b9f..609d038b034e 100644
> --- a/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c
> +++ b/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c
> @@ -475,6 +475,9 @@ static int mlxbf_gige_remove(struct platform_device *pdev)
>   {
>   	struct mlxbf_gige *priv = platform_get_drvdata(pdev);
>   
> +	if (!priv)
> +		return 0;
> +
>   	unregister_netdev(priv->netdev);
>   	phy_disconnect(priv->netdev->phydev);
>   	mlxbf_gige_mdio_remove(priv);
> @@ -485,10 +488,7 @@ static int mlxbf_gige_remove(struct platform_device *pdev)
>   
>   static void mlxbf_gige_shutdown(struct platform_device *pdev)
>   {
> -	struct mlxbf_gige *priv = platform_get_drvdata(pdev);
> -
> -	writeq(0, priv->base + MLXBF_GIGE_INT_EN);
> -	mlxbf_gige_clean_port(priv);
> +	mlxbf_gige_remove(pdev);
>   }

With this change, do you really need mlxbf_gige_shutdown() as a separate 
function as it is only calling mlxbf_gige_remove()?


>   
>   static const struct acpi_device_id __maybe_unused mlxbf_gige_acpi_match[] = {
Leon Romanovsky June 11, 2023, 6:11 p.m. UTC | #2
On Wed, Jun 07, 2023 at 10:03:35AM -0400, Asmaa Mnebhi wrote:
> There is a race condition happening during shutdown due to pending napi transactions.
> Since mlxbf_gige_poll is still running, it tries to access a NULL pointer and as a
> result causes a kernel panic.
> To fix this during shutdown, invoke mlxbf_gige_remove to disable and dequeue napi.
> 
> Fixes: f92e1869d74e ("Add Mellanox BlueField Gigabit Ethernet driver")
> Signed-off-by: Asmaa Mnebhi <asmaa@nvidia.com>
> ---
>  .../net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c    | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c b/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c
> index 694de9513b9f..609d038b034e 100644
> --- a/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c
> +++ b/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c
> @@ -475,6 +475,9 @@ static int mlxbf_gige_remove(struct platform_device *pdev)
>  {
>  	struct mlxbf_gige *priv = platform_get_drvdata(pdev);
>  
> +	if (!priv)
> +		return 0;
> +

How can this check be correct? You are removing mlxbf_gige driver, priv
should be always exist here.

>  	unregister_netdev(priv->netdev);
>  	phy_disconnect(priv->netdev->phydev);
>  	mlxbf_gige_mdio_remove(priv);
> @@ -485,10 +488,7 @@ static int mlxbf_gige_remove(struct platform_device *pdev)
>  
>  static void mlxbf_gige_shutdown(struct platform_device *pdev)
>  {
> -	struct mlxbf_gige *priv = platform_get_drvdata(pdev);
> -
> -	writeq(0, priv->base + MLXBF_GIGE_INT_EN);
> -	mlxbf_gige_clean_port(priv);
> +	mlxbf_gige_remove(pdev);
>  }
>  
>  static const struct acpi_device_id __maybe_unused mlxbf_gige_acpi_match[] = {
> -- 
> 2.30.1
> 
>
Fijalkowski, Maciej June 12, 2023, 11:34 a.m. UTC | #3
On Sun, Jun 11, 2023 at 09:11:25PM +0300, Leon Romanovsky wrote:
> On Wed, Jun 07, 2023 at 10:03:35AM -0400, Asmaa Mnebhi wrote:
> > There is a race condition happening during shutdown due to pending napi transactions.
> > Since mlxbf_gige_poll is still running, it tries to access a NULL pointer and as a
> > result causes a kernel panic.
> > To fix this during shutdown, invoke mlxbf_gige_remove to disable and dequeue napi.
> > 
> > Fixes: f92e1869d74e ("Add Mellanox BlueField Gigabit Ethernet driver")
> > Signed-off-by: Asmaa Mnebhi <asmaa@nvidia.com>
> > ---
> >  .../net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c    | 8 ++++----
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c b/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c
> > index 694de9513b9f..609d038b034e 100644
> > --- a/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c
> > +++ b/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c
> > @@ -475,6 +475,9 @@ static int mlxbf_gige_remove(struct platform_device *pdev)
> >  {
> >  	struct mlxbf_gige *priv = platform_get_drvdata(pdev);
> >  
> > +	if (!priv)
> > +		return 0;
> > +
> 
> How can this check be correct? You are removing mlxbf_gige driver, priv
> should be always exist here.

Asmaa please include v1->v2 diff next time.

Leon, look at v1 discussion:
https://lore.kernel.org/netdev/CH2PR12MB3895172507E1D42BBD5D4AB9D753A@CH2PR12MB3895.namprd12.prod.outlook.com/

> 
> >  	unregister_netdev(priv->netdev);
> >  	phy_disconnect(priv->netdev->phydev);
> >  	mlxbf_gige_mdio_remove(priv);
> > @@ -485,10 +488,7 @@ static int mlxbf_gige_remove(struct platform_device *pdev)
> >  
> >  static void mlxbf_gige_shutdown(struct platform_device *pdev)
> >  {
> > -	struct mlxbf_gige *priv = platform_get_drvdata(pdev);
> > -
> > -	writeq(0, priv->base + MLXBF_GIGE_INT_EN);
> > -	mlxbf_gige_clean_port(priv);
> > +	mlxbf_gige_remove(pdev);
> >  }
> >  
> >  static const struct acpi_device_id __maybe_unused mlxbf_gige_acpi_match[] = {
> > -- 
> > 2.30.1
> > 
> > 
>
Leon Romanovsky June 12, 2023, 11:59 a.m. UTC | #4
On Mon, Jun 12, 2023 at 01:34:49PM +0200, Maciej Fijalkowski wrote:
> On Sun, Jun 11, 2023 at 09:11:25PM +0300, Leon Romanovsky wrote:
> > On Wed, Jun 07, 2023 at 10:03:35AM -0400, Asmaa Mnebhi wrote:
> > > There is a race condition happening during shutdown due to pending napi transactions.
> > > Since mlxbf_gige_poll is still running, it tries to access a NULL pointer and as a
> > > result causes a kernel panic.
> > > To fix this during shutdown, invoke mlxbf_gige_remove to disable and dequeue napi.
> > > 
> > > Fixes: f92e1869d74e ("Add Mellanox BlueField Gigabit Ethernet driver")
> > > Signed-off-by: Asmaa Mnebhi <asmaa@nvidia.com>
> > > ---
> > >  .../net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c    | 8 ++++----
> > >  1 file changed, 4 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c b/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c
> > > index 694de9513b9f..609d038b034e 100644
> > > --- a/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c
> > > +++ b/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c
> > > @@ -475,6 +475,9 @@ static int mlxbf_gige_remove(struct platform_device *pdev)
> > >  {
> > >  	struct mlxbf_gige *priv = platform_get_drvdata(pdev);
> > >  
> > > +	if (!priv)
> > > +		return 0;
> > > +
> > 
> > How can this check be correct? You are removing mlxbf_gige driver, priv
> > should be always exist here.
> 
> Asmaa please include v1->v2 diff next time.
> 
> Leon, look at v1 discussion:
> https://lore.kernel.org/netdev/CH2PR12MB3895172507E1D42BBD5D4AB9D753A@CH2PR12MB3895.namprd12.prod.outlook.com/

Thanks for the link.

As far as I can tell, the calls to .shutdown() and .remove() are
mutually exclusive. It is impossible to go twice and reach scenario
which Paolo mentioned - double call to unregister_netdevice().

Thanks

> 
> > 
> > >  	unregister_netdev(priv->netdev);
> > >  	phy_disconnect(priv->netdev->phydev);
> > >  	mlxbf_gige_mdio_remove(priv);
> > > @@ -485,10 +488,7 @@ static int mlxbf_gige_remove(struct platform_device *pdev)
> > >  
> > >  static void mlxbf_gige_shutdown(struct platform_device *pdev)
> > >  {
> > > -	struct mlxbf_gige *priv = platform_get_drvdata(pdev);
> > > -
> > > -	writeq(0, priv->base + MLXBF_GIGE_INT_EN);
> > > -	mlxbf_gige_clean_port(priv);
> > > +	mlxbf_gige_remove(pdev);
> > >  }
> > >  
> > >  static const struct acpi_device_id __maybe_unused mlxbf_gige_acpi_match[] = {
> > > -- 
> > > 2.30.1
> > > 
> > > 
> >
Vladimir Oltean June 12, 2023, 12:37 p.m. UTC | #5
On Mon, Jun 12, 2023 at 02:59:25PM +0300, Leon Romanovsky wrote:
> As far as I can tell, the calls to .shutdown() and .remove() are
> mutually exclusive.

In this particular case, or in general?

In general they aren't. If the owning bus driver also implements its .shutdown()
as .remove(), then it will call the .remove() method of all devices on that bus.
That, after .shutdown() had already been called for those same children.
Leon Romanovsky June 12, 2023, 1:17 p.m. UTC | #6
On Mon, Jun 12, 2023 at 03:37:18PM +0300, Vladimir Oltean wrote:
> On Mon, Jun 12, 2023 at 02:59:25PM +0300, Leon Romanovsky wrote:
> > As far as I can tell, the calls to .shutdown() and .remove() are
> > mutually exclusive.
> 
> In this particular case, or in general?
> 
> In general they aren't. If the owning bus driver also implements its .shutdown()
> as .remove(), then it will call the .remove() method of all devices on that bus.
> That, after .shutdown() had already been called for those same children.

Can you please help me to see how? What is the call chain?

From what I see callback to ->shutdown() iterates over all devices in
that bus and relevant bus will check that driver is bound prior to call
to driver callback. In both cases, the driver is removed and bus won't
call to already removed device.

If it does, it is arguably bug in bus logic, which needs to prevent such
scenario.

Thanks
Vladimir Oltean June 12, 2023, 1:28 p.m. UTC | #7
On Mon, Jun 12, 2023 at 04:17:07PM +0300, Leon Romanovsky wrote:
> On Mon, Jun 12, 2023 at 03:37:18PM +0300, Vladimir Oltean wrote:
> > On Mon, Jun 12, 2023 at 02:59:25PM +0300, Leon Romanovsky wrote:
> > > As far as I can tell, the calls to .shutdown() and .remove() are
> > > mutually exclusive.
> > 
> > In this particular case, or in general?
> > 
> > In general they aren't. If the owning bus driver also implements its .shutdown()
> > as .remove(), then it will call the .remove() method of all devices on that bus.
> > That, after .shutdown() had already been called for those same children.
> 
> Can you please help me to see how? What is the call chain?
> 
> From what I see callback to ->shutdown() iterates over all devices in
> that bus and relevant bus will check that driver is bound prior to call
> to driver callback. In both cases, the driver is removed and bus won't
> call to already removed device.

The sequence of operations is:

* device_shutdown() walks the devices_kset backwards, thus shutting down
  children before parents
  * .shutdown() method of child gets called
  * .shutdown() method of parent gets called
    * parent implements .shutdown() as .remove()
      * the parent's .remove() logic calls device_del() on its children
        * .remove() method of child gets called

> If it does, it is arguably bug in bus logic, which needs to prevent such
> scenario.

It's just a consequence of how things work when you reuse the .remove() logic
for .shutdown() without thinking it through. It's a widespread pattern.
Leon Romanovsky June 12, 2023, 1:38 p.m. UTC | #8
On Mon, Jun 12, 2023 at 04:28:41PM +0300, Vladimir Oltean wrote:
> On Mon, Jun 12, 2023 at 04:17:07PM +0300, Leon Romanovsky wrote:
> > On Mon, Jun 12, 2023 at 03:37:18PM +0300, Vladimir Oltean wrote:
> > > On Mon, Jun 12, 2023 at 02:59:25PM +0300, Leon Romanovsky wrote:
> > > > As far as I can tell, the calls to .shutdown() and .remove() are
> > > > mutually exclusive.
> > > 
> > > In this particular case, or in general?
> > > 
> > > In general they aren't. If the owning bus driver also implements its .shutdown()
> > > as .remove(), then it will call the .remove() method of all devices on that bus.
> > > That, after .shutdown() had already been called for those same children.
> > 
> > Can you please help me to see how? What is the call chain?
> > 
> > From what I see callback to ->shutdown() iterates over all devices in
> > that bus and relevant bus will check that driver is bound prior to call
> > to driver callback. In both cases, the driver is removed and bus won't
> > call to already removed device.
> 
> The sequence of operations is:
> 
> * device_shutdown() walks the devices_kset backwards, thus shutting down
>   children before parents
>   * .shutdown() method of child gets called
>   * .shutdown() method of parent gets called
>     * parent implements .shutdown() as .remove()
>       * the parent's .remove() logic calls device_del() on its children
>         * .remove() method of child gets called

But both child and parent are locked so they parent can't call to
child's remove while child is performing shutdown.

BTW, I read the same device_shutdown() function before my first reply
and came to different conclusions from you.

> 
> > If it does, it is arguably bug in bus logic, which needs to prevent such
> > scenario.
> 
> It's just a consequence of how things work when you reuse the .remove() logic
> for .shutdown() without thinking it through. It's a widespread pattern.
Vladimir Oltean June 12, 2023, 2:05 p.m. UTC | #9
On Mon, Jun 12, 2023 at 04:38:53PM +0300, Leon Romanovsky wrote:
> On Mon, Jun 12, 2023 at 04:28:41PM +0300, Vladimir Oltean wrote:
> > The sequence of operations is:
> > 
> > * device_shutdown() walks the devices_kset backwards, thus shutting down
> >   children before parents
> >   * .shutdown() method of child gets called
> >   * .shutdown() method of parent gets called
> >     * parent implements .shutdown() as .remove()
> >       * the parent's .remove() logic calls device_del() on its children
> >         * .remove() method of child gets called
> 
> But both child and parent are locked so they parent can't call to
> child's remove while child is performing shutdown.

Please view the call chain I've posted in an email client capable of
showing the indentation correctly. The 2 lines:

   * .shutdown() method of child gets called
   * .shutdown() method of parent gets called

have the same level of indentation because they occur sequentially
within the same function.

This means 2 things:

1. when the parent runs its .shutdown(), the .shutdown() of the child
   has already finished

2. device_shutdown() only locks "dev" and "dev->parent" for the duration
   of the "dev->driver->shutdown(dev)" procedure. However, the situation
   that you deem impossible due to locking is the dev->driver->shutdown(dev)
   of the parent device. That parent wasn't found through any dev->parent
   pointer, instead it is just another device in the devices_kset list.
   The logic of locking "dev" and "dev->parent" there won't help, since
   we would be locking the parent and the parent of the parent. This
   will obviously not prevent the original parent from calling any
   method of the original child - we're already one step higher in the
   hierarchy.

So your objection above does not really apply.

> BTW, I read the same device_shutdown() function before my first reply
> and came to different conclusions from you.

Well, you could try to experiment with putting ".shutdown = xxx_remove,"
in some bus drivers and see what happens.

Admittedly it was a few years ago, but I did study this problem and I
did have to fix real issues reported by real people based on the above
observations (which here are reproduced only from memory):
https://lore.kernel.org/all/20210920214209.1733768-2-vladimir.oltean@nxp.com/
Jakub Kicinski June 12, 2023, 5:26 p.m. UTC | #10
On Thu, 8 Jun 2023 16:25:16 -0700 Samudrala, Sridhar wrote:
> >   static void mlxbf_gige_shutdown(struct platform_device *pdev)
> >   {
> > -	struct mlxbf_gige *priv = platform_get_drvdata(pdev);
> > -
> > -	writeq(0, priv->base + MLXBF_GIGE_INT_EN);
> > -	mlxbf_gige_clean_port(priv);
> > +	mlxbf_gige_remove(pdev);
> >   }  
> 
> With this change, do you really need mlxbf_gige_shutdown() as a separate 
> function as it is only calling mlxbf_gige_remove()?

Sounds like a fair ask.
Leon Romanovsky June 13, 2023, 7:19 a.m. UTC | #11
On Mon, Jun 12, 2023 at 05:05:21PM +0300, Vladimir Oltean wrote:
> On Mon, Jun 12, 2023 at 04:38:53PM +0300, Leon Romanovsky wrote:
> > On Mon, Jun 12, 2023 at 04:28:41PM +0300, Vladimir Oltean wrote:
> > > The sequence of operations is:
> > > 
> > > * device_shutdown() walks the devices_kset backwards, thus shutting down
> > >   children before parents
> > >   * .shutdown() method of child gets called
> > >   * .shutdown() method of parent gets called
> > >     * parent implements .shutdown() as .remove()
> > >       * the parent's .remove() logic calls device_del() on its children
> > >         * .remove() method of child gets called
> > 
> > But both child and parent are locked so they parent can't call to
> > child's remove while child is performing shutdown.
> 
> Please view the call chain I've posted in an email client capable of
> showing the indentation correctly. 

Thanks for the suggestion, right now I'm using mutt and lore to read
emails. Should I use another email client?

> The 2 lines:
> 
>    * .shutdown() method of child gets called
>    * .shutdown() method of parent gets called
> 
> have the same level of indentation because they occur sequentially
> within the same function.

Right

> 
> This means 2 things:
> 
> 1. when the parent runs its .shutdown(), the .shutdown() of the child
>    has already finished

Right, it is done to make sure we release childs before parents.

> 
> 2. device_shutdown() only locks "dev" and "dev->parent" for the duration
>    of the "dev->driver->shutdown(dev)" procedure. However, the situation
>    that you deem impossible due to locking is the dev->driver->shutdown(dev)
>    of the parent device. That parent wasn't found through any dev->parent
>    pointer, instead it is just another device in the devices_kset list.
>    The logic of locking "dev" and "dev->parent" there won't help, since
>    we would be locking the parent and the parent of the parent. This
>    will obviously not prevent the original parent from calling any
>    method of the original child - we're already one step higher in the
>    hierarchy.

But once child finishes device_shutdown(), it will be removed from devices_kset
list and dev->driver should be NULL at that point for the child. In driver core,
dev->driver is the marker if driver is bound. It means parent/bus won't/shouldn't
call to anything driver related to child which doesn't have valid dev->driver pointer.

> 
> So your objection above does not really apply.

We have a different opinion here.

> 
> > BTW, I read the same device_shutdown() function before my first reply
> > and came to different conclusions from you.
> 
> Well, you could try to experiment with putting ".shutdown = xxx_remove,"
> in some bus drivers and see what happens.

Like I said, this is a bug in bus logic which allows calls to device
which doesn't have driver bound to it.

> 
> Admittedly it was a few years ago, but I did study this problem and I
> did have to fix real issues reported by real people based on the above
> observations (which here are reproduced only from memory):
> https://lore.kernel.org/all/20210920214209.1733768-2-vladimir.oltean@nxp.com/

I believe you, just think that behaviour found in i2c/spi isn't how
device model works.

Thanks
Vladimir Oltean June 13, 2023, 8:30 a.m. UTC | #12
On Tue, Jun 13, 2023 at 10:19:59AM +0300, Leon Romanovsky wrote:
> But once child finishes device_shutdown(), it will be removed from devices_kset
> list and dev->driver should be NULL at that point for the child.

What piece of code would make dev->driver be NULL for devices that have
been shut down by device_shutdown()?
Leon Romanovsky June 13, 2023, 9:09 a.m. UTC | #13
On Tue, Jun 13, 2023 at 11:30:02AM +0300, Vladimir Oltean wrote:
> On Tue, Jun 13, 2023 at 10:19:59AM +0300, Leon Romanovsky wrote:
> > But once child finishes device_shutdown(), it will be removed from devices_kset
> > list and dev->driver should be NULL at that point for the child.
> 
> What piece of code would make dev->driver be NULL for devices that have
> been shut down by device_shutdown()?

You are right here and I'm wrong on that point, dev->driver is set to
NULL in all other places where the device is going to be reused and not
in device_shutdown().

Unfortunately, it doesn't change a lot in our conversation, as device_shutdown()
is very specific call which is called in two flows: kernel_halt() and kernel_restart().

In both flows, it is end game.

Thanks
Vladimir Oltean June 13, 2023, 9:35 a.m. UTC | #14
On Tue, Jun 13, 2023 at 12:09:20PM +0300, Leon Romanovsky wrote:
> On Tue, Jun 13, 2023 at 11:30:02AM +0300, Vladimir Oltean wrote:
> > On Tue, Jun 13, 2023 at 10:19:59AM +0300, Leon Romanovsky wrote:
> > > But once child finishes device_shutdown(), it will be removed from devices_kset
> > > list and dev->driver should be NULL at that point for the child.
> > 
> > What piece of code would make dev->driver be NULL for devices that have
> > been shut down by device_shutdown()?
> 
> You are right here and I'm wrong on that point, dev->driver is set to
> NULL in all other places where the device is going to be reused and not
> in device_shutdown().
> 
> Unfortunately, it doesn't change a lot in our conversation, as device_shutdown()
> is very specific call which is called in two flows: kernel_halt() and kernel_restart().
> 
> In both flows, it is end game.
> 
> Thanks

Except for the fact that, as mentioned in my first reply to this thread,
bus drivers may implement .shutdown() the same way as .remove(), so in
that case, someone *will* unbind the drivers from those child devices,
*after* .shutdown() was called on the child - and if the child device
driver isn't prepared to handle that, it can dereference NULL pointers
and bye bye reboot - the kernel hangs.

Not really sure where you're aiming with your replies at this stage.
Leon Romanovsky June 13, 2023, 10:10 a.m. UTC | #15
On Tue, Jun 13, 2023 at 12:35:01PM +0300, Vladimir Oltean wrote:
> On Tue, Jun 13, 2023 at 12:09:20PM +0300, Leon Romanovsky wrote:
> > On Tue, Jun 13, 2023 at 11:30:02AM +0300, Vladimir Oltean wrote:
> > > On Tue, Jun 13, 2023 at 10:19:59AM +0300, Leon Romanovsky wrote:
> > > > But once child finishes device_shutdown(), it will be removed from devices_kset
> > > > list and dev->driver should be NULL at that point for the child.
> > > 
> > > What piece of code would make dev->driver be NULL for devices that have
> > > been shut down by device_shutdown()?
> > 
> > You are right here and I'm wrong on that point, dev->driver is set to
> > NULL in all other places where the device is going to be reused and not
> > in device_shutdown().
> > 
> > Unfortunately, it doesn't change a lot in our conversation, as device_shutdown()
> > is very specific call which is called in two flows: kernel_halt() and kernel_restart().
> > 
> > In both flows, it is end game.
> > 
> > Thanks
> 
> Except for the fact that, as mentioned in my first reply to this thread,
> bus drivers may implement .shutdown() the same way as .remove(), so in
> that case, someone *will* unbind the drivers from those child devices,
> *after* .shutdown() was called on the child - and if the child device
> driver isn't prepared to handle that, it can dereference NULL pointers
> and bye bye reboot - the kernel hangs.
> 
> Not really sure where you're aiming with your replies at this stage.

My goal is to explain that "bus drivers may implement .shutdown() 
the same way as .remove()" is wrong implementation and expectation
that all drivers will add "if (!priv) return ..." now is not viable.

Thanks
Vladimir Oltean June 13, 2023, 10:34 a.m. UTC | #16
On Tue, Jun 13, 2023 at 01:10:38PM +0300, Leon Romanovsky wrote:
> On Tue, Jun 13, 2023 at 12:35:01PM +0300, Vladimir Oltean wrote:
> > Not really sure where you're aiming with your replies at this stage.
> 
> My goal is to explain that "bus drivers may implement .shutdown() 
> the same way as .remove()" is wrong implementation and expectation
> that all drivers will add "if (!priv) return ..." now is not viable.

I never said that all drivers should guard against that - just that it's
possible and that there is no mechanism to reject such a thing - which
is something you've incorrectly claimed.

The top-level platform bus for MMIO devices should not be do this, at
least as of now - so drivers written for just that should be fine.
However, platform devices created and probed by other drivers, such as
MFD, are worth double-checking.
Leon Romanovsky June 13, 2023, 11:28 a.m. UTC | #17
On Tue, Jun 13, 2023 at 01:34:22PM +0300, Vladimir Oltean wrote:
> On Tue, Jun 13, 2023 at 01:10:38PM +0300, Leon Romanovsky wrote:
> > On Tue, Jun 13, 2023 at 12:35:01PM +0300, Vladimir Oltean wrote:
> > > Not really sure where you're aiming with your replies at this stage.
> > 
> > My goal is to explain that "bus drivers may implement .shutdown() 
> > the same way as .remove()" is wrong implementation and expectation
> > that all drivers will add "if (!priv) return ..." now is not viable.
> 
> I never said that all drivers should guard against that - just that it's
> possible and that there is no mechanism to reject such a thing - which
> is something you've incorrectly claimed.

I was wrong in details, but in general I was correct by saying that call
to .shutdown() and .remove() callbacks are impossible to be performed
at the same time.

https://lore.kernel.org/all/20230612115925.GR12152@unreal

Thanks
Vladimir Oltean June 13, 2023, 11:40 a.m. UTC | #18
On Tue, Jun 13, 2023 at 02:28:41PM +0300, Leon Romanovsky wrote:
> On Tue, Jun 13, 2023 at 01:34:22PM +0300, Vladimir Oltean wrote:
> > On Tue, Jun 13, 2023 at 01:10:38PM +0300, Leon Romanovsky wrote:
> > > On Tue, Jun 13, 2023 at 12:35:01PM +0300, Vladimir Oltean wrote:
> > > > Not really sure where you're aiming with your replies at this stage.
> > > 
> > > My goal is to explain that "bus drivers may implement .shutdown() 
> > > the same way as .remove()" is wrong implementation and expectation
> > > that all drivers will add "if (!priv) return ..." now is not viable.
> > 
> > I never said that all drivers should guard against that - just that it's
> > possible and that there is no mechanism to reject such a thing - which
> > is something you've incorrectly claimed.
> 
> I was wrong in details, but in general I was correct by saying that call
> to .shutdown() and .remove() callbacks are impossible to be performed
> at the same time.
> 
> https://lore.kernel.org/all/20230612115925.GR12152@unreal
> 
> Thanks

Let's stop the conversation here, it is going nowhere. You've given me a
link to a message to which I've responded with exactly the same text as
I would respond now.
diff mbox series

Patch

diff --git a/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c b/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c
index 694de9513b9f..609d038b034e 100644
--- a/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c
+++ b/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c
@@ -475,6 +475,9 @@  static int mlxbf_gige_remove(struct platform_device *pdev)
 {
 	struct mlxbf_gige *priv = platform_get_drvdata(pdev);
 
+	if (!priv)
+		return 0;
+
 	unregister_netdev(priv->netdev);
 	phy_disconnect(priv->netdev->phydev);
 	mlxbf_gige_mdio_remove(priv);
@@ -485,10 +488,7 @@  static int mlxbf_gige_remove(struct platform_device *pdev)
 
 static void mlxbf_gige_shutdown(struct platform_device *pdev)
 {
-	struct mlxbf_gige *priv = platform_get_drvdata(pdev);
-
-	writeq(0, priv->base + MLXBF_GIGE_INT_EN);
-	mlxbf_gige_clean_port(priv);
+	mlxbf_gige_remove(pdev);
 }
 
 static const struct acpi_device_id __maybe_unused mlxbf_gige_acpi_match[] = {