diff mbox series

[v2] net: fec: Use a spinlock to guard `fep->ptp_clk_on`

Message ID 20220901140402.64804-1-csokas.bence@prolan.hu (mailing list archive)
State Accepted
Commit b353b241f1eb9b6265358ffbe2632fdcb563354f
Delegated to: Netdev Maintainers
Headers show
Series [v2] net: fec: Use a spinlock to guard `fep->ptp_clk_on` | expand

Checks

Context Check Description
netdev/tree_selection success Guessed tree name to be net-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix warning Target tree name not specified in the subject
netdev/cover_letter success Single patches do not need cover letters
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers fail 1 blamed authors not CCed: heiko.thiery@gmail.com; 4 maintainers not CCed: edumazet@google.com qiangqing.zhang@nxp.com heiko.thiery@gmail.com pabeni@redhat.com
netdev/build_clang success Errors and warnings before: 0 this patch: 0
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 137 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Csókás Bence Sept. 1, 2022, 2:04 p.m. UTC
Mutexes cannot be taken in a non-preemptible context,
causing a panic in `fec_ptp_save_state()`. Replacing
`ptp_clk_mutex` by `tmreg_lock` fixes this.

Fixes: 6a4d7234ae9a ("net: fec: ptp: avoid register access when ipg clock is disabled")
Fixes: f79959220fa5 ("fec: Restart PPS after link state change")
Reported-by: Marc Kleine-Budde <mkl@pengutronix.de>
Link: https://lore.kernel.org/all/20220827160922.642zlcd5foopozru@pengutronix.de/
Signed-off-by: Csókás Bence <csokas.bence@prolan.hu>
---
 drivers/net/ethernet/freescale/fec.h      |  1 -
 drivers/net/ethernet/freescale/fec_main.c | 17 +++++++-------
 drivers/net/ethernet/freescale/fec_ptp.c  | 28 ++++++++---------------
 3 files changed, 19 insertions(+), 27 deletions(-)

Comments

Francesco Dolcini Sept. 1, 2022, 3:04 p.m. UTC | #1
On Thu, Sep 01, 2022 at 04:04:03PM +0200, Csókás Bence wrote:
> Mutexes cannot be taken in a non-preemptible context,
> causing a panic in `fec_ptp_save_state()`. Replacing
> `ptp_clk_mutex` by `tmreg_lock` fixes this.
> 
> Fixes: 6a4d7234ae9a ("net: fec: ptp: avoid register access when ipg clock is disabled")
This should be removed, there was no issue with just this commit. Am I
wrong?

One of the reasons of the Fixes tag is to backport to stable releases,
and you do not want this commit in 5.15.x.

Francesco
Florian Fainelli Sept. 1, 2022, 4:26 p.m. UTC | #2
On 9/1/2022 7:04 AM, Csókás Bence wrote:
> Mutexes cannot be taken in a non-preemptible context,
> causing a panic in `fec_ptp_save_state()`. Replacing
> `ptp_clk_mutex` by `tmreg_lock` fixes this.
> 
> Fixes: 6a4d7234ae9a ("net: fec: ptp: avoid register access when ipg clock is disabled")
> Fixes: f79959220fa5 ("fec: Restart PPS after link state change")
> Reported-by: Marc Kleine-Budde <mkl@pengutronix.de>
> Link: https://lore.kernel.org/all/20220827160922.642zlcd5foopozru@pengutronix.de/
> Signed-off-by: Csókás Bence <csokas.bence@prolan.hu>
> ---

[snip]

>   	schedule_delayed_work(&fep->time_keep, HZ);
>   }
> @@ -599,8 +593,6 @@ void fec_ptp_init(struct platform_device *pdev, int irq_idx)
>   	}
>   	fep->ptp_inc = NSEC_PER_SEC / fep->cycle_speed;
>   
> -	spin_lock_init(&fep->tmreg_lock);

This change needs to be kept as there is no other code in the driver 
that would initialize the tmreg_lock otherwise. Try building a kernel 
with spinlock debugging enabled and you should see it barf with an 
incorrect spinlock bad magic.
Csókás Bence Sept. 2, 2022, 7:18 a.m. UTC | #3
On 2022. 09. 01. 17:04, Francesco Dolcini wrote:
> On Thu, Sep 01, 2022 at 04:04:03PM +0200, Csókás Bence wrote:
>> Mutexes cannot be taken in a non-preemptible context,
>> causing a panic in `fec_ptp_save_state()`. Replacing
>> `ptp_clk_mutex` by `tmreg_lock` fixes this.
>>
>> Fixes: 6a4d7234ae9a ("net: fec: ptp: avoid register access when ipg clock is disabled")
> This should be removed, there was no issue with just this commit. Am I
> wrong?
> 
> One of the reasons of the Fixes tag is to backport to stable releases,
> and you do not want this commit in 5.15.x.

I do want it, actually, because I got a PATCH AUTOSEL already for 
f79959220fa5fbda939592bf91c7a9ea90419040 for 5.15 & 5.19, but I told 
them not to backport until this fix is merged too.

Bence
Csókás Bence Sept. 2, 2022, 7:35 a.m. UTC | #4
On 2022. 09. 01. 18:26, Florian Fainelli wrote:
> 
>>       schedule_delayed_work(&fep->time_keep, HZ);
>>   }
>> @@ -599,8 +593,6 @@ void fec_ptp_init(struct platform_device *pdev, 
>> int irq_idx)
>>       }
>>       fep->ptp_inc = NSEC_PER_SEC / fep->cycle_speed;
>> -    spin_lock_init(&fep->tmreg_lock);
> 
> This change needs to be kept as there is no other code in the driver 
> that would initialize the tmreg_lock otherwise. Try building a kernel 
> with spinlock debugging enabled and you should see it barf with an 
> incorrect spinlock bad magic.

`fec_ptp_init()` is called from `fec_probe()`, which init's the spinlock:

 > @@ -3907,7 +3908,7 @@ fec_probe(struct platform_device *pdev)
 >         }
 >
 >         fep->ptp_clk_on = false;
 > -       mutex_init(&fep->ptp_clk_mutex);
 > +       spin_lock_init(&fep->tmreg_lock);

Bence
Florian Fainelli Sept. 2, 2022, 3:47 p.m. UTC | #5
On 9/2/2022 12:35 AM, Csókás Bence wrote:
> 
> On 2022. 09. 01. 18:26, Florian Fainelli wrote:
>>
>>>       schedule_delayed_work(&fep->time_keep, HZ);
>>>   }
>>> @@ -599,8 +593,6 @@ void fec_ptp_init(struct platform_device *pdev, 
>>> int irq_idx)
>>>       }
>>>       fep->ptp_inc = NSEC_PER_SEC / fep->cycle_speed;
>>> -    spin_lock_init(&fep->tmreg_lock);
>>
>> This change needs to be kept as there is no other code in the driver 
>> that would initialize the tmreg_lock otherwise. Try building a kernel 
>> with spinlock debugging enabled and you should see it barf with an 
>> incorrect spinlock bad magic.
> 
> `fec_ptp_init()` is called from `fec_probe()`, which init's the spinlock:
> 
>  > @@ -3907,7 +3908,7 @@ fec_probe(struct platform_device *pdev)
>  >         }
>  >
>  >         fep->ptp_clk_on = false;
>  > -       mutex_init(&fep->ptp_clk_mutex);
>  > +       spin_lock_init(&fep->tmreg_lock);

Ah indeed, I missed that, thanks!
Francesco Dolcini Sept. 2, 2022, 5:04 p.m. UTC | #6
On Thu, Sep 01, 2022 at 04:04:03PM +0200, Csókás Bence wrote:
> Mutexes cannot be taken in a non-preemptible context,
> causing a panic in `fec_ptp_save_state()`. Replacing
> `ptp_clk_mutex` by `tmreg_lock` fixes this.
> 
> Fixes: 6a4d7234ae9a ("net: fec: ptp: avoid register access when ipg clock is disabled")
> Fixes: f79959220fa5 ("fec: Restart PPS after link state change")
> Reported-by: Marc Kleine-Budde <mkl@pengutronix.de>
> Link: https://lore.kernel.org/all/20220827160922.642zlcd5foopozru@pengutronix.de/
> Signed-off-by: Csókás Bence <csokas.bence@prolan.hu>

Tested-by: Francesco Dolcini <francesco.dolcini@toradex.com> # Toradex Apalis iMX6
patchwork-bot+netdevbpf@kernel.org Sept. 3, 2022, 4:30 a.m. UTC | #7
Hello:

This patch was applied to netdev/net.git (master)
by Jakub Kicinski <kuba@kernel.org>:

On Thu, 1 Sep 2022 16:04:03 +0200 you wrote:
> Mutexes cannot be taken in a non-preemptible context,
> causing a panic in `fec_ptp_save_state()`. Replacing
> `ptp_clk_mutex` by `tmreg_lock` fixes this.
> 
> Fixes: 6a4d7234ae9a ("net: fec: ptp: avoid register access when ipg clock is disabled")
> Fixes: f79959220fa5 ("fec: Restart PPS after link state change")
> Reported-by: Marc Kleine-Budde <mkl@pengutronix.de>
> Link: https://lore.kernel.org/all/20220827160922.642zlcd5foopozru@pengutronix.de/
> Signed-off-by: Csókás Bence <csokas.bence@prolan.hu>
> 
> [...]

Here is the summary with links:
  - [v2] net: fec: Use a spinlock to guard `fep->ptp_clk_on`
    https://git.kernel.org/netdev/net/c/b353b241f1eb

You are awesome, thank you!
Marc Kleine-Budde Sept. 5, 2022, 7:38 a.m. UTC | #8
On 9/1/22 4:04 PM, Csókás Bence wrote:
> Mutexes cannot be taken in a non-preemptible context,
> causing a panic in `fec_ptp_save_state()`. Replacing
> `ptp_clk_mutex` by `tmreg_lock` fixes this.

I was on holidays, but this doesn't look good.

> Fixes: 6a4d7234ae9a ("net: fec: ptp: avoid register access when ipg clock is disabled")
> Fixes: f79959220fa5 ("fec: Restart PPS after link state change")
> Reported-by: Marc Kleine-Budde <mkl@pengutronix.de>
> Link: https://lore.kernel.org/all/20220827160922.642zlcd5foopozru@pengutronix.de/
> Signed-off-by: Csókás Bence <csokas.bence@prolan.hu>
> ---
>  drivers/net/ethernet/freescale/fec.h      |  1 -
>  drivers/net/ethernet/freescale/fec_main.c | 17 +++++++-------
>  drivers/net/ethernet/freescale/fec_ptp.c  | 28 ++++++++---------------
>  3 files changed, 19 insertions(+), 27 deletions(-)
> 
> diff --git a/drivers/net/ethernet/freescale/fec.h b/drivers/net/ethernet/freescale/fec.h
> index 0cebe4b63adb..38f095260e1f 100644
> --- a/drivers/net/ethernet/freescale/fec.h
> +++ b/drivers/net/ethernet/freescale/fec.h
> @@ -557,7 +557,6 @@ struct fec_enet_private {
>  	struct clk *clk_2x_txclk;
>  
>  	bool ptp_clk_on;
> -	struct mutex ptp_clk_mutex;
>  	unsigned int num_tx_queues;
>  	unsigned int num_rx_queues;
>  
> diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
> index b0d60f898249..ab1ee9508f76 100644
> --- a/drivers/net/ethernet/freescale/fec_main.c
> +++ b/drivers/net/ethernet/freescale/fec_main.c
> @@ -2028,6 +2028,7 @@ static void fec_enet_phy_reset_after_clk_enable(struct net_device *ndev)
>  static int fec_enet_clk_enable(struct net_device *ndev, bool enable)
>  {
>  	struct fec_enet_private *fep = netdev_priv(ndev);
> +	unsigned long flags;
>  	int ret;
>  
>  	if (enable) {
> @@ -2036,15 +2037,15 @@ static int fec_enet_clk_enable(struct net_device *ndev, bool enable)
>  			return ret;
>  
>  		if (fep->clk_ptp) {
> -			mutex_lock(&fep->ptp_clk_mutex);
> +			spin_lock_irqsave(&fep->tmreg_lock, flags);
>  			ret = clk_prepare_enable(fep->clk_ptp);

clock_prepare() (and thus clk_prepare_enable()) must not be called from atomic
context.

Marc
Marc Kleine-Budde Sept. 7, 2022, 2:39 p.m. UTC | #9
On 05.09.2022 09:38:04, Marc Kleine-Budde wrote:
> On 9/1/22 4:04 PM, Csókás Bence wrote:
> > Mutexes cannot be taken in a non-preemptible context,
> > causing a panic in `fec_ptp_save_state()`. Replacing
> > `ptp_clk_mutex` by `tmreg_lock` fixes this.
> 
> I was on holidays, but this doesn't look good.

Does anyone care to fix this? Csókás?

> > @@ -2036,15 +2037,15 @@ static int fec_enet_clk_enable(struct net_device *ndev, bool enable)
> >  			return ret;
> >  
> >  		if (fep->clk_ptp) {
> > -			mutex_lock(&fep->ptp_clk_mutex);
> > +			spin_lock_irqsave(&fep->tmreg_lock, flags);
> >  			ret = clk_prepare_enable(fep->clk_ptp);
> 
> clock_prepare() (and thus clk_prepare_enable()) must not be called from atomic
> context.

You cannot call clk_prepare_enable() from atomic context. If you compile
your kernel with lockdep, you'll get this splat:

| [    5.907789] BUG: sleeping function called from invalid context at drivers/net/ethernet/freescale/fec_main.c:2071
| [    5.918140] in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 1, name: swapper/0
| [    5.926181] preempt_count: 1, expected: 0                                                    
| [    5.930223] 2 locks held by swapper/0/1:                                                     
| [    5.934180]  #0: c41a4094 (&dev->mutex){....}-{4:4}, at: __driver_attach+0x8c/0x150
| [    5.941968]  #1: c42439b0 (&fep->tmreg_lock){....}-{3:3}, at: fec_enet_clk_enable+0x5c/0x25c
| [    5.950533] irq event stamp: 124698                                                          
| [    5.954052] hardirqs last  enabled at (124697): [<c0f6c6dc>] _raw_spin_unlock_irqrestore+0x58/0x6c
| [    5.963058] hardirqs last disabled at (124698): [<c0f6c490>] _raw_spin_lock_irqsave+0x88/0xa4
| [    5.971622] softirqs last  enabled at (122454): [<c02b47bc>] bdi_register_va+0x1ac/0x1d8                                                                                                     
| [    5.979753] softirqs last disabled at (122452): [<c02b4738>] bdi_register_va+0x128/0x1d8

| $ head -2071 drivers/net/ethernet/freescale/fec_main.c | tail -15
| 
| static int fec_enet_clk_enable(struct net_device *ndev, bool enable)
| {
|         struct fec_enet_private *fep = netdev_priv(ndev);
|         unsigned long flags;
|         int ret;
| 
|         if (enable) {
|                 ret = clk_prepare_enable(fep->clk_enet_out);
|                 if (ret)
|                         return ret;
| 
|                 if (fep->clk_ptp) {
|                         spin_lock_irqsave(&fep->tmreg_lock, flags);
|                         ret = clk_prepare_enable(fep->clk_ptp);

regards,
Marc
Francesco Dolcini Sept. 9, 2022, 1:56 p.m. UTC | #10
On Wed, Sep 07, 2022 at 04:39:15PM +0200, Marc Kleine-Budde wrote:
> On 05.09.2022 09:38:04, Marc Kleine-Budde wrote:
> > On 9/1/22 4:04 PM, Csókás Bence wrote:
> > > Mutexes cannot be taken in a non-preemptible context,
> > > causing a panic in `fec_ptp_save_state()`. Replacing
> > > `ptp_clk_mutex` by `tmreg_lock` fixes this.
> > 
> > I was on holidays, but this doesn't look good.
> 
> Does anyone care to fix this? Csókás?
I do care, however I do not really have resources for anything better than
reverting both of the problematic commits.

If we do not have a fix, I guess this is the only way. Csókás?

For some reason when I tested on imx6 :-( I had no warning, now the
change was merged and while doing a quick test on a imx7
I got new issues.

[   37.061582] ========================================================
[   37.070203] WARNING: possible irq lock inversion dependency detected
[   37.078847] 6.0.0-rc4-00137-g506357871c18 #3 Tainted: G        W
[   37.087944] --------------------------------------------------------
[   37.096615] systemd/580 just changed the state of lock:
[   37.104160] c218f2a4 (&dev->tx_global_lock){+.-.}-{2:2}, at: dev_watchdog+0x18/0x2c4
[   37.116691] but this lock took another, SOFTIRQ-unsafe lock in the past:
[   37.125896]  (&fep->tmreg_lock){+.+.}-{2:2}
[   37.125921]
[   37.125921]
[   37.125921] and interrupts could create inverse lock ordering between them.
[   37.125921]
[   37.153018]
[   37.153018] other info that might help us debug this:
[   37.163824]  Possible interrupt unsafe locking scenario:
[   37.163824]
[   37.174714]        CPU0                    CPU1
[   37.181215]        ----                    ----
[   37.187631]   lock(&fep->tmreg_lock);
[   37.193110]                                local_irq_disable();
[   37.200841]                                lock(&dev->tx_global_lock);
[   37.209153]                                lock(&fep->tmreg_lock);
[   37.217061]   <Interrupt>
[   37.221328]     lock(&dev->tx_global_lock);
[   37.227167]
[   37.227167]  *** DEADLOCK ***

that just goes away reverting this patch.

Francesco
Csókás Bence Sept. 10, 2022, 5:08 p.m. UTC | #11
> On 05.09.2022 09:38:04, Marc Kleine-Budde wrote:
> > On 9/1/22 4:04 PM, Csókás Bence wrote:
> > > Mutexes cannot be taken in a non-preemptible context,
> > > causing a panic in `fec_ptp_save_state()`. Replacing
> > > `ptp_clk_mutex` by `tmreg_lock` fixes this.
> >
> >I was on holidays, but this doesn't look good.
> 
> Does anyone care to fix this? Csókás?

Yes, I will attempt to fix it. But, like you the week before, I am also out of office right now, so please be patient.

Bence
Guenter Roeck Sept. 14, 2022, 2:53 p.m. UTC | #12
On Thu, Sep 01, 2022 at 04:04:03PM +0200, Csókás Bence wrote:
> Mutexes cannot be taken in a non-preemptible context,
> causing a panic in `fec_ptp_save_state()`. Replacing
> `ptp_clk_mutex` by `tmreg_lock` fixes this.
> 
> Fixes: 6a4d7234ae9a ("net: fec: ptp: avoid register access when ipg clock is disabled")
> Fixes: f79959220fa5 ("fec: Restart PPS after link state change")
> Reported-by: Marc Kleine-Budde <mkl@pengutronix.de>
> Link: https://lore.kernel.org/all/20220827160922.642zlcd5foopozru@pengutronix.de/
> Signed-off-by: Csókás Bence <csokas.bence@prolan.hu>

For regzbot:

This patch results in the following backtrace.

[   18.401688] BUG: sleeping function called from invalid context at drivers/clk/imx/clk-pllv3.c:68
[   18.402277] in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 1, name: swapper/0
[   18.402531] preempt_count: 1, expected: 0
[   18.402781] 3 locks held by swapper/0/1:
[   18.402967]  #0: c423ac8c (&dev->mutex){....}-{3:3}, at: __driver_attach+0x80/0x158
[   18.404364]  #1: c40dc8e8 (&fep->tmreg_lock){....}-{2:2}, at: fec_enet_clk_enable+0x58/0x250
[   18.404752]  #2: c1a71af8 (prepare_lock){+.+.}-{3:3}, at: clk_prepare_lock+0xc/0xd4
[   18.405246] irq event stamp: 129384
[   18.405403] hardirqs last  enabled at (129383): [<c10850b0>] _raw_spin_unlock_irqrestore+0x50/0x64
[   18.405667] hardirqs last disabled at (129384): [<c1084e70>] _raw_spin_lock_irqsave+0x64/0x68
[   18.405915] softirqs last  enabled at (129218): [<c01017bc>] __do_softirq+0x2ac/0x604
[   18.406255] softirqs last disabled at (129209): [<c012eee4>] __irq_exit_rcu+0x138/0x17c
[   18.406792] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G                 N 6.0.0-rc5 #1
[   18.407131] Hardware name: Freescale i.MX7 Dual (Device Tree)
[   18.407590]  unwind_backtrace from show_stack+0x10/0x14
[   18.407890]  show_stack from dump_stack_lvl+0x68/0x90
[   18.408097]  dump_stack_lvl from __might_resched+0x17c/0x284
[   18.408328]  __might_resched from clk_pllv3_wait_lock+0x4c/0xcc
[   18.408557]  clk_pllv3_wait_lock from clk_core_prepare+0xc4/0x328
[   18.408783]  clk_core_prepare from clk_core_prepare+0x50/0x328
[   18.408986]  clk_core_prepare from clk_core_prepare+0x50/0x328
[   18.409205]  clk_core_prepare from clk_core_prepare+0x50/0x328
[   18.409416]  clk_core_prepare from clk_core_prepare+0x50/0x328
[   18.409631]  clk_core_prepare from clk_core_prepare+0x50/0x328
[   18.409847]  clk_core_prepare from clk_core_prepare+0x50/0x328
[   18.410065]  clk_core_prepare from clk_core_prepare+0x50/0x328
[   18.410284]  clk_core_prepare from clk_core_prepare+0x50/0x328
[   18.410513]  clk_core_prepare from clk_core_prepare+0x50/0x328
[   18.410729]  clk_core_prepare from clk_prepare+0x20/0x30
[   18.410936]  clk_prepare from fec_enet_clk_enable+0x68/0x250
[   18.411143]  fec_enet_clk_enable from fec_probe+0x32c/0x1430
[   18.411352]  fec_probe from platform_probe+0x58/0xbc
[   18.411558]  platform_probe from really_probe+0xc4/0x2f4
[   18.411772]  really_probe from __driver_probe_device+0x80/0xe4
[   18.411983]  __driver_probe_device from driver_probe_device+0x2c/0xc4
[   18.412203]  driver_probe_device from __driver_attach+0x8c/0x158
[   18.412418]  __driver_attach from bus_for_each_dev+0x74/0xc0
[   18.412631]  bus_for_each_dev from bus_add_driver+0x154/0x1e8
[   18.412844]  bus_add_driver from driver_register+0x88/0x11c
[   18.413055]  driver_register from do_one_initcall+0x68/0x428
[   18.413271]  do_one_initcall from kernel_init_freeable+0x18c/0x240
[   18.413502]  kernel_init_freeable from kernel_init+0x14/0x144
[   18.413721]  kernel_init from ret_from_fork+0x14/0x28
[   18.414036] Exception stack(0xd0825fb0 to 0xd0825ff8)
[   18.414523] 5fa0:                                     00000000 00000000 00000000 00000000
[   18.414792] 5fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[   18.415032] 5fe0: 00000000 00000000 00000000 00000000 00000013 00000000
Thorsten Leemhuis Sept. 18, 2022, 6:59 p.m. UTC | #13
TWIMC: this mail is primarily send for documentation purposes and for
regzbot, my Linux kernel regression tracking bot. These mails usually
contain '#forregzbot' in the subject, to make them easy to spot and filter.

On 14.09.22 15:53, Guenter Roeck wrote:
> On Thu, Sep 01, 2022 at 04:04:03PM +0200, Csókás Bence wrote:
>> Mutexes cannot be taken in a non-preemptible context,
>> causing a panic in `fec_ptp_save_state()`. Replacing
>> `ptp_clk_mutex` by `tmreg_lock` fixes this.
>>
>> Fixes: 6a4d7234ae9a ("net: fec: ptp: avoid register access when ipg clock is disabled")
>> Fixes: f79959220fa5 ("fec: Restart PPS after link state change")
>> Reported-by: Marc Kleine-Budde <mkl@pengutronix.de>
>> Link: https://lore.kernel.org/all/20220827160922.642zlcd5foopozru@pengutronix.de/
>> Signed-off-by: Csókás Bence <csokas.bence@prolan.hu>
> 
> For regzbot:

Thanks for the report. To be sure below issue doesn't fall through the
cracks unnoticed, I'm adding it to regzbot, my Linux kernel regression
tracking bot:

#regzbot ^introduced b353b241f1eb9b626
#regzbot title net: fec: backtrace: BUG: sleeping function called from
invalid context at drivers/clk/imx/clk-pllv3.c
#regzbot ignore-activity
#regzbot monitor:
https://lore.kernel.org/all/20220912073106.2544207-1-bence98@sch.bme.hu/
#regzbot monitor:
https://lore.kernel.org/all/20220912070143.98153-1-francesco.dolcini@toradex.com/

> This patch results in the following backtrace.
> 
> [   18.401688] BUG: sleeping function called from invalid context at drivers/clk/imx/clk-pllv3.c:68
> [   18.402277] in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 1, name: swapper/0
> [   18.402531] preempt_count: 1, expected: 0
> [   18.402781] 3 locks held by swapper/0/1:
> [   18.402967]  #0: c423ac8c (&dev->mutex){....}-{3:3}, at: __driver_attach+0x80/0x158
> [   18.404364]  #1: c40dc8e8 (&fep->tmreg_lock){....}-{2:2}, at: fec_enet_clk_enable+0x58/0x250
> [   18.404752]  #2: c1a71af8 (prepare_lock){+.+.}-{3:3}, at: clk_prepare_lock+0xc/0xd4
> [   18.405246] irq event stamp: 129384
> [   18.405403] hardirqs last  enabled at (129383): [<c10850b0>] _raw_spin_unlock_irqrestore+0x50/0x64
> [   18.405667] hardirqs last disabled at (129384): [<c1084e70>] _raw_spin_lock_irqsave+0x64/0x68
> [   18.405915] softirqs last  enabled at (129218): [<c01017bc>] __do_softirq+0x2ac/0x604
> [   18.406255] softirqs last disabled at (129209): [<c012eee4>] __irq_exit_rcu+0x138/0x17c
> [   18.406792] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G                 N 6.0.0-rc5 #1
> [   18.407131] Hardware name: Freescale i.MX7 Dual (Device Tree)
> [   18.407590]  unwind_backtrace from show_stack+0x10/0x14
> [   18.407890]  show_stack from dump_stack_lvl+0x68/0x90
> [   18.408097]  dump_stack_lvl from __might_resched+0x17c/0x284
> [   18.408328]  __might_resched from clk_pllv3_wait_lock+0x4c/0xcc
> [   18.408557]  clk_pllv3_wait_lock from clk_core_prepare+0xc4/0x328
> [   18.408783]  clk_core_prepare from clk_core_prepare+0x50/0x328
> [   18.408986]  clk_core_prepare from clk_core_prepare+0x50/0x328
> [   18.409205]  clk_core_prepare from clk_core_prepare+0x50/0x328
> [   18.409416]  clk_core_prepare from clk_core_prepare+0x50/0x328
> [   18.409631]  clk_core_prepare from clk_core_prepare+0x50/0x328
> [   18.409847]  clk_core_prepare from clk_core_prepare+0x50/0x328
> [   18.410065]  clk_core_prepare from clk_core_prepare+0x50/0x328
> [   18.410284]  clk_core_prepare from clk_core_prepare+0x50/0x328
> [   18.410513]  clk_core_prepare from clk_core_prepare+0x50/0x328
> [   18.410729]  clk_core_prepare from clk_prepare+0x20/0x30
> [   18.410936]  clk_prepare from fec_enet_clk_enable+0x68/0x250
> [   18.411143]  fec_enet_clk_enable from fec_probe+0x32c/0x1430
> [   18.411352]  fec_probe from platform_probe+0x58/0xbc
> [   18.411558]  platform_probe from really_probe+0xc4/0x2f4
> [   18.411772]  really_probe from __driver_probe_device+0x80/0xe4
> [   18.411983]  __driver_probe_device from driver_probe_device+0x2c/0xc4
> [   18.412203]  driver_probe_device from __driver_attach+0x8c/0x158
> [   18.412418]  __driver_attach from bus_for_each_dev+0x74/0xc0
> [   18.412631]  bus_for_each_dev from bus_add_driver+0x154/0x1e8
> [   18.412844]  bus_add_driver from driver_register+0x88/0x11c
> [   18.413055]  driver_register from do_one_initcall+0x68/0x428
> [   18.413271]  do_one_initcall from kernel_init_freeable+0x18c/0x240
> [   18.413502]  kernel_init_freeable from kernel_init+0x14/0x144
> [   18.413721]  kernel_init from ret_from_fork+0x14/0x28
> [   18.414036] Exception stack(0xd0825fb0 to 0xd0825ff8)
> [   18.414523] 5fa0:                                     00000000 00000000 00000000 00000000
> [   18.414792] 5fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> [   18.415032] 5fe0: 00000000 00000000 00000000 00000000 00000013 00000000
> 
> 
> 
>
Thorsten Leemhuis Sept. 22, 2022, 10:58 a.m. UTC | #14
On 18.09.22 20:59, Thorsten Leemhuis wrote:
> TWIMC: this mail is primarily send for documentation purposes and for
> regzbot, my Linux kernel regression tracking bot. These mails usually
> contain '#forregzbot' in the subject, to make them easy to spot and filter.
> 
> On 14.09.22 15:53, Guenter Roeck wrote:
>> On Thu, Sep 01, 2022 at 04:04:03PM +0200, Csókás Bence wrote:
>>> Mutexes cannot be taken in a non-preemptible context,
>>> causing a panic in `fec_ptp_save_state()`. Replacing
>>> `ptp_clk_mutex` by `tmreg_lock` fixes this.
>>>
>>> Fixes: 6a4d7234ae9a ("net: fec: ptp: avoid register access when ipg clock is disabled")
>>> Fixes: f79959220fa5 ("fec: Restart PPS after link state change")
>>> Reported-by: Marc Kleine-Budde <mkl@pengutronix.de>
>>> Link: https://lore.kernel.org/all/20220827160922.642zlcd5foopozru@pengutronix.de/
>>> Signed-off-by: Csókás Bence <csokas.bence@prolan.hu>
>>
>> For regzbot:
> 
> Thanks for the report. To be sure below issue doesn't fall through the
> cracks unnoticed, I'm adding it to regzbot, my Linux kernel regression
> tracking bot:
> 
> #regzbot ^introduced b353b241f1eb9b626
> #regzbot title net: fec: backtrace: BUG: sleeping function called from
> invalid context at drivers/clk/imx/clk-pllv3.c
> #regzbot ignore-activity
> #regzbot monitor:
> https://lore.kernel.org/all/20220912073106.2544207-1-bence98@sch.bme.hu/
> #regzbot monitor:
> https://lore.kernel.org/all/20220912070143.98153-1-francesco.dolcini@toradex.com/

#regzbot fixed-by: 01b825f997ac28f
diff mbox series

Patch

diff --git a/drivers/net/ethernet/freescale/fec.h b/drivers/net/ethernet/freescale/fec.h
index 0cebe4b63adb..38f095260e1f 100644
--- a/drivers/net/ethernet/freescale/fec.h
+++ b/drivers/net/ethernet/freescale/fec.h
@@ -557,7 +557,6 @@  struct fec_enet_private {
 	struct clk *clk_2x_txclk;
 
 	bool ptp_clk_on;
-	struct mutex ptp_clk_mutex;
 	unsigned int num_tx_queues;
 	unsigned int num_rx_queues;
 
diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index b0d60f898249..ab1ee9508f76 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -2028,6 +2028,7 @@  static void fec_enet_phy_reset_after_clk_enable(struct net_device *ndev)
 static int fec_enet_clk_enable(struct net_device *ndev, bool enable)
 {
 	struct fec_enet_private *fep = netdev_priv(ndev);
+	unsigned long flags;
 	int ret;
 
 	if (enable) {
@@ -2036,15 +2037,15 @@  static int fec_enet_clk_enable(struct net_device *ndev, bool enable)
 			return ret;
 
 		if (fep->clk_ptp) {
-			mutex_lock(&fep->ptp_clk_mutex);
+			spin_lock_irqsave(&fep->tmreg_lock, flags);
 			ret = clk_prepare_enable(fep->clk_ptp);
 			if (ret) {
-				mutex_unlock(&fep->ptp_clk_mutex);
+				spin_unlock_irqrestore(&fep->tmreg_lock, flags);
 				goto failed_clk_ptp;
 			} else {
 				fep->ptp_clk_on = true;
 			}
-			mutex_unlock(&fep->ptp_clk_mutex);
+			spin_unlock_irqrestore(&fep->tmreg_lock, flags);
 		}
 
 		ret = clk_prepare_enable(fep->clk_ref);
@@ -2059,10 +2060,10 @@  static int fec_enet_clk_enable(struct net_device *ndev, bool enable)
 	} else {
 		clk_disable_unprepare(fep->clk_enet_out);
 		if (fep->clk_ptp) {
-			mutex_lock(&fep->ptp_clk_mutex);
+			spin_lock_irqsave(&fep->tmreg_lock, flags);
 			clk_disable_unprepare(fep->clk_ptp);
 			fep->ptp_clk_on = false;
-			mutex_unlock(&fep->ptp_clk_mutex);
+			spin_unlock_irqrestore(&fep->tmreg_lock, flags);
 		}
 		clk_disable_unprepare(fep->clk_ref);
 		clk_disable_unprepare(fep->clk_2x_txclk);
@@ -2075,10 +2076,10 @@  static int fec_enet_clk_enable(struct net_device *ndev, bool enable)
 		clk_disable_unprepare(fep->clk_ref);
 failed_clk_ref:
 	if (fep->clk_ptp) {
-		mutex_lock(&fep->ptp_clk_mutex);
+		spin_lock_irqsave(&fep->tmreg_lock, flags);
 		clk_disable_unprepare(fep->clk_ptp);
 		fep->ptp_clk_on = false;
-		mutex_unlock(&fep->ptp_clk_mutex);
+		spin_unlock_irqrestore(&fep->tmreg_lock, flags);
 	}
 failed_clk_ptp:
 	clk_disable_unprepare(fep->clk_enet_out);
@@ -3907,7 +3908,7 @@  fec_probe(struct platform_device *pdev)
 	}
 
 	fep->ptp_clk_on = false;
-	mutex_init(&fep->ptp_clk_mutex);
+	spin_lock_init(&fep->tmreg_lock);
 
 	/* clk_ref is optional, depends on board */
 	fep->clk_ref = devm_clk_get_optional(&pdev->dev, "enet_clk_ref");
diff --git a/drivers/net/ethernet/freescale/fec_ptp.c b/drivers/net/ethernet/freescale/fec_ptp.c
index c74d04f4b2fd..8dd5a2615a89 100644
--- a/drivers/net/ethernet/freescale/fec_ptp.c
+++ b/drivers/net/ethernet/freescale/fec_ptp.c
@@ -365,21 +365,19 @@  static int fec_ptp_adjtime(struct ptp_clock_info *ptp, s64 delta)
  */
 static int fec_ptp_gettime(struct ptp_clock_info *ptp, struct timespec64 *ts)
 {
-	struct fec_enet_private *adapter =
+	struct fec_enet_private *fep =
 	    container_of(ptp, struct fec_enet_private, ptp_caps);
 	u64 ns;
 	unsigned long flags;
 
-	mutex_lock(&adapter->ptp_clk_mutex);
+	spin_lock_irqsave(&fep->tmreg_lock, flags);
 	/* Check the ptp clock */
-	if (!adapter->ptp_clk_on) {
-		mutex_unlock(&adapter->ptp_clk_mutex);
+	if (!fep->ptp_clk_on) {
+		spin_unlock_irqrestore(&fep->tmreg_lock, flags);
 		return -EINVAL;
 	}
-	spin_lock_irqsave(&adapter->tmreg_lock, flags);
-	ns = timecounter_read(&adapter->tc);
-	spin_unlock_irqrestore(&adapter->tmreg_lock, flags);
-	mutex_unlock(&adapter->ptp_clk_mutex);
+	ns = timecounter_read(&fep->tc);
+	spin_unlock_irqrestore(&fep->tmreg_lock, flags);
 
 	*ts = ns_to_timespec64(ns);
 
@@ -404,10 +402,10 @@  static int fec_ptp_settime(struct ptp_clock_info *ptp,
 	unsigned long flags;
 	u32 counter;
 
-	mutex_lock(&fep->ptp_clk_mutex);
+	spin_lock_irqsave(&fep->tmreg_lock, flags);
 	/* Check the ptp clock */
 	if (!fep->ptp_clk_on) {
-		mutex_unlock(&fep->ptp_clk_mutex);
+		spin_unlock_irqrestore(&fep->tmreg_lock, flags);
 		return -EINVAL;
 	}
 
@@ -417,11 +415,9 @@  static int fec_ptp_settime(struct ptp_clock_info *ptp,
 	 */
 	counter = ns & fep->cc.mask;
 
-	spin_lock_irqsave(&fep->tmreg_lock, flags);
 	writel(counter, fep->hwp + FEC_ATIME);
 	timecounter_init(&fep->tc, &fep->cc, ns);
 	spin_unlock_irqrestore(&fep->tmreg_lock, flags);
-	mutex_unlock(&fep->ptp_clk_mutex);
 	return 0;
 }
 
@@ -518,13 +514,11 @@  static void fec_time_keep(struct work_struct *work)
 	struct fec_enet_private *fep = container_of(dwork, struct fec_enet_private, time_keep);
 	unsigned long flags;
 
-	mutex_lock(&fep->ptp_clk_mutex);
+	spin_lock_irqsave(&fep->tmreg_lock, flags);
 	if (fep->ptp_clk_on) {
-		spin_lock_irqsave(&fep->tmreg_lock, flags);
 		timecounter_read(&fep->tc);
-		spin_unlock_irqrestore(&fep->tmreg_lock, flags);
 	}
-	mutex_unlock(&fep->ptp_clk_mutex);
+	spin_unlock_irqrestore(&fep->tmreg_lock, flags);
 
 	schedule_delayed_work(&fep->time_keep, HZ);
 }
@@ -599,8 +593,6 @@  void fec_ptp_init(struct platform_device *pdev, int irq_idx)
 	}
 	fep->ptp_inc = NSEC_PER_SEC / fep->cycle_speed;
 
-	spin_lock_init(&fep->tmreg_lock);
-
 	fec_ptp_start_cyclecounter(ndev);
 
 	INIT_DELAYED_WORK(&fep->time_keep, fec_time_keep);