Message ID | 20220901140402.64804-1-csokas.bence@prolan.hu (mailing list archive) |
---|---|
State | Accepted |
Commit | b353b241f1eb9b6265358ffbe2632fdcb563354f |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [v2] net: fec: Use a spinlock to guard `fep->ptp_clk_on` | expand |
On Thu, Sep 01, 2022 at 04:04:03PM +0200, Csókás Bence wrote: > Mutexes cannot be taken in a non-preemptible context, > causing a panic in `fec_ptp_save_state()`. Replacing > `ptp_clk_mutex` by `tmreg_lock` fixes this. > > Fixes: 6a4d7234ae9a ("net: fec: ptp: avoid register access when ipg clock is disabled") This should be removed, there was no issue with just this commit. Am I wrong? One of the reasons of the Fixes tag is to backport to stable releases, and you do not want this commit in 5.15.x. Francesco
On 9/1/2022 7:04 AM, Csókás Bence wrote: > Mutexes cannot be taken in a non-preemptible context, > causing a panic in `fec_ptp_save_state()`. Replacing > `ptp_clk_mutex` by `tmreg_lock` fixes this. > > Fixes: 6a4d7234ae9a ("net: fec: ptp: avoid register access when ipg clock is disabled") > Fixes: f79959220fa5 ("fec: Restart PPS after link state change") > Reported-by: Marc Kleine-Budde <mkl@pengutronix.de> > Link: https://lore.kernel.org/all/20220827160922.642zlcd5foopozru@pengutronix.de/ > Signed-off-by: Csókás Bence <csokas.bence@prolan.hu> > --- [snip] > schedule_delayed_work(&fep->time_keep, HZ); > } > @@ -599,8 +593,6 @@ void fec_ptp_init(struct platform_device *pdev, int irq_idx) > } > fep->ptp_inc = NSEC_PER_SEC / fep->cycle_speed; > > - spin_lock_init(&fep->tmreg_lock); This change needs to be kept as there is no other code in the driver that would initialize the tmreg_lock otherwise. Try building a kernel with spinlock debugging enabled and you should see it barf with an incorrect spinlock bad magic.
On 2022. 09. 01. 17:04, Francesco Dolcini wrote: > On Thu, Sep 01, 2022 at 04:04:03PM +0200, Csókás Bence wrote: >> Mutexes cannot be taken in a non-preemptible context, >> causing a panic in `fec_ptp_save_state()`. Replacing >> `ptp_clk_mutex` by `tmreg_lock` fixes this. >> >> Fixes: 6a4d7234ae9a ("net: fec: ptp: avoid register access when ipg clock is disabled") > This should be removed, there was no issue with just this commit. Am I > wrong? > > One of the reasons of the Fixes tag is to backport to stable releases, > and you do not want this commit in 5.15.x. I do want it, actually, because I got a PATCH AUTOSEL already for f79959220fa5fbda939592bf91c7a9ea90419040 for 5.15 & 5.19, but I told them not to backport until this fix is merged too. Bence
On 2022. 09. 01. 18:26, Florian Fainelli wrote: > >> schedule_delayed_work(&fep->time_keep, HZ); >> } >> @@ -599,8 +593,6 @@ void fec_ptp_init(struct platform_device *pdev, >> int irq_idx) >> } >> fep->ptp_inc = NSEC_PER_SEC / fep->cycle_speed; >> - spin_lock_init(&fep->tmreg_lock); > > This change needs to be kept as there is no other code in the driver > that would initialize the tmreg_lock otherwise. Try building a kernel > with spinlock debugging enabled and you should see it barf with an > incorrect spinlock bad magic. `fec_ptp_init()` is called from `fec_probe()`, which init's the spinlock: > @@ -3907,7 +3908,7 @@ fec_probe(struct platform_device *pdev) > } > > fep->ptp_clk_on = false; > - mutex_init(&fep->ptp_clk_mutex); > + spin_lock_init(&fep->tmreg_lock); Bence
On 9/2/2022 12:35 AM, Csókás Bence wrote: > > On 2022. 09. 01. 18:26, Florian Fainelli wrote: >> >>> schedule_delayed_work(&fep->time_keep, HZ); >>> } >>> @@ -599,8 +593,6 @@ void fec_ptp_init(struct platform_device *pdev, >>> int irq_idx) >>> } >>> fep->ptp_inc = NSEC_PER_SEC / fep->cycle_speed; >>> - spin_lock_init(&fep->tmreg_lock); >> >> This change needs to be kept as there is no other code in the driver >> that would initialize the tmreg_lock otherwise. Try building a kernel >> with spinlock debugging enabled and you should see it barf with an >> incorrect spinlock bad magic. > > `fec_ptp_init()` is called from `fec_probe()`, which init's the spinlock: > > > @@ -3907,7 +3908,7 @@ fec_probe(struct platform_device *pdev) > > } > > > > fep->ptp_clk_on = false; > > - mutex_init(&fep->ptp_clk_mutex); > > + spin_lock_init(&fep->tmreg_lock); Ah indeed, I missed that, thanks!
On Thu, Sep 01, 2022 at 04:04:03PM +0200, Csókás Bence wrote: > Mutexes cannot be taken in a non-preemptible context, > causing a panic in `fec_ptp_save_state()`. Replacing > `ptp_clk_mutex` by `tmreg_lock` fixes this. > > Fixes: 6a4d7234ae9a ("net: fec: ptp: avoid register access when ipg clock is disabled") > Fixes: f79959220fa5 ("fec: Restart PPS after link state change") > Reported-by: Marc Kleine-Budde <mkl@pengutronix.de> > Link: https://lore.kernel.org/all/20220827160922.642zlcd5foopozru@pengutronix.de/ > Signed-off-by: Csókás Bence <csokas.bence@prolan.hu> Tested-by: Francesco Dolcini <francesco.dolcini@toradex.com> # Toradex Apalis iMX6
Hello: This patch was applied to netdev/net.git (master) by Jakub Kicinski <kuba@kernel.org>: On Thu, 1 Sep 2022 16:04:03 +0200 you wrote: > Mutexes cannot be taken in a non-preemptible context, > causing a panic in `fec_ptp_save_state()`. Replacing > `ptp_clk_mutex` by `tmreg_lock` fixes this. > > Fixes: 6a4d7234ae9a ("net: fec: ptp: avoid register access when ipg clock is disabled") > Fixes: f79959220fa5 ("fec: Restart PPS after link state change") > Reported-by: Marc Kleine-Budde <mkl@pengutronix.de> > Link: https://lore.kernel.org/all/20220827160922.642zlcd5foopozru@pengutronix.de/ > Signed-off-by: Csókás Bence <csokas.bence@prolan.hu> > > [...] Here is the summary with links: - [v2] net: fec: Use a spinlock to guard `fep->ptp_clk_on` https://git.kernel.org/netdev/net/c/b353b241f1eb You are awesome, thank you!
On 9/1/22 4:04 PM, Csókás Bence wrote: > Mutexes cannot be taken in a non-preemptible context, > causing a panic in `fec_ptp_save_state()`. Replacing > `ptp_clk_mutex` by `tmreg_lock` fixes this. I was on holidays, but this doesn't look good. > Fixes: 6a4d7234ae9a ("net: fec: ptp: avoid register access when ipg clock is disabled") > Fixes: f79959220fa5 ("fec: Restart PPS after link state change") > Reported-by: Marc Kleine-Budde <mkl@pengutronix.de> > Link: https://lore.kernel.org/all/20220827160922.642zlcd5foopozru@pengutronix.de/ > Signed-off-by: Csókás Bence <csokas.bence@prolan.hu> > --- > drivers/net/ethernet/freescale/fec.h | 1 - > drivers/net/ethernet/freescale/fec_main.c | 17 +++++++------- > drivers/net/ethernet/freescale/fec_ptp.c | 28 ++++++++--------------- > 3 files changed, 19 insertions(+), 27 deletions(-) > > diff --git a/drivers/net/ethernet/freescale/fec.h b/drivers/net/ethernet/freescale/fec.h > index 0cebe4b63adb..38f095260e1f 100644 > --- a/drivers/net/ethernet/freescale/fec.h > +++ b/drivers/net/ethernet/freescale/fec.h > @@ -557,7 +557,6 @@ struct fec_enet_private { > struct clk *clk_2x_txclk; > > bool ptp_clk_on; > - struct mutex ptp_clk_mutex; > unsigned int num_tx_queues; > unsigned int num_rx_queues; > > diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c > index b0d60f898249..ab1ee9508f76 100644 > --- a/drivers/net/ethernet/freescale/fec_main.c > +++ b/drivers/net/ethernet/freescale/fec_main.c > @@ -2028,6 +2028,7 @@ static void fec_enet_phy_reset_after_clk_enable(struct net_device *ndev) > static int fec_enet_clk_enable(struct net_device *ndev, bool enable) > { > struct fec_enet_private *fep = netdev_priv(ndev); > + unsigned long flags; > int ret; > > if (enable) { > @@ -2036,15 +2037,15 @@ static int fec_enet_clk_enable(struct net_device *ndev, bool enable) > return ret; > > if (fep->clk_ptp) { > - mutex_lock(&fep->ptp_clk_mutex); > + spin_lock_irqsave(&fep->tmreg_lock, flags); > ret = clk_prepare_enable(fep->clk_ptp); clock_prepare() (and thus clk_prepare_enable()) must not be called from atomic context. Marc
On 05.09.2022 09:38:04, Marc Kleine-Budde wrote: > On 9/1/22 4:04 PM, Csókás Bence wrote: > > Mutexes cannot be taken in a non-preemptible context, > > causing a panic in `fec_ptp_save_state()`. Replacing > > `ptp_clk_mutex` by `tmreg_lock` fixes this. > > I was on holidays, but this doesn't look good. Does anyone care to fix this? Csókás? > > @@ -2036,15 +2037,15 @@ static int fec_enet_clk_enable(struct net_device *ndev, bool enable) > > return ret; > > > > if (fep->clk_ptp) { > > - mutex_lock(&fep->ptp_clk_mutex); > > + spin_lock_irqsave(&fep->tmreg_lock, flags); > > ret = clk_prepare_enable(fep->clk_ptp); > > clock_prepare() (and thus clk_prepare_enable()) must not be called from atomic > context. You cannot call clk_prepare_enable() from atomic context. If you compile your kernel with lockdep, you'll get this splat: | [ 5.907789] BUG: sleeping function called from invalid context at drivers/net/ethernet/freescale/fec_main.c:2071 | [ 5.918140] in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 1, name: swapper/0 | [ 5.926181] preempt_count: 1, expected: 0 | [ 5.930223] 2 locks held by swapper/0/1: | [ 5.934180] #0: c41a4094 (&dev->mutex){....}-{4:4}, at: __driver_attach+0x8c/0x150 | [ 5.941968] #1: c42439b0 (&fep->tmreg_lock){....}-{3:3}, at: fec_enet_clk_enable+0x5c/0x25c | [ 5.950533] irq event stamp: 124698 | [ 5.954052] hardirqs last enabled at (124697): [<c0f6c6dc>] _raw_spin_unlock_irqrestore+0x58/0x6c | [ 5.963058] hardirqs last disabled at (124698): [<c0f6c490>] _raw_spin_lock_irqsave+0x88/0xa4 | [ 5.971622] softirqs last enabled at (122454): [<c02b47bc>] bdi_register_va+0x1ac/0x1d8 | [ 5.979753] softirqs last disabled at (122452): [<c02b4738>] bdi_register_va+0x128/0x1d8 | $ head -2071 drivers/net/ethernet/freescale/fec_main.c | tail -15 | | static int fec_enet_clk_enable(struct net_device *ndev, bool enable) | { | struct fec_enet_private *fep = netdev_priv(ndev); | unsigned long flags; | int ret; | | if (enable) { | ret = clk_prepare_enable(fep->clk_enet_out); | if (ret) | return ret; | | if (fep->clk_ptp) { | spin_lock_irqsave(&fep->tmreg_lock, flags); | ret = clk_prepare_enable(fep->clk_ptp); regards, Marc
On Wed, Sep 07, 2022 at 04:39:15PM +0200, Marc Kleine-Budde wrote: > On 05.09.2022 09:38:04, Marc Kleine-Budde wrote: > > On 9/1/22 4:04 PM, Csókás Bence wrote: > > > Mutexes cannot be taken in a non-preemptible context, > > > causing a panic in `fec_ptp_save_state()`. Replacing > > > `ptp_clk_mutex` by `tmreg_lock` fixes this. > > > > I was on holidays, but this doesn't look good. > > Does anyone care to fix this? Csókás? I do care, however I do not really have resources for anything better than reverting both of the problematic commits. If we do not have a fix, I guess this is the only way. Csókás? For some reason when I tested on imx6 :-( I had no warning, now the change was merged and while doing a quick test on a imx7 I got new issues. [ 37.061582] ======================================================== [ 37.070203] WARNING: possible irq lock inversion dependency detected [ 37.078847] 6.0.0-rc4-00137-g506357871c18 #3 Tainted: G W [ 37.087944] -------------------------------------------------------- [ 37.096615] systemd/580 just changed the state of lock: [ 37.104160] c218f2a4 (&dev->tx_global_lock){+.-.}-{2:2}, at: dev_watchdog+0x18/0x2c4 [ 37.116691] but this lock took another, SOFTIRQ-unsafe lock in the past: [ 37.125896] (&fep->tmreg_lock){+.+.}-{2:2} [ 37.125921] [ 37.125921] [ 37.125921] and interrupts could create inverse lock ordering between them. [ 37.125921] [ 37.153018] [ 37.153018] other info that might help us debug this: [ 37.163824] Possible interrupt unsafe locking scenario: [ 37.163824] [ 37.174714] CPU0 CPU1 [ 37.181215] ---- ---- [ 37.187631] lock(&fep->tmreg_lock); [ 37.193110] local_irq_disable(); [ 37.200841] lock(&dev->tx_global_lock); [ 37.209153] lock(&fep->tmreg_lock); [ 37.217061] <Interrupt> [ 37.221328] lock(&dev->tx_global_lock); [ 37.227167] [ 37.227167] *** DEADLOCK *** that just goes away reverting this patch. Francesco
> On 05.09.2022 09:38:04, Marc Kleine-Budde wrote: > > On 9/1/22 4:04 PM, Csókás Bence wrote: > > > Mutexes cannot be taken in a non-preemptible context, > > > causing a panic in `fec_ptp_save_state()`. Replacing > > > `ptp_clk_mutex` by `tmreg_lock` fixes this. > > > >I was on holidays, but this doesn't look good. > > Does anyone care to fix this? Csókás? Yes, I will attempt to fix it. But, like you the week before, I am also out of office right now, so please be patient. Bence
On Thu, Sep 01, 2022 at 04:04:03PM +0200, Csókás Bence wrote: > Mutexes cannot be taken in a non-preemptible context, > causing a panic in `fec_ptp_save_state()`. Replacing > `ptp_clk_mutex` by `tmreg_lock` fixes this. > > Fixes: 6a4d7234ae9a ("net: fec: ptp: avoid register access when ipg clock is disabled") > Fixes: f79959220fa5 ("fec: Restart PPS after link state change") > Reported-by: Marc Kleine-Budde <mkl@pengutronix.de> > Link: https://lore.kernel.org/all/20220827160922.642zlcd5foopozru@pengutronix.de/ > Signed-off-by: Csókás Bence <csokas.bence@prolan.hu> For regzbot: This patch results in the following backtrace. [ 18.401688] BUG: sleeping function called from invalid context at drivers/clk/imx/clk-pllv3.c:68 [ 18.402277] in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 1, name: swapper/0 [ 18.402531] preempt_count: 1, expected: 0 [ 18.402781] 3 locks held by swapper/0/1: [ 18.402967] #0: c423ac8c (&dev->mutex){....}-{3:3}, at: __driver_attach+0x80/0x158 [ 18.404364] #1: c40dc8e8 (&fep->tmreg_lock){....}-{2:2}, at: fec_enet_clk_enable+0x58/0x250 [ 18.404752] #2: c1a71af8 (prepare_lock){+.+.}-{3:3}, at: clk_prepare_lock+0xc/0xd4 [ 18.405246] irq event stamp: 129384 [ 18.405403] hardirqs last enabled at (129383): [<c10850b0>] _raw_spin_unlock_irqrestore+0x50/0x64 [ 18.405667] hardirqs last disabled at (129384): [<c1084e70>] _raw_spin_lock_irqsave+0x64/0x68 [ 18.405915] softirqs last enabled at (129218): [<c01017bc>] __do_softirq+0x2ac/0x604 [ 18.406255] softirqs last disabled at (129209): [<c012eee4>] __irq_exit_rcu+0x138/0x17c [ 18.406792] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G N 6.0.0-rc5 #1 [ 18.407131] Hardware name: Freescale i.MX7 Dual (Device Tree) [ 18.407590] unwind_backtrace from show_stack+0x10/0x14 [ 18.407890] show_stack from dump_stack_lvl+0x68/0x90 [ 18.408097] dump_stack_lvl from __might_resched+0x17c/0x284 [ 18.408328] __might_resched from clk_pllv3_wait_lock+0x4c/0xcc [ 18.408557] clk_pllv3_wait_lock from clk_core_prepare+0xc4/0x328 [ 18.408783] clk_core_prepare from clk_core_prepare+0x50/0x328 [ 18.408986] clk_core_prepare from clk_core_prepare+0x50/0x328 [ 18.409205] clk_core_prepare from clk_core_prepare+0x50/0x328 [ 18.409416] clk_core_prepare from clk_core_prepare+0x50/0x328 [ 18.409631] clk_core_prepare from clk_core_prepare+0x50/0x328 [ 18.409847] clk_core_prepare from clk_core_prepare+0x50/0x328 [ 18.410065] clk_core_prepare from clk_core_prepare+0x50/0x328 [ 18.410284] clk_core_prepare from clk_core_prepare+0x50/0x328 [ 18.410513] clk_core_prepare from clk_core_prepare+0x50/0x328 [ 18.410729] clk_core_prepare from clk_prepare+0x20/0x30 [ 18.410936] clk_prepare from fec_enet_clk_enable+0x68/0x250 [ 18.411143] fec_enet_clk_enable from fec_probe+0x32c/0x1430 [ 18.411352] fec_probe from platform_probe+0x58/0xbc [ 18.411558] platform_probe from really_probe+0xc4/0x2f4 [ 18.411772] really_probe from __driver_probe_device+0x80/0xe4 [ 18.411983] __driver_probe_device from driver_probe_device+0x2c/0xc4 [ 18.412203] driver_probe_device from __driver_attach+0x8c/0x158 [ 18.412418] __driver_attach from bus_for_each_dev+0x74/0xc0 [ 18.412631] bus_for_each_dev from bus_add_driver+0x154/0x1e8 [ 18.412844] bus_add_driver from driver_register+0x88/0x11c [ 18.413055] driver_register from do_one_initcall+0x68/0x428 [ 18.413271] do_one_initcall from kernel_init_freeable+0x18c/0x240 [ 18.413502] kernel_init_freeable from kernel_init+0x14/0x144 [ 18.413721] kernel_init from ret_from_fork+0x14/0x28 [ 18.414036] Exception stack(0xd0825fb0 to 0xd0825ff8) [ 18.414523] 5fa0: 00000000 00000000 00000000 00000000 [ 18.414792] 5fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 [ 18.415032] 5fe0: 00000000 00000000 00000000 00000000 00000013 00000000
TWIMC: this mail is primarily send for documentation purposes and for regzbot, my Linux kernel regression tracking bot. These mails usually contain '#forregzbot' in the subject, to make them easy to spot and filter. On 14.09.22 15:53, Guenter Roeck wrote: > On Thu, Sep 01, 2022 at 04:04:03PM +0200, Csókás Bence wrote: >> Mutexes cannot be taken in a non-preemptible context, >> causing a panic in `fec_ptp_save_state()`. Replacing >> `ptp_clk_mutex` by `tmreg_lock` fixes this. >> >> Fixes: 6a4d7234ae9a ("net: fec: ptp: avoid register access when ipg clock is disabled") >> Fixes: f79959220fa5 ("fec: Restart PPS after link state change") >> Reported-by: Marc Kleine-Budde <mkl@pengutronix.de> >> Link: https://lore.kernel.org/all/20220827160922.642zlcd5foopozru@pengutronix.de/ >> Signed-off-by: Csókás Bence <csokas.bence@prolan.hu> > > For regzbot: Thanks for the report. To be sure below issue doesn't fall through the cracks unnoticed, I'm adding it to regzbot, my Linux kernel regression tracking bot: #regzbot ^introduced b353b241f1eb9b626 #regzbot title net: fec: backtrace: BUG: sleeping function called from invalid context at drivers/clk/imx/clk-pllv3.c #regzbot ignore-activity #regzbot monitor: https://lore.kernel.org/all/20220912073106.2544207-1-bence98@sch.bme.hu/ #regzbot monitor: https://lore.kernel.org/all/20220912070143.98153-1-francesco.dolcini@toradex.com/ > This patch results in the following backtrace. > > [ 18.401688] BUG: sleeping function called from invalid context at drivers/clk/imx/clk-pllv3.c:68 > [ 18.402277] in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 1, name: swapper/0 > [ 18.402531] preempt_count: 1, expected: 0 > [ 18.402781] 3 locks held by swapper/0/1: > [ 18.402967] #0: c423ac8c (&dev->mutex){....}-{3:3}, at: __driver_attach+0x80/0x158 > [ 18.404364] #1: c40dc8e8 (&fep->tmreg_lock){....}-{2:2}, at: fec_enet_clk_enable+0x58/0x250 > [ 18.404752] #2: c1a71af8 (prepare_lock){+.+.}-{3:3}, at: clk_prepare_lock+0xc/0xd4 > [ 18.405246] irq event stamp: 129384 > [ 18.405403] hardirqs last enabled at (129383): [<c10850b0>] _raw_spin_unlock_irqrestore+0x50/0x64 > [ 18.405667] hardirqs last disabled at (129384): [<c1084e70>] _raw_spin_lock_irqsave+0x64/0x68 > [ 18.405915] softirqs last enabled at (129218): [<c01017bc>] __do_softirq+0x2ac/0x604 > [ 18.406255] softirqs last disabled at (129209): [<c012eee4>] __irq_exit_rcu+0x138/0x17c > [ 18.406792] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G N 6.0.0-rc5 #1 > [ 18.407131] Hardware name: Freescale i.MX7 Dual (Device Tree) > [ 18.407590] unwind_backtrace from show_stack+0x10/0x14 > [ 18.407890] show_stack from dump_stack_lvl+0x68/0x90 > [ 18.408097] dump_stack_lvl from __might_resched+0x17c/0x284 > [ 18.408328] __might_resched from clk_pllv3_wait_lock+0x4c/0xcc > [ 18.408557] clk_pllv3_wait_lock from clk_core_prepare+0xc4/0x328 > [ 18.408783] clk_core_prepare from clk_core_prepare+0x50/0x328 > [ 18.408986] clk_core_prepare from clk_core_prepare+0x50/0x328 > [ 18.409205] clk_core_prepare from clk_core_prepare+0x50/0x328 > [ 18.409416] clk_core_prepare from clk_core_prepare+0x50/0x328 > [ 18.409631] clk_core_prepare from clk_core_prepare+0x50/0x328 > [ 18.409847] clk_core_prepare from clk_core_prepare+0x50/0x328 > [ 18.410065] clk_core_prepare from clk_core_prepare+0x50/0x328 > [ 18.410284] clk_core_prepare from clk_core_prepare+0x50/0x328 > [ 18.410513] clk_core_prepare from clk_core_prepare+0x50/0x328 > [ 18.410729] clk_core_prepare from clk_prepare+0x20/0x30 > [ 18.410936] clk_prepare from fec_enet_clk_enable+0x68/0x250 > [ 18.411143] fec_enet_clk_enable from fec_probe+0x32c/0x1430 > [ 18.411352] fec_probe from platform_probe+0x58/0xbc > [ 18.411558] platform_probe from really_probe+0xc4/0x2f4 > [ 18.411772] really_probe from __driver_probe_device+0x80/0xe4 > [ 18.411983] __driver_probe_device from driver_probe_device+0x2c/0xc4 > [ 18.412203] driver_probe_device from __driver_attach+0x8c/0x158 > [ 18.412418] __driver_attach from bus_for_each_dev+0x74/0xc0 > [ 18.412631] bus_for_each_dev from bus_add_driver+0x154/0x1e8 > [ 18.412844] bus_add_driver from driver_register+0x88/0x11c > [ 18.413055] driver_register from do_one_initcall+0x68/0x428 > [ 18.413271] do_one_initcall from kernel_init_freeable+0x18c/0x240 > [ 18.413502] kernel_init_freeable from kernel_init+0x14/0x144 > [ 18.413721] kernel_init from ret_from_fork+0x14/0x28 > [ 18.414036] Exception stack(0xd0825fb0 to 0xd0825ff8) > [ 18.414523] 5fa0: 00000000 00000000 00000000 00000000 > [ 18.414792] 5fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 > [ 18.415032] 5fe0: 00000000 00000000 00000000 00000000 00000013 00000000 > > > >
On 18.09.22 20:59, Thorsten Leemhuis wrote: > TWIMC: this mail is primarily send for documentation purposes and for > regzbot, my Linux kernel regression tracking bot. These mails usually > contain '#forregzbot' in the subject, to make them easy to spot and filter. > > On 14.09.22 15:53, Guenter Roeck wrote: >> On Thu, Sep 01, 2022 at 04:04:03PM +0200, Csókás Bence wrote: >>> Mutexes cannot be taken in a non-preemptible context, >>> causing a panic in `fec_ptp_save_state()`. Replacing >>> `ptp_clk_mutex` by `tmreg_lock` fixes this. >>> >>> Fixes: 6a4d7234ae9a ("net: fec: ptp: avoid register access when ipg clock is disabled") >>> Fixes: f79959220fa5 ("fec: Restart PPS after link state change") >>> Reported-by: Marc Kleine-Budde <mkl@pengutronix.de> >>> Link: https://lore.kernel.org/all/20220827160922.642zlcd5foopozru@pengutronix.de/ >>> Signed-off-by: Csókás Bence <csokas.bence@prolan.hu> >> >> For regzbot: > > Thanks for the report. To be sure below issue doesn't fall through the > cracks unnoticed, I'm adding it to regzbot, my Linux kernel regression > tracking bot: > > #regzbot ^introduced b353b241f1eb9b626 > #regzbot title net: fec: backtrace: BUG: sleeping function called from > invalid context at drivers/clk/imx/clk-pllv3.c > #regzbot ignore-activity > #regzbot monitor: > https://lore.kernel.org/all/20220912073106.2544207-1-bence98@sch.bme.hu/ > #regzbot monitor: > https://lore.kernel.org/all/20220912070143.98153-1-francesco.dolcini@toradex.com/ #regzbot fixed-by: 01b825f997ac28f
diff --git a/drivers/net/ethernet/freescale/fec.h b/drivers/net/ethernet/freescale/fec.h index 0cebe4b63adb..38f095260e1f 100644 --- a/drivers/net/ethernet/freescale/fec.h +++ b/drivers/net/ethernet/freescale/fec.h @@ -557,7 +557,6 @@ struct fec_enet_private { struct clk *clk_2x_txclk; bool ptp_clk_on; - struct mutex ptp_clk_mutex; unsigned int num_tx_queues; unsigned int num_rx_queues; diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c index b0d60f898249..ab1ee9508f76 100644 --- a/drivers/net/ethernet/freescale/fec_main.c +++ b/drivers/net/ethernet/freescale/fec_main.c @@ -2028,6 +2028,7 @@ static void fec_enet_phy_reset_after_clk_enable(struct net_device *ndev) static int fec_enet_clk_enable(struct net_device *ndev, bool enable) { struct fec_enet_private *fep = netdev_priv(ndev); + unsigned long flags; int ret; if (enable) { @@ -2036,15 +2037,15 @@ static int fec_enet_clk_enable(struct net_device *ndev, bool enable) return ret; if (fep->clk_ptp) { - mutex_lock(&fep->ptp_clk_mutex); + spin_lock_irqsave(&fep->tmreg_lock, flags); ret = clk_prepare_enable(fep->clk_ptp); if (ret) { - mutex_unlock(&fep->ptp_clk_mutex); + spin_unlock_irqrestore(&fep->tmreg_lock, flags); goto failed_clk_ptp; } else { fep->ptp_clk_on = true; } - mutex_unlock(&fep->ptp_clk_mutex); + spin_unlock_irqrestore(&fep->tmreg_lock, flags); } ret = clk_prepare_enable(fep->clk_ref); @@ -2059,10 +2060,10 @@ static int fec_enet_clk_enable(struct net_device *ndev, bool enable) } else { clk_disable_unprepare(fep->clk_enet_out); if (fep->clk_ptp) { - mutex_lock(&fep->ptp_clk_mutex); + spin_lock_irqsave(&fep->tmreg_lock, flags); clk_disable_unprepare(fep->clk_ptp); fep->ptp_clk_on = false; - mutex_unlock(&fep->ptp_clk_mutex); + spin_unlock_irqrestore(&fep->tmreg_lock, flags); } clk_disable_unprepare(fep->clk_ref); clk_disable_unprepare(fep->clk_2x_txclk); @@ -2075,10 +2076,10 @@ static int fec_enet_clk_enable(struct net_device *ndev, bool enable) clk_disable_unprepare(fep->clk_ref); failed_clk_ref: if (fep->clk_ptp) { - mutex_lock(&fep->ptp_clk_mutex); + spin_lock_irqsave(&fep->tmreg_lock, flags); clk_disable_unprepare(fep->clk_ptp); fep->ptp_clk_on = false; - mutex_unlock(&fep->ptp_clk_mutex); + spin_unlock_irqrestore(&fep->tmreg_lock, flags); } failed_clk_ptp: clk_disable_unprepare(fep->clk_enet_out); @@ -3907,7 +3908,7 @@ fec_probe(struct platform_device *pdev) } fep->ptp_clk_on = false; - mutex_init(&fep->ptp_clk_mutex); + spin_lock_init(&fep->tmreg_lock); /* clk_ref is optional, depends on board */ fep->clk_ref = devm_clk_get_optional(&pdev->dev, "enet_clk_ref"); diff --git a/drivers/net/ethernet/freescale/fec_ptp.c b/drivers/net/ethernet/freescale/fec_ptp.c index c74d04f4b2fd..8dd5a2615a89 100644 --- a/drivers/net/ethernet/freescale/fec_ptp.c +++ b/drivers/net/ethernet/freescale/fec_ptp.c @@ -365,21 +365,19 @@ static int fec_ptp_adjtime(struct ptp_clock_info *ptp, s64 delta) */ static int fec_ptp_gettime(struct ptp_clock_info *ptp, struct timespec64 *ts) { - struct fec_enet_private *adapter = + struct fec_enet_private *fep = container_of(ptp, struct fec_enet_private, ptp_caps); u64 ns; unsigned long flags; - mutex_lock(&adapter->ptp_clk_mutex); + spin_lock_irqsave(&fep->tmreg_lock, flags); /* Check the ptp clock */ - if (!adapter->ptp_clk_on) { - mutex_unlock(&adapter->ptp_clk_mutex); + if (!fep->ptp_clk_on) { + spin_unlock_irqrestore(&fep->tmreg_lock, flags); return -EINVAL; } - spin_lock_irqsave(&adapter->tmreg_lock, flags); - ns = timecounter_read(&adapter->tc); - spin_unlock_irqrestore(&adapter->tmreg_lock, flags); - mutex_unlock(&adapter->ptp_clk_mutex); + ns = timecounter_read(&fep->tc); + spin_unlock_irqrestore(&fep->tmreg_lock, flags); *ts = ns_to_timespec64(ns); @@ -404,10 +402,10 @@ static int fec_ptp_settime(struct ptp_clock_info *ptp, unsigned long flags; u32 counter; - mutex_lock(&fep->ptp_clk_mutex); + spin_lock_irqsave(&fep->tmreg_lock, flags); /* Check the ptp clock */ if (!fep->ptp_clk_on) { - mutex_unlock(&fep->ptp_clk_mutex); + spin_unlock_irqrestore(&fep->tmreg_lock, flags); return -EINVAL; } @@ -417,11 +415,9 @@ static int fec_ptp_settime(struct ptp_clock_info *ptp, */ counter = ns & fep->cc.mask; - spin_lock_irqsave(&fep->tmreg_lock, flags); writel(counter, fep->hwp + FEC_ATIME); timecounter_init(&fep->tc, &fep->cc, ns); spin_unlock_irqrestore(&fep->tmreg_lock, flags); - mutex_unlock(&fep->ptp_clk_mutex); return 0; } @@ -518,13 +514,11 @@ static void fec_time_keep(struct work_struct *work) struct fec_enet_private *fep = container_of(dwork, struct fec_enet_private, time_keep); unsigned long flags; - mutex_lock(&fep->ptp_clk_mutex); + spin_lock_irqsave(&fep->tmreg_lock, flags); if (fep->ptp_clk_on) { - spin_lock_irqsave(&fep->tmreg_lock, flags); timecounter_read(&fep->tc); - spin_unlock_irqrestore(&fep->tmreg_lock, flags); } - mutex_unlock(&fep->ptp_clk_mutex); + spin_unlock_irqrestore(&fep->tmreg_lock, flags); schedule_delayed_work(&fep->time_keep, HZ); } @@ -599,8 +593,6 @@ void fec_ptp_init(struct platform_device *pdev, int irq_idx) } fep->ptp_inc = NSEC_PER_SEC / fep->cycle_speed; - spin_lock_init(&fep->tmreg_lock); - fec_ptp_start_cyclecounter(ndev); INIT_DELAYED_WORK(&fep->time_keep, fec_time_keep);
Mutexes cannot be taken in a non-preemptible context, causing a panic in `fec_ptp_save_state()`. Replacing `ptp_clk_mutex` by `tmreg_lock` fixes this. Fixes: 6a4d7234ae9a ("net: fec: ptp: avoid register access when ipg clock is disabled") Fixes: f79959220fa5 ("fec: Restart PPS after link state change") Reported-by: Marc Kleine-Budde <mkl@pengutronix.de> Link: https://lore.kernel.org/all/20220827160922.642zlcd5foopozru@pengutronix.de/ Signed-off-by: Csókás Bence <csokas.bence@prolan.hu> --- drivers/net/ethernet/freescale/fec.h | 1 - drivers/net/ethernet/freescale/fec_main.c | 17 +++++++------- drivers/net/ethernet/freescale/fec_ptp.c | 28 ++++++++--------------- 3 files changed, 19 insertions(+), 27 deletions(-)