Message ID | 20180126091845.11902-1-m.szyprowski@samsung.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 01/26, Marek Szyprowski wrote: > If orphaned clock has been already prepared/enabled (for example if it or > one of its children has CLK_IS_CRITICAL flag), then the prepare/enable > counters of the newly assigned parent are not updated correctly. This > might later cause warnings during changing clock parents. This doesn't feel right. Perhaps we should delay enabling a clk if it's CRITICAL until we adopt an orphaned clk. Good news is we have orphan status tracking now so this should be pretty simple. Otherwise migrating the count up is complicated and requires us to call the prepare/enable ops on a critical clk and then keep doing that each time it gets re-parented. Do you have this case, where some clk is marked as CRITICAL, and then we need to migrate that enable/prepare count to the parent? Hopefully it isn't the worser case, where the clk is handed out to some consumer but it's still orphaned at that point, and then we have little control over the migration of state to the parent.
Hi Stephen On 2018-01-27 01:49, Stephen Boyd wrote: > On 01/26, Marek Szyprowski wrote: >> If orphaned clock has been already prepared/enabled (for example if it or >> one of its children has CLK_IS_CRITICAL flag), then the prepare/enable >> counters of the newly assigned parent are not updated correctly. This >> might later cause warnings during changing clock parents. > This doesn't feel right. Perhaps we should delay enabling a clk > if it's CRITICAL until we adopt an orphaned clk. Good news is we > have orphan status tracking now so this should be pretty simple. Not really. It won't be so simple, because we would need to track the status of the whole clock sub-tree for enabling critical clocks. Even then we would need to delay enabling them until the real top clock is registered. > Otherwise migrating the count up is complicated and requires us > to call the prepare/enable ops on a critical clk and then keep > doing that each time it gets re-parented. Do you have this case, > where some clk is marked as CRITICAL, and then we need to migrate > that enable/prepare count to the parent? Yes, this is the case of Exynos5422-based boards, where we have a bunch of critical clocks, which are reparented from internal PLL to external oscillator when respective power domain is suspended. It took me a while to gather all the needed information from the debug logs. Here is one example of such clock tree: fin_pll (clk1, top clock, external pll) fout_dpll (clk2) mout_sclk_dpll (clk3) mout_aclk300_disp1 (clk4) dout_aclk300_disp1 (clk5) mout_sw_aclk300_disp1 (clk6) mout_user_aclk300_disp1 (clk7) aclk300_disp1 (clk8, critical) Before turning off power domains, aclk300_disp1 is reparented directly to fin_pll (this is a hardware requirement). When power domain is turned on again, aclk300_disp1 is reparented back to mout_user_aclk300_disp1. Clocks are registered in the following order: 1, 2, 7, 3, 6, 5, 8, 4. In this case when critical clock is registered, clk4 is not yet available, so clocks 1-3 won't get proper prepare/enable count. Then this causes a warning during reparenting clk8 to clk1: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 59 at drivers/clk/clk.c:811 clk_core_disable_lock+0x18/0x24 Modules linked in: CPU: 0 PID: 59 Comm: kworker/0:1 Not tainted 4.15.0-next-20180130-dirty #131 Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) Workqueue: pm genpd_power_off_work_fn [<c01114e4>] (unwind_backtrace) from [<c010dbc0>] (show_stack+0x10/0x14) [<c010dbc0>] (show_stack) from [<c09b7b94>] (dump_stack+0x90/0xc8) [<c09b7b94>] (dump_stack) from [<c0124ff8>] (__warn+0xe4/0x110) [<c0124ff8>] (__warn) from [<c0125064>] (warn_slowpath_null+0x40/0x48) [<c0125064>] (warn_slowpath_null) from [<c048e650>] (clk_core_disable_lock+0x18/0x24) [<c048e650>] (clk_core_disable_lock) from [<c048f51c>] (clk_core_disable_unprepare+0xc/0x20) [<c048f51c>] (clk_core_disable_unprepare) from [<c048faec>] (__clk_set_parent_after+0x48/0x4c) [<c048faec>] (__clk_set_parent_after) from [<c048fd28>] (clk_core_set_parent_nolock+0x238/0x5d0) [<c048fd28>] (clk_core_set_parent_nolock) from [<c04902e4>] (clk_set_parent+0x38/0x6c) [<c04902e4>] (clk_set_parent) from [<c049d760>] (exynos_pd_power+0x74/0x1cc) [<c049d760>] (exynos_pd_power) from [<c0552064>] (genpd_power_off+0x164/0x264) [<c0552064>] (genpd_power_off) from [<c0552430>] (genpd_power_off_work_fn+0x2c/0x40) [<c0552430>] (genpd_power_off_work_fn) from [<c014355c>] (process_one_work+0x1d0/0x7bc) [<c014355c>] (process_one_work) from [<c0143bb4>] (worker_thread+0x34/0x4dc) [<c0143bb4>] (worker_thread) from [<c014a348>] (kthread+0x128/0x164) [<c014a348>] (kthread) from [<c01010b4>] (ret_from_fork+0x14/0x20) Exception stack(0xeeb01fb0 to 0xeeb01ff8) 1fa0: 00000000 00000000 00000000 00000000 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000 ---[ end trace 48eea511a44c78ef ]--- ------------[ cut here ]------------ WARNING: CPU: 0 PID: 59 at drivers/clk/clk.c:684 clk_core_disable_unprepare+0x18/0x20 Modules linked in: CPU: 0 PID: 59 Comm: kworker/0:1 Tainted: G W 4.15.0-next-20180130-dirty #131 Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) Workqueue: pm genpd_power_off_work_fn [<c01114e4>] (unwind_backtrace) from [<c010dbc0>] (show_stack+0x10/0x14) [<c010dbc0>] (show_stack) from [<c09b7b94>] (dump_stack+0x90/0xc8) [<c09b7b94>] (dump_stack) from [<c0124ff8>] (__warn+0xe4/0x110) [<c0124ff8>] (__warn) from [<c0125064>] (warn_slowpath_null+0x40/0x48) [<c0125064>] (warn_slowpath_null) from [<c048f528>] (clk_core_disable_unprepare+0x18/0x20) [<c048f528>] (clk_core_disable_unprepare) from [<c048faec>] (__clk_set_parent_after+0x48/0x4c) [<c048faec>] (__clk_set_parent_after) from [<c048fd28>] (clk_core_set_parent_nolock+0x238/0x5d0) [<c048fd28>] (clk_core_set_parent_nolock) from [<c04902e4>] (clk_set_parent+0x38/0x6c) [<c04902e4>] (clk_set_parent) from [<c049d760>] (exynos_pd_power+0x74/0x1cc) [<c049d760>] (exynos_pd_power) from [<c0552064>] (genpd_power_off+0x164/0x264) [<c0552064>] (genpd_power_off) from [<c0552430>] (genpd_power_off_work_fn+0x2c/0x40) [<c0552430>] (genpd_power_off_work_fn) from [<c014355c>] (process_one_work+0x1d0/0x7bc) [<c014355c>] (process_one_work) from [<c0143bb4>] (worker_thread+0x34/0x4dc) [<c0143bb4>] (worker_thread) from [<c014a348>] (kthread+0x128/0x164) [<c014a348>] (kthread) from [<c01010b4>] (ret_from_fork+0x14/0x20) Exception stack(0xeeb01fb0 to 0xeeb01ff8) 1fa0: 00000000 00000000 00000000 00000000 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000 ---[ end trace 48eea511a44c78f0 ]--- Please note that this pattern happens for other critical clocks too. > Hopefully it isn't the worser case, where the clk is handed out > to some consumer but it's still orphaned at that point, and then > we have little control over the migration of state to the parent. Well, in my opinion my patch is simplest fix for the regression introduced by commit f8f8f1d04494 ("clk: Don't touch hardware when reparenting during registration"). It also doesn't have any side-effects as it affects only the situation when orphaned clock has been already prepared/enabled, what in practice means it or one of its children has critical flag. Reworking critical clock handling will be much more complicated. Best regards
Hi Stephen, On 2018-01-30 10:01, Marek Szyprowski wrote: > Hi Stephen > > On 2018-01-27 01:49, Stephen Boyd wrote: >> On 01/26, Marek Szyprowski wrote: >>> If orphaned clock has been already prepared/enabled (for example if >>> it or >>> one of its children has CLK_IS_CRITICAL flag), then the prepare/enable >>> counters of the newly assigned parent are not updated correctly. This >>> might later cause warnings during changing clock parents. >> This doesn't feel right. Perhaps we should delay enabling a clk >> if it's CRITICAL until we adopt an orphaned clk. Good news is we >> have orphan status tracking now so this should be pretty simple. > > Not really. It won't be so simple, because we would need to track the > status of the whole clock sub-tree for enabling critical clocks. Even > then we would need to delay enabling them until the real top clock is > registered. > >> Otherwise migrating the count up is complicated and requires us >> to call the prepare/enable ops on a critical clk and then keep >> doing that each time it gets re-parented. Do you have this case, >> where some clk is marked as CRITICAL, and then we need to migrate >> that enable/prepare count to the parent? > > Yes, this is the case of Exynos5422-based boards, where we have a > bunch of critical clocks, which are reparented from internal PLL to > external oscillator when respective power domain is suspended. > > It took me a while to gather all the needed information from the > debug logs. Here is one example of such clock tree: > > fin_pll (clk1, top clock, external pll) > fout_dpll (clk2) > mout_sclk_dpll (clk3) > mout_aclk300_disp1 (clk4) > dout_aclk300_disp1 (clk5) > mout_sw_aclk300_disp1 (clk6) > mout_user_aclk300_disp1 (clk7) > aclk300_disp1 (clk8, critical) > > Before turning off power domains, aclk300_disp1 is reparented > directly to fin_pll (this is a hardware requirement). When power > domain is turned on again, aclk300_disp1 is reparented back to > mout_user_aclk300_disp1. > > Clocks are registered in the following order: > 1, 2, 7, 3, 6, 5, 8, 4. > > In this case when critical clock is registered, clk4 is not yet > available, so clocks 1-3 won't get proper prepare/enable count. > Then this causes a warning during reparenting clk8 to clk1: > > ------------[ cut here ]------------ > WARNING: CPU: 0 PID: 59 at drivers/clk/clk.c:811 > clk_core_disable_lock+0x18/0x24 > Modules linked in: > CPU: 0 PID: 59 Comm: kworker/0:1 Not tainted > 4.15.0-next-20180130-dirty #131 > Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) > Workqueue: pm genpd_power_off_work_fn > [<c01114e4>] (unwind_backtrace) from [<c010dbc0>] (show_stack+0x10/0x14) > [<c010dbc0>] (show_stack) from [<c09b7b94>] (dump_stack+0x90/0xc8) > [<c09b7b94>] (dump_stack) from [<c0124ff8>] (__warn+0xe4/0x110) > [<c0124ff8>] (__warn) from [<c0125064>] (warn_slowpath_null+0x40/0x48) > [<c0125064>] (warn_slowpath_null) from [<c048e650>] > (clk_core_disable_lock+0x18/0x24) > [<c048e650>] (clk_core_disable_lock) from [<c048f51c>] > (clk_core_disable_unprepare+0xc/0x20) > [<c048f51c>] (clk_core_disable_unprepare) from [<c048faec>] > (__clk_set_parent_after+0x48/0x4c) > [<c048faec>] (__clk_set_parent_after) from [<c048fd28>] > (clk_core_set_parent_nolock+0x238/0x5d0) > [<c048fd28>] (clk_core_set_parent_nolock) from [<c04902e4>] > (clk_set_parent+0x38/0x6c) > [<c04902e4>] (clk_set_parent) from [<c049d760>] > (exynos_pd_power+0x74/0x1cc) > [<c049d760>] (exynos_pd_power) from [<c0552064>] > (genpd_power_off+0x164/0x264) > [<c0552064>] (genpd_power_off) from [<c0552430>] > (genpd_power_off_work_fn+0x2c/0x40) > [<c0552430>] (genpd_power_off_work_fn) from [<c014355c>] > (process_one_work+0x1d0/0x7bc) > [<c014355c>] (process_one_work) from [<c0143bb4>] > (worker_thread+0x34/0x4dc) > [<c0143bb4>] (worker_thread) from [<c014a348>] (kthread+0x128/0x164) > [<c014a348>] (kthread) from [<c01010b4>] (ret_from_fork+0x14/0x20) > Exception stack(0xeeb01fb0 to 0xeeb01ff8) > 1fa0: 00000000 00000000 00000000 > 00000000 > 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 > 00000000 > 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000 > ---[ end trace 48eea511a44c78ef ]--- > ------------[ cut here ]------------ > WARNING: CPU: 0 PID: 59 at drivers/clk/clk.c:684 > clk_core_disable_unprepare+0x18/0x20 > Modules linked in: > CPU: 0 PID: 59 Comm: kworker/0:1 Tainted: G W > 4.15.0-next-20180130-dirty #131 > Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) > Workqueue: pm genpd_power_off_work_fn > [<c01114e4>] (unwind_backtrace) from [<c010dbc0>] (show_stack+0x10/0x14) > [<c010dbc0>] (show_stack) from [<c09b7b94>] (dump_stack+0x90/0xc8) > [<c09b7b94>] (dump_stack) from [<c0124ff8>] (__warn+0xe4/0x110) > [<c0124ff8>] (__warn) from [<c0125064>] (warn_slowpath_null+0x40/0x48) > [<c0125064>] (warn_slowpath_null) from [<c048f528>] > (clk_core_disable_unprepare+0x18/0x20) > [<c048f528>] (clk_core_disable_unprepare) from [<c048faec>] > (__clk_set_parent_after+0x48/0x4c) > [<c048faec>] (__clk_set_parent_after) from [<c048fd28>] > (clk_core_set_parent_nolock+0x238/0x5d0) > [<c048fd28>] (clk_core_set_parent_nolock) from [<c04902e4>] > (clk_set_parent+0x38/0x6c) > [<c04902e4>] (clk_set_parent) from [<c049d760>] > (exynos_pd_power+0x74/0x1cc) > [<c049d760>] (exynos_pd_power) from [<c0552064>] > (genpd_power_off+0x164/0x264) > [<c0552064>] (genpd_power_off) from [<c0552430>] > (genpd_power_off_work_fn+0x2c/0x40) > [<c0552430>] (genpd_power_off_work_fn) from [<c014355c>] > (process_one_work+0x1d0/0x7bc) > [<c014355c>] (process_one_work) from [<c0143bb4>] > (worker_thread+0x34/0x4dc) > [<c0143bb4>] (worker_thread) from [<c014a348>] (kthread+0x128/0x164) > [<c014a348>] (kthread) from [<c01010b4>] (ret_from_fork+0x14/0x20) > Exception stack(0xeeb01fb0 to 0xeeb01ff8) > 1fa0: 00000000 00000000 00000000 > 00000000 > 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 > 00000000 > 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000 > ---[ end trace 48eea511a44c78f0 ]--- > > Please note that this pattern happens for other critical clocks too. > >> Hopefully it isn't the worser case, where the clk is handed out >> to some consumer but it's still orphaned at that point, and then >> we have little control over the migration of state to the parent. > > Well, in my opinion my patch is simplest fix for the regression > introduced by commit f8f8f1d04494 ("clk: Don't touch hardware when > reparenting during registration"). It also doesn't have any side-effects > as it affects only the situation when orphaned clock has been already > prepared/enabled, what in practice means it or one of its children has > critical flag. Reworking critical clock handling will be much more > complicated. Stephen, do you have any comments? The issue is there for almost 2 months... v4.16-rc1 is out and its a good time to get a fix in (this or any other if you propose such). Best regards
On Wed, 2018-02-14 at 14:37 +0100, Marek Szyprowski wrote: > Hi Stephen, > > On 2018-01-30 10:01, Marek Szyprowski wrote: > > Hi Stephen > > > > On 2018-01-27 01:49, Stephen Boyd wrote: > > > On 01/26, Marek Szyprowski wrote: > > > > If orphaned clock has been already prepared/enabled (for example if > > > > it or > > > > one of its children has CLK_IS_CRITICAL flag), then the prepare/enable > > > > counters of the newly assigned parent are not updated correctly. This > > > > might later cause warnings during changing clock parents. > > > > > > This doesn't feel right. Perhaps we should delay enabling a clk > > > if it's CRITICAL until we adopt an orphaned clk. Good news is we > > > have orphan status tracking now so this should be pretty simple. > > > > Not really. It won't be so simple, because we would need to track the > > status of the whole clock sub-tree for enabling critical clocks. Even > > then we would need to delay enabling them until the real top clock is > > registered. > > > > > Otherwise migrating the count up is complicated and requires us > > > to call the prepare/enable ops on a critical clk and then keep > > > doing that each time it gets re-parented. Do you have this case, > > > where some clk is marked as CRITICAL, and then we need to migrate > > > that enable/prepare count to the parent? > > > > Yes, this is the case of Exynos5422-based boards, where we have a > > bunch of critical clocks, which are reparented from internal PLL to > > external oscillator when respective power domain is suspended. > > > > It took me a while to gather all the needed information from the > > debug logs. Here is one example of such clock tree: > > > > fin_pll (clk1, top clock, external pll) > > fout_dpll (clk2) > > mout_sclk_dpll (clk3) > > mout_aclk300_disp1 (clk4) > > dout_aclk300_disp1 (clk5) > > mout_sw_aclk300_disp1 (clk6) > > mout_user_aclk300_disp1 (clk7) > > aclk300_disp1 (clk8, critical) > > > > Before turning off power domains, aclk300_disp1 is reparented > > directly to fin_pll (this is a hardware requirement). When power > > domain is turned on again, aclk300_disp1 is reparented back to > > mout_user_aclk300_disp1. > > > > Clocks are registered in the following order: > > 1, 2, 7, 3, 6, 5, 8, 4. > > > > In this case when critical clock is registered, clk4 is not yet > > available, so clocks 1-3 won't get proper prepare/enable count. > > Then this causes a warning during reparenting clk8 to clk1: > > > > ------------[ cut here ]------------ > > WARNING: CPU: 0 PID: 59 at drivers/clk/clk.c:811 > > clk_core_disable_lock+0x18/0x24 > > Modules linked in: > > CPU: 0 PID: 59 Comm: kworker/0:1 Not tainted > > 4.15.0-next-20180130-dirty #131 > > Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) > > Workqueue: pm genpd_power_off_work_fn > > [<c01114e4>] (unwind_backtrace) from [<c010dbc0>] (show_stack+0x10/0x14) > > [<c010dbc0>] (show_stack) from [<c09b7b94>] (dump_stack+0x90/0xc8) > > [<c09b7b94>] (dump_stack) from [<c0124ff8>] (__warn+0xe4/0x110) > > [<c0124ff8>] (__warn) from [<c0125064>] (warn_slowpath_null+0x40/0x48) > > [<c0125064>] (warn_slowpath_null) from [<c048e650>] > > (clk_core_disable_lock+0x18/0x24) > > [<c048e650>] (clk_core_disable_lock) from [<c048f51c>] > > (clk_core_disable_unprepare+0xc/0x20) > > [<c048f51c>] (clk_core_disable_unprepare) from [<c048faec>] > > (__clk_set_parent_after+0x48/0x4c) > > [<c048faec>] (__clk_set_parent_after) from [<c048fd28>] > > (clk_core_set_parent_nolock+0x238/0x5d0) > > [<c048fd28>] (clk_core_set_parent_nolock) from [<c04902e4>] > > (clk_set_parent+0x38/0x6c) > > [<c04902e4>] (clk_set_parent) from [<c049d760>] > > (exynos_pd_power+0x74/0x1cc) > > [<c049d760>] (exynos_pd_power) from [<c0552064>] > > (genpd_power_off+0x164/0x264) > > [<c0552064>] (genpd_power_off) from [<c0552430>] > > (genpd_power_off_work_fn+0x2c/0x40) > > [<c0552430>] (genpd_power_off_work_fn) from [<c014355c>] > > (process_one_work+0x1d0/0x7bc) > > [<c014355c>] (process_one_work) from [<c0143bb4>] > > (worker_thread+0x34/0x4dc) > > [<c0143bb4>] (worker_thread) from [<c014a348>] (kthread+0x128/0x164) > > [<c014a348>] (kthread) from [<c01010b4>] (ret_from_fork+0x14/0x20) > > Exception stack(0xeeb01fb0 to 0xeeb01ff8) > > 1fa0: 00000000 00000000 00000000 > > 00000000 > > 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 > > 00000000 > > 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000 > > ---[ end trace 48eea511a44c78ef ]--- > > ------------[ cut here ]------------ > > WARNING: CPU: 0 PID: 59 at drivers/clk/clk.c:684 > > clk_core_disable_unprepare+0x18/0x20 > > Modules linked in: > > CPU: 0 PID: 59 Comm: kworker/0:1 Tainted: G W > > 4.15.0-next-20180130-dirty #131 > > Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) > > Workqueue: pm genpd_power_off_work_fn > > [<c01114e4>] (unwind_backtrace) from [<c010dbc0>] (show_stack+0x10/0x14) > > [<c010dbc0>] (show_stack) from [<c09b7b94>] (dump_stack+0x90/0xc8) > > [<c09b7b94>] (dump_stack) from [<c0124ff8>] (__warn+0xe4/0x110) > > [<c0124ff8>] (__warn) from [<c0125064>] (warn_slowpath_null+0x40/0x48) > > [<c0125064>] (warn_slowpath_null) from [<c048f528>] > > (clk_core_disable_unprepare+0x18/0x20) > > [<c048f528>] (clk_core_disable_unprepare) from [<c048faec>] > > (__clk_set_parent_after+0x48/0x4c) > > [<c048faec>] (__clk_set_parent_after) from [<c048fd28>] > > (clk_core_set_parent_nolock+0x238/0x5d0) > > [<c048fd28>] (clk_core_set_parent_nolock) from [<c04902e4>] > > (clk_set_parent+0x38/0x6c) > > [<c04902e4>] (clk_set_parent) from [<c049d760>] > > (exynos_pd_power+0x74/0x1cc) > > [<c049d760>] (exynos_pd_power) from [<c0552064>] > > (genpd_power_off+0x164/0x264) > > [<c0552064>] (genpd_power_off) from [<c0552430>] > > (genpd_power_off_work_fn+0x2c/0x40) > > [<c0552430>] (genpd_power_off_work_fn) from [<c014355c>] > > (process_one_work+0x1d0/0x7bc) > > [<c014355c>] (process_one_work) from [<c0143bb4>] > > (worker_thread+0x34/0x4dc) > > [<c0143bb4>] (worker_thread) from [<c014a348>] (kthread+0x128/0x164) > > [<c014a348>] (kthread) from [<c01010b4>] (ret_from_fork+0x14/0x20) > > Exception stack(0xeeb01fb0 to 0xeeb01ff8) > > 1fa0: 00000000 00000000 00000000 > > 00000000 > > 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 > > 00000000 > > 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000 > > ---[ end trace 48eea511a44c78f0 ]--- > > > > Please note that this pattern happens for other critical clocks too. > > > > > Hopefully it isn't the worser case, where the clk is handed out > > > to some consumer but it's still orphaned at that point, and then > > > we have little control over the migration of state to the parent. > > > > Well, in my opinion my patch is simplest fix for the regression > > introduced by commit f8f8f1d04494 ("clk: Don't touch hardware when > > reparenting during registration"). It also doesn't have any side-effects > > as it affects only the situation when orphaned clock has been already > > prepared/enabled, what in practice means it or one of its children has > > critical flag. Reworking critical clock handling will be much more > > complicated. > > Stephen, do you have any comments? > > The issue is there for almost 2 months... v4.16-rc1 is out and its > a good time to get a fix in (this or any other if you propose such). > > Best regards Hi Marek, I've sent another patch addressing the same issue (sorry, I did not see your thread earlier) While I agree parent adopting orphan need to take care of any existing count, your solution could be missing a few things, like dealing with CLK_OPS_PARENT_ENABLE clock types Fortunately, __clk_set_parent_before() and __clk_set_parent_after() already deal with those annoying stuff and does count migration well. Would you mind trying this change [0] and let me know if works for you as well ? [0] : https://lkml.kernel.org/r/20180214134340.17242-5-jbrunet@baylibre.com Cheers Jerome
On Fri, 2018-01-26 at 16:49 -0800, Stephen Boyd wrote: > On 01/26, Marek Szyprowski wrote: > > If orphaned clock has been already prepared/enabled (for example if it or > > one of its children has CLK_IS_CRITICAL flag), then the prepare/enable > > counters of the newly assigned parent are not updated correctly. This > > might later cause warnings during changing clock parents. > > This doesn't feel right. Perhaps we should delay enabling a clk > if it's CRITICAL until we adopt an orphaned clk. It does not sounds right. A critical clock should be enabled as soon as possible, regardless of current knowledge of the clock tree. How would you decide when is right time anyway ? The orphaned critical clock could be reparented to clock that is orphan itself, you would get into the same count issue again. > Good news is we > have orphan status tracking now so this should be pretty simple. > Otherwise migrating the count up is complicated and requires us > to call the prepare/enable ops on a critical clk and then keep > doing that each time it gets re-parented. What is the problem with this ? We are going to have call those enable ops anyway. Whether it is done one at a time or all at once, does it really matter ? > Do you have this case, > where some clk is marked as CRITICAL, and then we need to migrate > that enable/prepare count to the parent? yes. > > Hopefully it isn't the worser case, where the clk is handed out > to some consumer but it's still orphaned at that point, and then > we have little control over the migration of state to the parent. > This should not be a problem if the clock reparenting and count migration is done properly. To do this, the __clk_set_parent_before() and __clk_set_parent_after() function look good to me, have I missed anything ?
Hi Jerome, On 2018-02-15 12:44, Jerome Brunet wrote: > On Wed, 2018-02-14 at 14:37 +0100, Marek Szyprowski wrote: >> On 2018-01-30 10:01, Marek Szyprowski wrote: >>> On 2018-01-27 01:49, Stephen Boyd wrote: >>>> On 01/26, Marek Szyprowski wrote: >>>>> If orphaned clock has been already prepared/enabled (for example if >>>>> it or >>>>> one of its children has CLK_IS_CRITICAL flag), then the prepare/enable >>>>> counters of the newly assigned parent are not updated correctly. This >>>>> might later cause warnings during changing clock parents. >>>> This doesn't feel right. Perhaps we should delay enabling a clk >>>> if it's CRITICAL until we adopt an orphaned clk. Good news is we >>>> have orphan status tracking now so this should be pretty simple. >>> Not really. It won't be so simple, because we would need to track the >>> status of the whole clock sub-tree for enabling critical clocks. Even >>> then we would need to delay enabling them until the real top clock is >>> registered. >>> >>>> Otherwise migrating the count up is complicated and requires us >>>> to call the prepare/enable ops on a critical clk and then keep >>>> doing that each time it gets re-parented. Do you have this case, >>>> where some clk is marked as CRITICAL, and then we need to migrate >>>> that enable/prepare count to the parent? >>> Yes, this is the case of Exynos5422-based boards, where we have a >>> bunch of critical clocks, which are reparented from internal PLL to >>> external oscillator when respective power domain is suspended. >>> >>> It took me a while to gather all the needed information from the >>> debug logs. Here is one example of such clock tree: >>> >>> fin_pll (clk1, top clock, external pll) >>> fout_dpll (clk2) >>> mout_sclk_dpll (clk3) >>> mout_aclk300_disp1 (clk4) >>> dout_aclk300_disp1 (clk5) >>> mout_sw_aclk300_disp1 (clk6) >>> mout_user_aclk300_disp1 (clk7) >>> aclk300_disp1 (clk8, critical) >>> >>> Before turning off power domains, aclk300_disp1 is reparented >>> directly to fin_pll (this is a hardware requirement). When power >>> domain is turned on again, aclk300_disp1 is reparented back to >>> mout_user_aclk300_disp1. >>> >>> Clocks are registered in the following order: >>> 1, 2, 7, 3, 6, 5, 8, 4. >>> >>> In this case when critical clock is registered, clk4 is not yet >>> available, so clocks 1-3 won't get proper prepare/enable count. >>> Then this causes a warning during reparenting clk8 to clk1: >>> >>> ------------[ cut here ]------------ >>> WARNING: CPU: 0 PID: 59 at drivers/clk/clk.c:811 >>> clk_core_disable_lock+0x18/0x24 >>> Modules linked in: >>> CPU: 0 PID: 59 Comm: kworker/0:1 Not tainted >>> 4.15.0-next-20180130-dirty #131 >>> Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) >>> Workqueue: pm genpd_power_off_work_fn >>> [<c01114e4>] (unwind_backtrace) from [<c010dbc0>] (show_stack+0x10/0x14) >>> [<c010dbc0>] (show_stack) from [<c09b7b94>] (dump_stack+0x90/0xc8) >>> [<c09b7b94>] (dump_stack) from [<c0124ff8>] (__warn+0xe4/0x110) >>> [<c0124ff8>] (__warn) from [<c0125064>] (warn_slowpath_null+0x40/0x48) >>> [<c0125064>] (warn_slowpath_null) from [<c048e650>] >>> (clk_core_disable_lock+0x18/0x24) >>> [<c048e650>] (clk_core_disable_lock) from [<c048f51c>] >>> (clk_core_disable_unprepare+0xc/0x20) >>> [<c048f51c>] (clk_core_disable_unprepare) from [<c048faec>] >>> (__clk_set_parent_after+0x48/0x4c) >>> [<c048faec>] (__clk_set_parent_after) from [<c048fd28>] >>> (clk_core_set_parent_nolock+0x238/0x5d0) >>> [<c048fd28>] (clk_core_set_parent_nolock) from [<c04902e4>] >>> (clk_set_parent+0x38/0x6c) >>> [<c04902e4>] (clk_set_parent) from [<c049d760>] >>> (exynos_pd_power+0x74/0x1cc) >>> [<c049d760>] (exynos_pd_power) from [<c0552064>] >>> (genpd_power_off+0x164/0x264) >>> [<c0552064>] (genpd_power_off) from [<c0552430>] >>> (genpd_power_off_work_fn+0x2c/0x40) >>> [<c0552430>] (genpd_power_off_work_fn) from [<c014355c>] >>> (process_one_work+0x1d0/0x7bc) >>> [<c014355c>] (process_one_work) from [<c0143bb4>] >>> (worker_thread+0x34/0x4dc) >>> [<c0143bb4>] (worker_thread) from [<c014a348>] (kthread+0x128/0x164) >>> [<c014a348>] (kthread) from [<c01010b4>] (ret_from_fork+0x14/0x20) >>> Exception stack(0xeeb01fb0 to 0xeeb01ff8) >>> 1fa0: 00000000 00000000 00000000 >>> 00000000 >>> 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 >>> 00000000 >>> 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000 >>> ---[ end trace 48eea511a44c78ef ]--- >>> ------------[ cut here ]------------ >>> WARNING: CPU: 0 PID: 59 at drivers/clk/clk.c:684 >>> clk_core_disable_unprepare+0x18/0x20 >>> Modules linked in: >>> CPU: 0 PID: 59 Comm: kworker/0:1 Tainted: G W >>> 4.15.0-next-20180130-dirty #131 >>> Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) >>> Workqueue: pm genpd_power_off_work_fn >>> [<c01114e4>] (unwind_backtrace) from [<c010dbc0>] (show_stack+0x10/0x14) >>> [<c010dbc0>] (show_stack) from [<c09b7b94>] (dump_stack+0x90/0xc8) >>> [<c09b7b94>] (dump_stack) from [<c0124ff8>] (__warn+0xe4/0x110) >>> [<c0124ff8>] (__warn) from [<c0125064>] (warn_slowpath_null+0x40/0x48) >>> [<c0125064>] (warn_slowpath_null) from [<c048f528>] >>> (clk_core_disable_unprepare+0x18/0x20) >>> [<c048f528>] (clk_core_disable_unprepare) from [<c048faec>] >>> (__clk_set_parent_after+0x48/0x4c) >>> [<c048faec>] (__clk_set_parent_after) from [<c048fd28>] >>> (clk_core_set_parent_nolock+0x238/0x5d0) >>> [<c048fd28>] (clk_core_set_parent_nolock) from [<c04902e4>] >>> (clk_set_parent+0x38/0x6c) >>> [<c04902e4>] (clk_set_parent) from [<c049d760>] >>> (exynos_pd_power+0x74/0x1cc) >>> [<c049d760>] (exynos_pd_power) from [<c0552064>] >>> (genpd_power_off+0x164/0x264) >>> [<c0552064>] (genpd_power_off) from [<c0552430>] >>> (genpd_power_off_work_fn+0x2c/0x40) >>> [<c0552430>] (genpd_power_off_work_fn) from [<c014355c>] >>> (process_one_work+0x1d0/0x7bc) >>> [<c014355c>] (process_one_work) from [<c0143bb4>] >>> (worker_thread+0x34/0x4dc) >>> [<c0143bb4>] (worker_thread) from [<c014a348>] (kthread+0x128/0x164) >>> [<c014a348>] (kthread) from [<c01010b4>] (ret_from_fork+0x14/0x20) >>> Exception stack(0xeeb01fb0 to 0xeeb01ff8) >>> 1fa0: 00000000 00000000 00000000 >>> 00000000 >>> 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 >>> 00000000 >>> 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000 >>> ---[ end trace 48eea511a44c78f0 ]--- >>> >>> Please note that this pattern happens for other critical clocks too. >>> >>>> Hopefully it isn't the worser case, where the clk is handed out >>>> to some consumer but it's still orphaned at that point, and then >>>> we have little control over the migration of state to the parent. >>> Well, in my opinion my patch is simplest fix for the regression >>> introduced by commit f8f8f1d04494 ("clk: Don't touch hardware when >>> reparenting during registration"). It also doesn't have any side-effects >>> as it affects only the situation when orphaned clock has been already >>> prepared/enabled, what in practice means it or one of its children has >>> critical flag. Reworking critical clock handling will be much more >>> complicated. >> Stephen, do you have any comments? >> >> The issue is there for almost 2 months... v4.16-rc1 is out and its >> a good time to get a fix in (this or any other if you propose such). >> >> Best regards > Hi Marek, > > I've sent another patch addressing the same issue (sorry, I did not see your > thread earlier) > > While I agree parent adopting orphan need to take care of any existing count, > your solution could be missing a few things, like dealing with > CLK_OPS_PARENT_ENABLE clock types > > Fortunately, __clk_set_parent_before() and __clk_set_parent_after() already deal > with those annoying stuff and does count migration well. > > Would you mind trying this change [0] and let me know if works for you as well ? > > [0] : https://lkml.kernel.org/r/20180214134340.17242-5-jbrunet@baylibre.com I'm fine with your approach. It fixes the issue observed on Exynos5422-based boards. Best regards
diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c index 0f686a9dac3e..a14d64fa6a28 100644 --- a/drivers/clk/clk.c +++ b/drivers/clk/clk.c @@ -2981,7 +2981,11 @@ static int __clk_core_init(struct clk_core *core) */ if (parent) { /* update the clk tree topology */ + if (orphan->prepare_count) + clk_core_prepare(parent); flags = clk_enable_lock(); + if (orphan->enable_count) + clk_core_enable(parent); clk_reparent(orphan, parent); clk_enable_unlock(flags); __clk_recalc_accuracies(orphan);
If orphaned clock has been already prepared/enabled (for example if it or one of its children has CLK_IS_CRITICAL flag), then the prepare/enable counters of the newly assigned parent are not updated correctly. This might later cause warnings during changing clock parents. This patch fixes following warning on Exynos5422-based boards: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 59 at drivers/clk/clk.c:811 clk_core_disable_lock+0x18/0x24 Modules linked in: CPU: 0 PID: 59 Comm: kworker/0:1 Not tainted 4.15.0-rc7-next-20180115 #106 Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) Workqueue: pm genpd_power_off_work_fn [<c0111504>] (unwind_backtrace) from [<c010dbec>] (show_stack+0x10/0x14) [<c010dbec>] (show_stack) from [<c09b3f34>] (dump_stack+0x90/0xc8) [<c09b3f34>] (dump_stack) from [<c0125060>] (__warn+0xe4/0x110) [<c0125060>] (__warn) from [<c01250cc>] (warn_slowpath_null+0x40/0x48) [<c01250cc>] (warn_slowpath_null) from [<c048d584>] (clk_core_disable_lock+0x18/0x24) [<c048d584>] (clk_core_disable_lock) from [<c048e490>] (clk_core_disable_unprepare+0xc/0x20) [<c048e490>] (clk_core_disable_unprepare) from [<c048ea60>] (__clk_set_parent_after+0x48/0x4c) [<c048ea60>] (__clk_set_parent_after) from [<c048ec88>] (clk_core_set_parent_nolock+0x224/0x5b4) [<c048ec88>] (clk_core_set_parent_nolock) from [<c048f23c>] (clk_set_parent+0x38/0x6c) [<c048f23c>] (clk_set_parent) from [<c049c41c>] (exynos_pd_power+0x74/0x1cc) [<c049c41c>] (exynos_pd_power) from [<c0550768>] (genpd_power_off+0x164/0x264) [<c0550768>] (genpd_power_off) from [<c0550b34>] (genpd_power_off_work_fn+0x2c/0x40) [<c0550b34>] (genpd_power_off_work_fn) from [<c01432e4>] (process_one_work+0x1d0/0x7bc) [<c01432e4>] (process_one_work) from [<c014393c>] (worker_thread+0x34/0x4dc) [<c014393c>] (worker_thread) from [<c014a0d0>] (kthread+0x128/0x164) [<c014a0d0>] (kthread) from [<c01010b4>] (ret_from_fork+0x14/0x20) Exception stack(0xeeb01fb0 to 0xeeb01ff8) 1fa0: 00000000 00000000 00000000 00000000 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000 ---[ end trace 503c239fb760f17a ]--- ------------[ cut here ]------------ WARNING: CPU: 0 PID: 59 at drivers/clk/clk.c:684 clk_core_disable_unprepare+0x18/0x20 Modules linked in: CPU: 0 PID: 59 Comm: kworker/0:1 Tainted: G W 4.15.0-rc7-next-20180115 #106 Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) Workqueue: pm genpd_power_off_work_fn [<c0111504>] (unwind_backtrace) from [<c010dbec>] (show_stack+0x10/0x14) [<c010dbec>] (show_stack) from [<c09b3f34>] (dump_stack+0x90/0xc8) [<c09b3f34>] (dump_stack) from [<c0125060>] (__warn+0xe4/0x110) [<c0125060>] (__warn) from [<c01250cc>] (warn_slowpath_null+0x40/0x48) [<c01250cc>] (warn_slowpath_null) from [<c048e49c>] (clk_core_disable_unprepare+0x18/0x20) [<c048e49c>] (clk_core_disable_unprepare) from [<c048ea60>] (__clk_set_parent_after+0x48/0x4c) [<c048ea60>] (__clk_set_parent_after) from [<c048ec88>] (clk_core_set_parent_nolock+0x224/0x5b4) [<c048ec88>] (clk_core_set_parent_nolock) from [<c048f23c>] (clk_set_parent+0x38/0x6c) [<c048f23c>] (clk_set_parent) from [<c049c41c>] (exynos_pd_power+0x74/0x1cc) [<c049c41c>] (exynos_pd_power) from [<c0550768>] (genpd_power_off+0x164/0x264) [<c0550768>] (genpd_power_off) from [<c0550b34>] (genpd_power_off_work_fn+0x2c/0x40) [<c0550b34>] (genpd_power_off_work_fn) from [<c01432e4>] (process_one_work+0x1d0/0x7bc) [<c01432e4>] (process_one_work) from [<c014393c>] (worker_thread+0x34/0x4dc) [<c014393c>] (worker_thread) from [<c014a0d0>] (kthread+0x128/0x164) [<c014a0d0>] (kthread) from [<c01010b4>] (ret_from_fork+0x14/0x20) Exception stack(0xeeb01fb0 to 0xeeb01ff8) 1fa0: 00000000 00000000 00000000 00000000 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000 ---[ end trace 503c239fb760f17b ]--- ------------[ cut here ]------------ Fixes: f8f8f1d04494 ("clk: Don't touch hardware when reparenting during registration") Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> --- v2: - moved clk_core_prepare(parent) before clk_enable_lock() to avoid calling it under a spinlock --- drivers/clk/clk.c | 4 ++++ 1 file changed, 4 insertions(+)