Message ID | 20190207122227.19873-1-m.szyprowski@samsung.com (mailing list archive) |
---|---|
Headers | show |
Series | cpufreq/opp: rework regulator initialization | expand |
On 07-02-19, 13:22, Marek Szyprowski wrote: > Dear All, > > Recent commit 9ac6cb5fbb17 ("i2c: add suspended flag and accessors for > i2c adapters") added a visible warning for an attempt to do i2c transfer > over a suspended i2c bus. This revealed a long standing issue in the > cpufreq-dt driver, which gives a following warning during system > suspend/resume cycle: > > --->8--- > Enabling non-boot CPUs ... > CPU1 is up > CPU2 is up > CPU3 is up > ------------[ cut here ]------------ > WARNING: CPU: 4 PID: 29 at drivers/i2c/i2c-core-base.c:1869 __i2c_transfer+0x6f8/0xa50 > Modules linked in: > CPU: 4 PID: 29 Comm: cpuhp/4 Tainted: G W 5.0.0-rc4-next-20190131-00024-g54b06b29cc65 #5324 > Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) > [<c01110e8>] (unwind_backtrace) from [<c010d11c>] (show_stack+0x10/0x14) > [<c010d11c>] (show_stack) from [<c09a2584>] (dump_stack+0x90/0xc8) > [<c09a2584>] (dump_stack) from [<c0120bd0>] (__warn+0xf8/0x124) > [<c0120bd0>] (__warn) from [<c0120c3c>] (warn_slowpath_null+0x40/0x48) > [<c0120c3c>] (warn_slowpath_null) from [<c065cda0>] (__i2c_transfer+0x6f8/0xa50) > [<c065cda0>] (__i2c_transfer) from [<c065d168>] (i2c_transfer+0x70/0xe4) > [<c065d168>] (i2c_transfer) from [<c053ce4c>] (regmap_i2c_read+0x48/0x64) > [<c053ce4c>] (regmap_i2c_read) from [<c0536f1c>] (_regmap_raw_read+0xf8/0x450) > [<c0536f1c>] (_regmap_raw_read) from [<c053772c>] (_regmap_bus_read+0x38/0x68) > [<c053772c>] (_regmap_bus_read) from [<c05365a0>] (_regmap_read+0x60/0x250) > [<c05365a0>] (_regmap_read) from [<c05367cc>] (regmap_read+0x3c/0x5c) > [<c05367cc>] (regmap_read) from [<c047cfc0>] (regulator_is_enabled_regmap+0x20/0x90) > [<c047cfc0>] (regulator_is_enabled_regmap) from [<c0477660>] (_regulator_is_enabled+0x34/0x40) > [<c0477660>] (_regulator_is_enabled) from [<c0478674>] (create_regulator+0x1a4/0x25c) > [<c0478674>] (create_regulator) from [<c047c818>] (_regulator_get+0xe4/0x278) > [<c047c818>] (_regulator_get) from [<c068f1dc>] (dev_pm_opp_set_regulators+0xa0/0x1c0) > [<c068f1dc>] (dev_pm_opp_set_regulators) from [<c0698cc8>] (cpufreq_init+0x98/0x2d0) > [<c0698cc8>] (cpufreq_init) from [<c06959e4>] (cpufreq_online+0xc8/0x71c) > [<c06959e4>] (cpufreq_online) from [<c06960fc>] (cpuhp_cpufreq_online+0x8/0x10) > [<c06960fc>] (cpuhp_cpufreq_online) from [<c01213d4>] (cpuhp_invoke_callback+0xf4/0xebc) > [<c01213d4>] (cpuhp_invoke_callback) from [<c0122e4c>] (cpuhp_thread_fun+0x1d8/0x320) > [<c0122e4c>] (cpuhp_thread_fun) from [<c0149858>] (smpboot_thread_fn+0x194/0x340) > [<c0149858>] (smpboot_thread_fn) from [<c014573c>] (kthread+0x124/0x160) > [<c014573c>] (kthread) from [<c01010b4>] (ret_from_fork+0x14/0x20) > Exception stack(0xe897dfb0 to 0xe897dff8) > dfa0: 00000000 00000000 00000000 00000000 > dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 > dfe0: 00000000 00000000 00000000 00000000 00000013 00000000 > irq event stamp: 3865 > hardirqs last enabled at (3873): [<c0186dec>] vprintk_emit+0x228/0x2a4 > hardirqs last disabled at (3880): [<c0186cf0>] vprintk_emit+0x12c/0x2a4 > softirqs last enabled at (3052): [<c0102564>] __do_softirq+0x3a4/0x66c > softirqs last disabled at (3043): [<c0128464>] irq_exit+0x140/0x168 > ---[ end trace db48b455d924fec2 ]--- > CPU4 is up > CPU5 is up > CPU6 is up > CPU7 is up > --->8--- > > This is a scenario that triggers the above issue: > > 1. system disables non-boot cpu's at the end of system suspend procedure, > 2. this in turn deinitializes cpufreq drivers for the disabled cpus, > 3. early in the system resume procedure all cpus are got back to online > state, > 4. this in turn causes cpufreq to be initialized for the newly onlined > cpus, > 5. cpufreq-dt acquires all its resources (clocks, regulators) during > ->init() callback, > 6. getting regulator require to check its state, what in turn requires > i2c transfer, > 7. during system early resume stage this is not really possible. > > The issue is caused by cpufreq-dt driver not keeping its resources for > the whole driver lifetime and relying that they can be always acquired > at any system context. This problem has been observed on Samsung > Exynos based Odroid XU3/XU4 boards, but it happens on all boards, which > have separate regulators for different CPU clusters. Why don't you get similar problem during suspend? I think you can get it when the CPUs are offlined as I2C would have gone by then. The cpufreq or OPP core can try and run some regulator or genpd or clk calls to disable resources, etc. Even if doesn't happen, it certainly can. Also at resume the cpufreq core may try to change the frequency right from ->init() on certain cases, though not everytime and so the problem can come despite of this series. We guarantee that the resources are available during probe but not during resume, that's where the problem is. @Rafael any suggestions on how to fix this ?
Hi Viresh, On 2019-02-08 07:49, Viresh Kumar wrote: > On 07-02-19, 13:22, Marek Szyprowski wrote: >> Recent commit 9ac6cb5fbb17 ("i2c: add suspended flag and accessors for >> i2c adapters") added a visible warning for an attempt to do i2c transfer >> over a suspended i2c bus. This revealed a long standing issue in the >> cpufreq-dt driver, which gives a following warning during system >> suspend/resume cycle: >> >> --->8--- >> Enabling non-boot CPUs ... >> CPU1 is up >> CPU2 is up >> CPU3 is up >> ------------[ cut here ]------------ >> WARNING: CPU: 4 PID: 29 at drivers/i2c/i2c-core-base.c:1869 __i2c_transfer+0x6f8/0xa50 >> Modules linked in: >> CPU: 4 PID: 29 Comm: cpuhp/4 Tainted: G W 5.0.0-rc4-next-20190131-00024-g54b06b29cc65 #5324 >> Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) >> [<c01110e8>] (unwind_backtrace) from [<c010d11c>] (show_stack+0x10/0x14) >> [<c010d11c>] (show_stack) from [<c09a2584>] (dump_stack+0x90/0xc8) >> [<c09a2584>] (dump_stack) from [<c0120bd0>] (__warn+0xf8/0x124) >> [<c0120bd0>] (__warn) from [<c0120c3c>] (warn_slowpath_null+0x40/0x48) >> [<c0120c3c>] (warn_slowpath_null) from [<c065cda0>] (__i2c_transfer+0x6f8/0xa50) >> [<c065cda0>] (__i2c_transfer) from [<c065d168>] (i2c_transfer+0x70/0xe4) >> [<c065d168>] (i2c_transfer) from [<c053ce4c>] (regmap_i2c_read+0x48/0x64) >> [<c053ce4c>] (regmap_i2c_read) from [<c0536f1c>] (_regmap_raw_read+0xf8/0x450) >> [<c0536f1c>] (_regmap_raw_read) from [<c053772c>] (_regmap_bus_read+0x38/0x68) >> [<c053772c>] (_regmap_bus_read) from [<c05365a0>] (_regmap_read+0x60/0x250) >> [<c05365a0>] (_regmap_read) from [<c05367cc>] (regmap_read+0x3c/0x5c) >> [<c05367cc>] (regmap_read) from [<c047cfc0>] (regulator_is_enabled_regmap+0x20/0x90) >> [<c047cfc0>] (regulator_is_enabled_regmap) from [<c0477660>] (_regulator_is_enabled+0x34/0x40) >> [<c0477660>] (_regulator_is_enabled) from [<c0478674>] (create_regulator+0x1a4/0x25c) >> [<c0478674>] (create_regulator) from [<c047c818>] (_regulator_get+0xe4/0x278) >> [<c047c818>] (_regulator_get) from [<c068f1dc>] (dev_pm_opp_set_regulators+0xa0/0x1c0) >> [<c068f1dc>] (dev_pm_opp_set_regulators) from [<c0698cc8>] (cpufreq_init+0x98/0x2d0) >> [<c0698cc8>] (cpufreq_init) from [<c06959e4>] (cpufreq_online+0xc8/0x71c) >> [<c06959e4>] (cpufreq_online) from [<c06960fc>] (cpuhp_cpufreq_online+0x8/0x10) >> [<c06960fc>] (cpuhp_cpufreq_online) from [<c01213d4>] (cpuhp_invoke_callback+0xf4/0xebc) >> [<c01213d4>] (cpuhp_invoke_callback) from [<c0122e4c>] (cpuhp_thread_fun+0x1d8/0x320) >> [<c0122e4c>] (cpuhp_thread_fun) from [<c0149858>] (smpboot_thread_fn+0x194/0x340) >> [<c0149858>] (smpboot_thread_fn) from [<c014573c>] (kthread+0x124/0x160) >> [<c014573c>] (kthread) from [<c01010b4>] (ret_from_fork+0x14/0x20) >> Exception stack(0xe897dfb0 to 0xe897dff8) >> dfa0: 00000000 00000000 00000000 00000000 >> dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 >> dfe0: 00000000 00000000 00000000 00000000 00000013 00000000 >> irq event stamp: 3865 >> hardirqs last enabled at (3873): [<c0186dec>] vprintk_emit+0x228/0x2a4 >> hardirqs last disabled at (3880): [<c0186cf0>] vprintk_emit+0x12c/0x2a4 >> softirqs last enabled at (3052): [<c0102564>] __do_softirq+0x3a4/0x66c >> softirqs last disabled at (3043): [<c0128464>] irq_exit+0x140/0x168 >> ---[ end trace db48b455d924fec2 ]--- >> CPU4 is up >> CPU5 is up >> CPU6 is up >> CPU7 is up >> --->8--- >> >> This is a scenario that triggers the above issue: >> >> 1. system disables non-boot cpu's at the end of system suspend procedure, >> 2. this in turn deinitializes cpufreq drivers for the disabled cpus, >> 3. early in the system resume procedure all cpus are got back to online >> state, >> 4. this in turn causes cpufreq to be initialized for the newly onlined >> cpus, >> 5. cpufreq-dt acquires all its resources (clocks, regulators) during >> ->init() callback, >> 6. getting regulator require to check its state, what in turn requires >> i2c transfer, >> 7. during system early resume stage this is not really possible. >> >> The issue is caused by cpufreq-dt driver not keeping its resources for >> the whole driver lifetime and relying that they can be always acquired >> at any system context. This problem has been observed on Samsung >> Exynos based Odroid XU3/XU4 boards, but it happens on all boards, which >> have separate regulators for different CPU clusters. > Why don't you get similar problem during suspend? I think you can get > it when the CPUs are offlined as I2C would have gone by then. The > cpufreq or OPP core can try and run some regulator or genpd or clk > calls to disable resources, etc. Even if doesn't happen, it certainly > can. CPUfreq is suspended very early during system suspend and thus it does nothing when CPUs are being offlined. > Also at resume the cpufreq core may try to change the frequency right > from ->init() on certain cases, though not everytime and so the > problem can come despite of this series. cpufreq is still in suspended state (it is being 'resume' very late in the system resume procedure), so if driver doesn't explicitly change any opp in ->init(), then cpufreq core waits until everything is resumed. To sum up, this seems to be fine, beside the issue with regulator initialization I've addressed in this patchset. > We guarantee that the resources are available during probe but not > during resume, that's where the problem is. Yes, so I've changed cpufreq-dt to the common approach, in which the driver keeps all needed resources for the whole lifetime of the device. > @Rafael any suggestions on how to fix this ? Best regards
On 08-02-19, 09:12, Marek Szyprowski wrote: > On 2019-02-08 07:49, Viresh Kumar wrote: > > Why don't you get similar problem during suspend? I think you can get > > it when the CPUs are offlined as I2C would have gone by then. The > > cpufreq or OPP core can try and run some regulator or genpd or clk > > calls to disable resources, etc. Even if doesn't happen, it certainly > > can. > > CPUfreq is suspended very early during system suspend and thus it does > nothing when CPUs are being offlined. > > Also at resume the cpufreq core may try to change the frequency right > > from ->init() on certain cases, though not everytime and so the > > problem can come despite of this series. > > cpufreq is still in suspended state (it is being 'resume' very late in > the system resume procedure), so if driver doesn't explicitly change any > opp in ->init(), then cpufreq core waits until everything is resumed. To > sum up, this seems to be fine, beside the issue with regulator > initialization I've addressed in this patchset. Yeah, the governors are suspended very soon, but any frequency change starting from cpufreq core can still happen. There are at least two points in cpufreq_online() where we may end up changing the frequency, but that is conditional and may not be getting hit. > > We guarantee that the resources are available during probe but not > > during resume, that's where the problem is. > > Yes, so I've changed cpufreq-dt to the common approach, in which the > driver keeps all needed resources for the whole lifetime of the device. That's not what I was saying actually. I was saying that it should be fine to do a I2C transfer during resume, else we will always have problems and have to fix them with hacks like the one you proposed where you acquire resources for all the possible CPUs. Maybe we can fix it once and for all.
Hi Viresh, On 2019-02-08 09:55, Viresh Kumar wrote: > On 08-02-19, 09:12, Marek Szyprowski wrote: >> On 2019-02-08 07:49, Viresh Kumar wrote: >>> Why don't you get similar problem during suspend? I think you can get >>> it when the CPUs are offlined as I2C would have gone by then. The >>> cpufreq or OPP core can try and run some regulator or genpd or clk >>> calls to disable resources, etc. Even if doesn't happen, it certainly >>> can. >> CPUfreq is suspended very early during system suspend and thus it does >> nothing when CPUs are being offlined. >>> Also at resume the cpufreq core may try to change the frequency right >>> from ->init() on certain cases, though not everytime and so the >>> problem can come despite of this series. >> cpufreq is still in suspended state (it is being 'resume' very late in >> the system resume procedure), so if driver doesn't explicitly change any >> opp in ->init(), then cpufreq core waits until everything is resumed. To >> sum up, this seems to be fine, beside the issue with regulator >> initialization I've addressed in this patchset. > Yeah, the governors are suspended very soon, but any frequency change > starting from cpufreq core can still happen. There are at least two > points in cpufreq_online() where we may end up changing the frequency, > but that is conditional and may not be getting hit. Then probably cpufreq core suspend should handle this. >>> We guarantee that the resources are available during probe but not >>> during resume, that's where the problem is. >> Yes, so I've changed cpufreq-dt to the common approach, in which the >> driver keeps all needed resources for the whole lifetime of the device. > That's not what I was saying actually. I was saying that it should be > fine to do a I2C transfer during resume, else we will always have > problems and have to fix them with hacks like the one you proposed > where you acquire resources for all the possible CPUs. Maybe we can > fix it once and for all. It is fine to do i2c transfers during cpufreq resume (see drivers/base/power/main.c dpm_resume() function for exact call place). The problem is that such calls are not allowed in ->init(), which might be called very early from CPU hotplug path (CPUs are resumed in the first step of system resume procedure). What's wrong with my proposed fix? It is not that uncommon to gather all resources in probe() and keep them until remove() happens. Best regards
On 08-02-19, 10:15, Marek Szyprowski wrote: > On 2019-02-08 09:55, Viresh Kumar wrote: > > On 08-02-19, 09:12, Marek Szyprowski wrote: > >> On 2019-02-08 07:49, Viresh Kumar wrote: > >>> Why don't you get similar problem during suspend? I think you can get > >>> it when the CPUs are offlined as I2C would have gone by then. The > >>> cpufreq or OPP core can try and run some regulator or genpd or clk > >>> calls to disable resources, etc. Even if doesn't happen, it certainly > >>> can. > >> CPUfreq is suspended very early during system suspend and thus it does > >> nothing when CPUs are being offlined. > >>> Also at resume the cpufreq core may try to change the frequency right > >>> from ->init() on certain cases, though not everytime and so the > >>> problem can come despite of this series. > >> cpufreq is still in suspended state (it is being 'resume' very late in > >> the system resume procedure), so if driver doesn't explicitly change any > >> opp in ->init(), then cpufreq core waits until everything is resumed. To > >> sum up, this seems to be fine, beside the issue with regulator > >> initialization I've addressed in this patchset. > > Yeah, the governors are suspended very soon, but any frequency change > > starting from cpufreq core can still happen. There are at least two > > points in cpufreq_online() where we may end up changing the frequency, > > but that is conditional and may not be getting hit. > > Then probably cpufreq core suspend should handle this. Handle what ? That code is part of cpufreq_online() and needs to be there only. > >>> We guarantee that the resources are available during probe but not > >>> during resume, that's where the problem is. > >> Yes, so I've changed cpufreq-dt to the common approach, in which the > >> driver keeps all needed resources for the whole lifetime of the device. > > That's not what I was saying actually. I was saying that it should be > > fine to do a I2C transfer during resume, else we will always have > > problems and have to fix them with hacks like the one you proposed > > where you acquire resources for all the possible CPUs. Maybe we can > > fix it once and for all. > > It is fine to do i2c transfers during cpufreq resume (see By resume I meant system resume and the whole onlining process of non-boot CPUs. > drivers/base/power/main.c dpm_resume() function for exact call place). > The problem is that such calls are not allowed in ->init(), which might > be called very early from CPU hotplug path (CPUs are resumed in the > first step of system resume procedure). Right and that's where I think we can do something to fix it in a proper way. > What's wrong with my proposed fix? It is not that uncommon to gather all > resources in probe() and keep them until remove() happens. For cpufreq drivers, we must be doing most of the stuff in init/exit only as far as possible. I am not saying your patch is bad, that is the best we can do in such situations. But I don't like that we have to get the resources for all CPUs, despite the fact that many of them would be part of the same policy and hence share resources. The problem was that we need to get sharing-cpus detail as well in probe then, etc. I was thinking about doing disable_nonboot_cpus() much earlier as the userspace is already frozen by then. @Rafael: Will that slowdown the suspend/resume process? Maybe not as we are doing everything from a single CPU/thread anyways ?
Hi Viresh, On 2019-02-08 10:23, Viresh Kumar wrote: > On 08-02-19, 10:15, Marek Szyprowski wrote: >> On 2019-02-08 09:55, Viresh Kumar wrote: >>> On 08-02-19, 09:12, Marek Szyprowski wrote: >>>> On 2019-02-08 07:49, Viresh Kumar wrote: >>>>> Why don't you get similar problem during suspend? I think you can get >>>>> it when the CPUs are offlined as I2C would have gone by then. The >>>>> cpufreq or OPP core can try and run some regulator or genpd or clk >>>>> calls to disable resources, etc. Even if doesn't happen, it certainly >>>>> can. >>>> CPUfreq is suspended very early during system suspend and thus it does >>>> nothing when CPUs are being offlined. >>>>> Also at resume the cpufreq core may try to change the frequency right >>>>> from ->init() on certain cases, though not everytime and so the >>>>> problem can come despite of this series. >>>> cpufreq is still in suspended state (it is being 'resume' very late in >>>> the system resume procedure), so if driver doesn't explicitly change any >>>> opp in ->init(), then cpufreq core waits until everything is resumed. To >>>> sum up, this seems to be fine, beside the issue with regulator >>>> initialization I've addressed in this patchset. >>> Yeah, the governors are suspended very soon, but any frequency change >>> starting from cpufreq core can still happen. There are at least two >>> points in cpufreq_online() where we may end up changing the frequency, >>> but that is conditional and may not be getting hit. >> Then probably cpufreq core suspend should handle this. > Handle what ? That code is part of cpufreq_online() and needs to be > there only. If got it right, it is a matter of handling CPUFREQ_NEED_INITIAL_FREQ_CHECK flag. Maybe there should be additional check if CPUfreq is not suspended? >>>>> We guarantee that the resources are available during probe but not >>>>> during resume, that's where the problem is. >>>> Yes, so I've changed cpufreq-dt to the common approach, in which the >>>> driver keeps all needed resources for the whole lifetime of the device. >>> That's not what I was saying actually. I was saying that it should be >>> fine to do a I2C transfer during resume, else we will always have >>> problems and have to fix them with hacks like the one you proposed >>> where you acquire resources for all the possible CPUs. Maybe we can >>> fix it once and for all. >> It is fine to do i2c transfers during cpufreq resume (see > By resume I meant system resume and the whole onlining process of > non-boot CPUs. Right now those are 2 separate things in cpufreq core. >> drivers/base/power/main.c dpm_resume() function for exact call place). >> The problem is that such calls are not allowed in ->init(), which might >> be called very early from CPU hotplug path (CPUs are resumed in the >> first step of system resume procedure). > Right and that's where I think we can do something to fix it in a > proper way. > >> What's wrong with my proposed fix? It is not that uncommon to gather all >> resources in probe() and keep them until remove() happens. > For cpufreq drivers, we must be doing most of the stuff in init/exit > only as far as possible. I am not saying your patch is bad, that is > the best we can do in such situations. But I don't like that we have > to get the resources for all CPUs, despite the fact that many of them > would be part of the same policy and hence share resources. The > problem was that we need to get sharing-cpus detail as well in probe > then, etc. Both resources in this case: clocks and regulators are refcounted by their frameworks, so the cost of getting a few more references for them is imho negligible. > I was thinking about doing disable_nonboot_cpus() much earlier as the > userspace is already frozen by then. > > @Rafael: Will that slowdown the suspend/resume process? Maybe not as > we are doing everything from a single CPU/thread anyways ? For some reasons drivers are handled partially asynchronously in suspend/resume procedure and can do suspend and resume in parallel. I don't think that limiting the whole suspend/resume process to a single cpu is the best we can do. Best regards
On Fri, Feb 8, 2019 at 10:23 AM Viresh Kumar <viresh.kumar@linaro.org> wrote: > > On 08-02-19, 10:15, Marek Szyprowski wrote: > > On 2019-02-08 09:55, Viresh Kumar wrote: > > > On 08-02-19, 09:12, Marek Szyprowski wrote: > > >> On 2019-02-08 07:49, Viresh Kumar wrote: > > >>> Why don't you get similar problem during suspend? I think you can get > > >>> it when the CPUs are offlined as I2C would have gone by then. The > > >>> cpufreq or OPP core can try and run some regulator or genpd or clk > > >>> calls to disable resources, etc. Even if doesn't happen, it certainly > > >>> can. > > >> CPUfreq is suspended very early during system suspend and thus it does > > >> nothing when CPUs are being offlined. > > >>> Also at resume the cpufreq core may try to change the frequency right > > >>> from ->init() on certain cases, though not everytime and so the > > >>> problem can come despite of this series. > > >> cpufreq is still in suspended state (it is being 'resume' very late in > > >> the system resume procedure), so if driver doesn't explicitly change any > > >> opp in ->init(), then cpufreq core waits until everything is resumed. To > > >> sum up, this seems to be fine, beside the issue with regulator > > >> initialization I've addressed in this patchset. > > > Yeah, the governors are suspended very soon, but any frequency change > > > starting from cpufreq core can still happen. There are at least two > > > points in cpufreq_online() where we may end up changing the frequency, > > > but that is conditional and may not be getting hit. > > > > Then probably cpufreq core suspend should handle this. > > Handle what ? That code is part of cpufreq_online() and needs to be > there only. > > > >>> We guarantee that the resources are available during probe but not > > >>> during resume, that's where the problem is. > > >> Yes, so I've changed cpufreq-dt to the common approach, in which the > > >> driver keeps all needed resources for the whole lifetime of the device. > > > That's not what I was saying actually. I was saying that it should be > > > fine to do a I2C transfer during resume, else we will always have > > > problems and have to fix them with hacks like the one you proposed > > > where you acquire resources for all the possible CPUs. Maybe we can > > > fix it once and for all. > > > > It is fine to do i2c transfers during cpufreq resume (see > > By resume I meant system resume and the whole onlining process of > non-boot CPUs. > > > drivers/base/power/main.c dpm_resume() function for exact call place). > > The problem is that such calls are not allowed in ->init(), which might > > be called very early from CPU hotplug path (CPUs are resumed in the > > first step of system resume procedure). > > Right and that's where I think we can do something to fix it in a > proper way. > > > What's wrong with my proposed fix? It is not that uncommon to gather all > > resources in probe() and keep them until remove() happens. > > For cpufreq drivers, we must be doing most of the stuff in init/exit > only as far as possible. I am not saying your patch is bad, that is > the best we can do in such situations. But I don't like that we have > to get the resources for all CPUs, despite the fact that many of them > would be part of the same policy and hence share resources. The > problem was that we need to get sharing-cpus detail as well in probe > then, etc. > > I was thinking about doing disable_nonboot_cpus() much earlier as the > userspace is already frozen by then. > > @Rafael: Will that slowdown the suspend/resume process? Maybe not as > we are doing everything from a single CPU/thread anyways ? First, we used to do that and we switched over to what we have right now several years ago, because it didn't work reliably then. Arguably, CPU hotplug is in a much better shape now, so it might be working better, but that would be a huge change and lots of documentation would need to be revised. :-) Also it is not true that everything is done on a single CPU, but I'm not really sure about the possible slowdown. Second, there are (many) systems for which that change is not really necessary and it is risky because of possible regressions. I guess that CPUs depending on I2C for online/offline could be taken offline earlier and brought back online later during suspend/resume, like before/after the _noirq suspend of devices, but doing that on all systems altogether is almost guaranteed to introduce regressions.
On Fri, Feb 8, 2019 at 9:55 AM Viresh Kumar <viresh.kumar@linaro.org> wrote: > > On 08-02-19, 09:12, Marek Szyprowski wrote: > > On 2019-02-08 07:49, Viresh Kumar wrote: > > > Why don't you get similar problem during suspend? I think you can get > > > it when the CPUs are offlined as I2C would have gone by then. The > > > cpufreq or OPP core can try and run some regulator or genpd or clk > > > calls to disable resources, etc. Even if doesn't happen, it certainly > > > can. > > > > CPUfreq is suspended very early during system suspend and thus it does > > nothing when CPUs are being offlined. > > > > Also at resume the cpufreq core may try to change the frequency right > > > from ->init() on certain cases, though not everytime and so the > > > problem can come despite of this series. > > > > cpufreq is still in suspended state (it is being 'resume' very late in > > the system resume procedure), so if driver doesn't explicitly change any > > opp in ->init(), then cpufreq core waits until everything is resumed. To > > sum up, this seems to be fine, beside the issue with regulator > > initialization I've addressed in this patchset. > > Yeah, the governors are suspended very soon, but any frequency change > starting from cpufreq core can still happen. There are at least two > points in cpufreq_online() where we may end up changing the frequency, > but that is conditional and may not be getting hit. > > > > We guarantee that the resources are available during probe but not > > > during resume, that's where the problem is. > > > > Yes, so I've changed cpufreq-dt to the common approach, in which the > > driver keeps all needed resources for the whole lifetime of the device. > > That's not what I was saying actually. I was saying that it should be > fine to do a I2C transfer during resume, Surely not before resuming the I2C controller involved? > else we will always have > problems and have to fix them with hacks like the one you proposed > where you acquire resources for all the possible CPUs. Maybe we can > fix it once and for all. Obviously, all I2C transfers need to take place either before suspending the I2C controller or after resuming it.
On Fri, Feb 8, 2019 at 7:50 AM Viresh Kumar <viresh.kumar@linaro.org> wrote: > > On 07-02-19, 13:22, Marek Szyprowski wrote: > > Dear All, > > > > Recent commit 9ac6cb5fbb17 ("i2c: add suspended flag and accessors for > > i2c adapters") added a visible warning for an attempt to do i2c transfer > > over a suspended i2c bus. This revealed a long standing issue in the > > cpufreq-dt driver, which gives a following warning during system > > suspend/resume cycle: > > > > --->8--- > > Enabling non-boot CPUs ... > > CPU1 is up > > CPU2 is up > > CPU3 is up > > ------------[ cut here ]------------ > > WARNING: CPU: 4 PID: 29 at drivers/i2c/i2c-core-base.c:1869 __i2c_transfer+0x6f8/0xa50 > > Modules linked in: > > CPU: 4 PID: 29 Comm: cpuhp/4 Tainted: G W 5.0.0-rc4-next-20190131-00024-g54b06b29cc65 #5324 > > Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) > > [<c01110e8>] (unwind_backtrace) from [<c010d11c>] (show_stack+0x10/0x14) > > [<c010d11c>] (show_stack) from [<c09a2584>] (dump_stack+0x90/0xc8) > > [<c09a2584>] (dump_stack) from [<c0120bd0>] (__warn+0xf8/0x124) > > [<c0120bd0>] (__warn) from [<c0120c3c>] (warn_slowpath_null+0x40/0x48) > > [<c0120c3c>] (warn_slowpath_null) from [<c065cda0>] (__i2c_transfer+0x6f8/0xa50) > > [<c065cda0>] (__i2c_transfer) from [<c065d168>] (i2c_transfer+0x70/0xe4) > > [<c065d168>] (i2c_transfer) from [<c053ce4c>] (regmap_i2c_read+0x48/0x64) > > [<c053ce4c>] (regmap_i2c_read) from [<c0536f1c>] (_regmap_raw_read+0xf8/0x450) > > [<c0536f1c>] (_regmap_raw_read) from [<c053772c>] (_regmap_bus_read+0x38/0x68) > > [<c053772c>] (_regmap_bus_read) from [<c05365a0>] (_regmap_read+0x60/0x250) > > [<c05365a0>] (_regmap_read) from [<c05367cc>] (regmap_read+0x3c/0x5c) > > [<c05367cc>] (regmap_read) from [<c047cfc0>] (regulator_is_enabled_regmap+0x20/0x90) > > [<c047cfc0>] (regulator_is_enabled_regmap) from [<c0477660>] (_regulator_is_enabled+0x34/0x40) > > [<c0477660>] (_regulator_is_enabled) from [<c0478674>] (create_regulator+0x1a4/0x25c) > > [<c0478674>] (create_regulator) from [<c047c818>] (_regulator_get+0xe4/0x278) > > [<c047c818>] (_regulator_get) from [<c068f1dc>] (dev_pm_opp_set_regulators+0xa0/0x1c0) > > [<c068f1dc>] (dev_pm_opp_set_regulators) from [<c0698cc8>] (cpufreq_init+0x98/0x2d0) > > [<c0698cc8>] (cpufreq_init) from [<c06959e4>] (cpufreq_online+0xc8/0x71c) > > [<c06959e4>] (cpufreq_online) from [<c06960fc>] (cpuhp_cpufreq_online+0x8/0x10) > > [<c06960fc>] (cpuhp_cpufreq_online) from [<c01213d4>] (cpuhp_invoke_callback+0xf4/0xebc) > > [<c01213d4>] (cpuhp_invoke_callback) from [<c0122e4c>] (cpuhp_thread_fun+0x1d8/0x320) > > [<c0122e4c>] (cpuhp_thread_fun) from [<c0149858>] (smpboot_thread_fn+0x194/0x340) > > [<c0149858>] (smpboot_thread_fn) from [<c014573c>] (kthread+0x124/0x160) > > [<c014573c>] (kthread) from [<c01010b4>] (ret_from_fork+0x14/0x20) > > Exception stack(0xe897dfb0 to 0xe897dff8) > > dfa0: 00000000 00000000 00000000 00000000 > > dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 > > dfe0: 00000000 00000000 00000000 00000000 00000013 00000000 > > irq event stamp: 3865 > > hardirqs last enabled at (3873): [<c0186dec>] vprintk_emit+0x228/0x2a4 > > hardirqs last disabled at (3880): [<c0186cf0>] vprintk_emit+0x12c/0x2a4 > > softirqs last enabled at (3052): [<c0102564>] __do_softirq+0x3a4/0x66c > > softirqs last disabled at (3043): [<c0128464>] irq_exit+0x140/0x168 > > ---[ end trace db48b455d924fec2 ]--- > > CPU4 is up > > CPU5 is up > > CPU6 is up > > CPU7 is up > > --->8--- > > > > This is a scenario that triggers the above issue: > > > > 1. system disables non-boot cpu's at the end of system suspend procedure, > > 2. this in turn deinitializes cpufreq drivers for the disabled cpus, > > 3. early in the system resume procedure all cpus are got back to online > > state, > > 4. this in turn causes cpufreq to be initialized for the newly onlined > > cpus, > > 5. cpufreq-dt acquires all its resources (clocks, regulators) during > > ->init() callback, > > 6. getting regulator require to check its state, what in turn requires > > i2c transfer, > > 7. during system early resume stage this is not really possible. > > > > The issue is caused by cpufreq-dt driver not keeping its resources for > > the whole driver lifetime and relying that they can be always acquired > > at any system context. This problem has been observed on Samsung > > Exynos based Odroid XU3/XU4 boards, but it happens on all boards, which > > have separate regulators for different CPU clusters. > > Why don't you get similar problem during suspend? I think you can get > it when the CPUs are offlined as I2C would have gone by then. The > cpufreq or OPP core can try and run some regulator or genpd or clk > calls to disable resources, etc. Even if doesn't happen, it certainly > can. > > Also at resume the cpufreq core may try to change the frequency right > from ->init() on certain cases, though not everytime and so the > problem can come despite of this series. > > We guarantee that the resources are available during probe but not > during resume, that's where the problem is. > > @Rafael any suggestions on how to fix this ? There are cpufreq driver suspend and resume callbacks, maybe use them? The driver could do the I2C transactions in its suspend/resume callbacks and do nothing in online/offline if those are part of system-wide suspend/resume.
On 08-02-19, 11:18, Rafael J. Wysocki wrote: > Obviously, all I2C transfers need to take place either before > suspending the I2C controller or after resuming it. Right. But all I am saying is that it is the responsibility of the cpufreq core/driver to make sure we call ->init() only when the resources are available to be used. For example during driver load, we defer probe if resources aren't available and that makes sure ->init() gets called when we can actually change the frequency. Shouldn't the same be true during resume? That is make sure all resources are in place before ->init() gets called ?
Hi Rafael, On 2019-02-08 11:22, Rafael J. Wysocki wrote: > On Fri, Feb 8, 2019 at 7:50 AM Viresh Kumar <viresh.kumar@linaro.org> wrote: >> On 07-02-19, 13:22, Marek Szyprowski wrote: >>> Dear All, >>> >>> Recent commit 9ac6cb5fbb17 ("i2c: add suspended flag and accessors for >>> i2c adapters") added a visible warning for an attempt to do i2c transfer >>> over a suspended i2c bus. This revealed a long standing issue in the >>> cpufreq-dt driver, which gives a following warning during system >>> suspend/resume cycle: >>> >>> --->8--- >>> Enabling non-boot CPUs ... >>> CPU1 is up >>> CPU2 is up >>> CPU3 is up >>> ------------[ cut here ]------------ >>> WARNING: CPU: 4 PID: 29 at drivers/i2c/i2c-core-base.c:1869 __i2c_transfer+0x6f8/0xa50 >>> Modules linked in: >>> CPU: 4 PID: 29 Comm: cpuhp/4 Tainted: G W 5.0.0-rc4-next-20190131-00024-g54b06b29cc65 #5324 >>> Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) >>> [<c01110e8>] (unwind_backtrace) from [<c010d11c>] (show_stack+0x10/0x14) >>> [<c010d11c>] (show_stack) from [<c09a2584>] (dump_stack+0x90/0xc8) >>> [<c09a2584>] (dump_stack) from [<c0120bd0>] (__warn+0xf8/0x124) >>> [<c0120bd0>] (__warn) from [<c0120c3c>] (warn_slowpath_null+0x40/0x48) >>> [<c0120c3c>] (warn_slowpath_null) from [<c065cda0>] (__i2c_transfer+0x6f8/0xa50) >>> [<c065cda0>] (__i2c_transfer) from [<c065d168>] (i2c_transfer+0x70/0xe4) >>> [<c065d168>] (i2c_transfer) from [<c053ce4c>] (regmap_i2c_read+0x48/0x64) >>> [<c053ce4c>] (regmap_i2c_read) from [<c0536f1c>] (_regmap_raw_read+0xf8/0x450) >>> [<c0536f1c>] (_regmap_raw_read) from [<c053772c>] (_regmap_bus_read+0x38/0x68) >>> [<c053772c>] (_regmap_bus_read) from [<c05365a0>] (_regmap_read+0x60/0x250) >>> [<c05365a0>] (_regmap_read) from [<c05367cc>] (regmap_read+0x3c/0x5c) >>> [<c05367cc>] (regmap_read) from [<c047cfc0>] (regulator_is_enabled_regmap+0x20/0x90) >>> [<c047cfc0>] (regulator_is_enabled_regmap) from [<c0477660>] (_regulator_is_enabled+0x34/0x40) >>> [<c0477660>] (_regulator_is_enabled) from [<c0478674>] (create_regulator+0x1a4/0x25c) >>> [<c0478674>] (create_regulator) from [<c047c818>] (_regulator_get+0xe4/0x278) >>> [<c047c818>] (_regulator_get) from [<c068f1dc>] (dev_pm_opp_set_regulators+0xa0/0x1c0) >>> [<c068f1dc>] (dev_pm_opp_set_regulators) from [<c0698cc8>] (cpufreq_init+0x98/0x2d0) >>> [<c0698cc8>] (cpufreq_init) from [<c06959e4>] (cpufreq_online+0xc8/0x71c) >>> [<c06959e4>] (cpufreq_online) from [<c06960fc>] (cpuhp_cpufreq_online+0x8/0x10) >>> [<c06960fc>] (cpuhp_cpufreq_online) from [<c01213d4>] (cpuhp_invoke_callback+0xf4/0xebc) >>> [<c01213d4>] (cpuhp_invoke_callback) from [<c0122e4c>] (cpuhp_thread_fun+0x1d8/0x320) >>> [<c0122e4c>] (cpuhp_thread_fun) from [<c0149858>] (smpboot_thread_fn+0x194/0x340) >>> [<c0149858>] (smpboot_thread_fn) from [<c014573c>] (kthread+0x124/0x160) >>> [<c014573c>] (kthread) from [<c01010b4>] (ret_from_fork+0x14/0x20) >>> Exception stack(0xe897dfb0 to 0xe897dff8) >>> dfa0: 00000000 00000000 00000000 00000000 >>> dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 >>> dfe0: 00000000 00000000 00000000 00000000 00000013 00000000 >>> irq event stamp: 3865 >>> hardirqs last enabled at (3873): [<c0186dec>] vprintk_emit+0x228/0x2a4 >>> hardirqs last disabled at (3880): [<c0186cf0>] vprintk_emit+0x12c/0x2a4 >>> softirqs last enabled at (3052): [<c0102564>] __do_softirq+0x3a4/0x66c >>> softirqs last disabled at (3043): [<c0128464>] irq_exit+0x140/0x168 >>> ---[ end trace db48b455d924fec2 ]--- >>> CPU4 is up >>> CPU5 is up >>> CPU6 is up >>> CPU7 is up >>> --->8--- >>> >>> This is a scenario that triggers the above issue: >>> >>> 1. system disables non-boot cpu's at the end of system suspend procedure, >>> 2. this in turn deinitializes cpufreq drivers for the disabled cpus, >>> 3. early in the system resume procedure all cpus are got back to online >>> state, >>> 4. this in turn causes cpufreq to be initialized for the newly onlined >>> cpus, >>> 5. cpufreq-dt acquires all its resources (clocks, regulators) during >>> ->init() callback, >>> 6. getting regulator require to check its state, what in turn requires >>> i2c transfer, >>> 7. during system early resume stage this is not really possible. >>> >>> The issue is caused by cpufreq-dt driver not keeping its resources for >>> the whole driver lifetime and relying that they can be always acquired >>> at any system context. This problem has been observed on Samsung >>> Exynos based Odroid XU3/XU4 boards, but it happens on all boards, which >>> have separate regulators for different CPU clusters. >> Why don't you get similar problem during suspend? I think you can get >> it when the CPUs are offlined as I2C would have gone by then. The >> cpufreq or OPP core can try and run some regulator or genpd or clk >> calls to disable resources, etc. Even if doesn't happen, it certainly >> can. >> >> Also at resume the cpufreq core may try to change the frequency right >> from ->init() on certain cases, though not everytime and so the >> problem can come despite of this series. >> >> We guarantee that the resources are available during probe but not >> during resume, that's where the problem is. >> >> @Rafael any suggestions on how to fix this ? > There are cpufreq driver suspend and resume callbacks, maybe use them? > > The driver could do the I2C transactions in its suspend/resume > callbacks and do nothing in online/offline if those are part of > system-wide suspend/resume. This is exactly what I suggested, when I wrote to handle it in cpufreq suspend/resume. Best regards
On 08-02-19, 11:22, Rafael J. Wysocki wrote: > There are cpufreq driver suspend and resume callbacks, maybe use them? > > The driver could do the I2C transactions in its suspend/resume > callbacks and do nothing in online/offline if those are part of > system-wide suspend/resume. These are per-policy things that we need to do, not sure if driver suspend/resume is a good place for that. It is more for a case where CPU 0-3 are in one policy and 4-7 in another. Now 1-7 are hot-unplugged during system suspend and hotplugged later on. This is more like complete removal/addition of devices instead of suspend/resume.
On Fri, Feb 8, 2019 at 11:31 AM Viresh Kumar <viresh.kumar@linaro.org> wrote: > > On 08-02-19, 11:22, Rafael J. Wysocki wrote: > > There are cpufreq driver suspend and resume callbacks, maybe use them? > > > > The driver could do the I2C transactions in its suspend/resume > > callbacks and do nothing in online/offline if those are part of > > system-wide suspend/resume. > > These are per-policy things that we need to do, not sure if driver > suspend/resume is a good place for that. It is more for a case where > CPU 0-3 are in one policy and 4-7 in another. Now 1-7 are > hot-unplugged during system suspend and hotplugged later on. This is > more like complete removal/addition of devices instead of > suspend/resume. No, it isn't. We don't remove devices on offline. We migrate stuff away from them and (opportunistically) power them down. If this is system suspend, the driver kind of knows that offline will take place, so it can prepare for it. Likewise, when online takes place during system-wide resume, it generally is known that this is system-wide resume (there is a flag to indicate that in CPU hotplug), it can be "smart" and avoid accessing suspended devices. Deferring the frequency set up until the driver resume time should do the trick I suppose.
On Fri, Feb 8, 2019 at 11:42 AM Rafael J. Wysocki <rafael@kernel.org> wrote: > > On Fri, Feb 8, 2019 at 11:31 AM Viresh Kumar <viresh.kumar@linaro.org> wrote: > > > > On 08-02-19, 11:22, Rafael J. Wysocki wrote: > > > There are cpufreq driver suspend and resume callbacks, maybe use them? > > > > > > The driver could do the I2C transactions in its suspend/resume > > > callbacks and do nothing in online/offline if those are part of > > > system-wide suspend/resume. > > > > These are per-policy things that we need to do, not sure if driver > > suspend/resume is a good place for that. It is more for a case where > > CPU 0-3 are in one policy and 4-7 in another. Now 1-7 are > > hot-unplugged during system suspend and hotplugged later on. This is > > more like complete removal/addition of devices instead of > > suspend/resume. > > No, it isn't. We don't remove devices on offline. We migrate stuff > away from them and (opportunistically) power them down. > > If this is system suspend, the driver kind of knows that offline will > take place, so it can prepare for it. Likewise, when online takes > place during system-wide resume, it generally is known that this is > system-wide resume (there is a flag to indicate that in CPU hotplug), > it can be "smart" and avoid accessing suspended devices. Deferring > the frequency set up until the driver resume time should do the trick > I suppose. BTW, do transitions to idle states involve I2C transactions on the systems in question?
On Thu, Feb 07, 2019 at 01:22:25PM +0100, Marek Szyprowski wrote: > Dear All, > > This is a scenario that triggers the above issue: > [...] > 1. system disables non-boot cpu's at the end of system suspend procedure, > 2. this in turn deinitializes cpufreq drivers for the disabled cpus, > 3. early in the system resume procedure all cpus are got back to online > state, > 4. this in turn causes cpufreq to be initialized for the newly onlined > cpus, > 5. cpufreq-dt acquires all its resources (clocks, regulators) during > ->init() callback, This is strictly not just restricted to cpufreq-dt, but to any driver supporting multiple policies. So we need a generic fix not just cpufreq-dt specific. -- Regards, Sudeep
On Fri, Feb 08, 2019 at 11:42:20AM +0100, Rafael J. Wysocki wrote: > On Fri, Feb 8, 2019 at 11:31 AM Viresh Kumar <viresh.kumar@linaro.org> wrote: > > > > On 08-02-19, 11:22, Rafael J. Wysocki wrote: > > > There are cpufreq driver suspend and resume callbacks, maybe use them? > > > > > > The driver could do the I2C transactions in its suspend/resume > > > callbacks and do nothing in online/offline if those are part of > > > system-wide suspend/resume. > > > > These are per-policy things that we need to do, not sure if driver > > suspend/resume is a good place for that. It is more for a case where > > CPU 0-3 are in one policy and 4-7 in another. Now 1-7 are > > hot-unplugged during system suspend and hotplugged later on. This is > > more like complete removal/addition of devices instead of > > suspend/resume. > > No, it isn't. We don't remove devices on offline. We migrate stuff > away from them and (opportunistically) power them down. > > If this is system suspend, the driver kind of knows that offline will > take place, so it can prepare for it. Likewise, when online takes > place during system-wide resume, it generally is known that this is > system-wide resume (there is a flag to indicate that in CPU hotplug), > it can be "smart" and avoid accessing suspended devices. Deferring > the frequency set up until the driver resume time should do the trick > I suppose. I agree. The reason we don't see this generally on boot is because all the CPUs are brought online before CPUfreq is initialised. While during system suspend, we call cpufreq_online which in turn calls ->init in the hotplug state machine. So as Rafael suggests we need to do some trick, but can it be done in the core itself ? I may be missing something, but how about the patch below: Regards, Sudeep -- diff --git i/drivers/cpufreq/cpufreq.c w/drivers/cpufreq/cpufreq.c index e35a886e00bc..7d8b0b99f91d 100644 --- i/drivers/cpufreq/cpufreq.c +++ w/drivers/cpufreq/cpufreq.c @@ -1241,7 +1241,8 @@ static int cpufreq_online(unsigned int cpu) policy->max = policy->user_policy.max; } - if (cpufreq_driver->get && !cpufreq_driver->setpolicy) { + if (cpufreq_driver->get && !cpufreq_driver->setpolicy && + !cpufreq_suspended) { policy->cur = cpufreq_driver->get(policy->cpu); if (!policy->cur) { pr_err("%s: ->get() failed\n", __func__); @@ -1702,6 +1703,11 @@ void cpufreq_resume(void) pr_err("%s: Failed to start governor for policy: %p\n", __func__, policy); } + policy->cur = cpufreq_driver->get(policy->cpu); + if (!policy->cur) { + pr_err("%s: ->get() failed\n", __func__); + goto out_destroy_policy; + } } }
Hi Sudeep, On 2019-02-08 12:00, Sudeep Holla wrote: > On Thu, Feb 07, 2019 at 01:22:25PM +0100, Marek Szyprowski wrote: >> Dear All, >> >> This is a scenario that triggers the above issue: > [...] >> 1. system disables non-boot cpu's at the end of system suspend procedure, >> 2. this in turn deinitializes cpufreq drivers for the disabled cpus, >> 3. early in the system resume procedure all cpus are got back to online >> state, >> 4. this in turn causes cpufreq to be initialized for the newly onlined >> cpus, >> 5. cpufreq-dt acquires all its resources (clocks, regulators) during >> ->init() callback, > This is strictly not just restricted to cpufreq-dt, but to any driver > supporting multiple policies. So we need a generic fix not just > cpufreq-dt specific. Could you point which other driver needs similar fix? Here in cpufreq-dt the problem was caused by using regulator api (indirectly) from ->init(). All other drivers, which have regulators support, are for old, obsolete, uni-processor systems, which don't have the problem of secondary cpu suspend during system suspend/resume cycle. Best regards
On Fri, Feb 08, 2019 at 12:47:06PM +0100, Marek Szyprowski wrote: > Hi Sudeep, > > On 2019-02-08 12:00, Sudeep Holla wrote: > > On Thu, Feb 07, 2019 at 01:22:25PM +0100, Marek Szyprowski wrote: > >> Dear All, > >> > >> This is a scenario that triggers the above issue: > > [...] > >> 1. system disables non-boot cpu's at the end of system suspend procedure, > >> 2. this in turn deinitializes cpufreq drivers for the disabled cpus, > >> 3. early in the system resume procedure all cpus are got back to online > >> state, > >> 4. this in turn causes cpufreq to be initialized for the newly onlined > >> cpus, > >> 5. cpufreq-dt acquires all its resources (clocks, regulators) during > >> ->init() callback, > > This is strictly not just restricted to cpufreq-dt, but to any driver > > supporting multiple policies. So we need a generic fix not just > > cpufreq-dt specific. > > Could you point which other driver needs similar fix? Here in cpufreq-dt > the problem was caused by using regulator api (indirectly) from > ->init(). All other drivers, which have regulators support, are for old, > obsolete, uni-processor systems, which don't have the problem of > secondary cpu suspend during system suspend/resume cycle. > scmi_cpufreq for instance. We can fix that in driver my moving to polling to get cpufreq_get_rate, but we support both polling and interrupt based. We may wait for remote processor interrupt in get_rate. -- Regards, Sudeep
On Fri, Feb 8, 2019 at 12:39 PM Sudeep Holla <sudeep.holla@arm.com> wrote: > > On Fri, Feb 08, 2019 at 11:42:20AM +0100, Rafael J. Wysocki wrote: > > On Fri, Feb 8, 2019 at 11:31 AM Viresh Kumar <viresh.kumar@linaro.org> wrote: > > > > > > On 08-02-19, 11:22, Rafael J. Wysocki wrote: > > > > There are cpufreq driver suspend and resume callbacks, maybe use them? > > > > > > > > The driver could do the I2C transactions in its suspend/resume > > > > callbacks and do nothing in online/offline if those are part of > > > > system-wide suspend/resume. > > > > > > These are per-policy things that we need to do, not sure if driver > > > suspend/resume is a good place for that. It is more for a case where > > > CPU 0-3 are in one policy and 4-7 in another. Now 1-7 are > > > hot-unplugged during system suspend and hotplugged later on. This is > > > more like complete removal/addition of devices instead of > > > suspend/resume. > > > > No, it isn't. We don't remove devices on offline. We migrate stuff > > away from them and (opportunistically) power them down. > > > > If this is system suspend, the driver kind of knows that offline will > > take place, so it can prepare for it. Likewise, when online takes > > place during system-wide resume, it generally is known that this is > > system-wide resume (there is a flag to indicate that in CPU hotplug), > > it can be "smart" and avoid accessing suspended devices. Deferring > > the frequency set up until the driver resume time should do the trick > > I suppose. > > I agree. The reason we don't see this generally on boot is because all > the CPUs are brought online before CPUfreq is initialised. While during > system suspend, we call cpufreq_online which in turn calls ->init in > the hotplug state machine. > > So as Rafael suggests we need to do some trick, but can it be done in > the core itself ? I may be missing something, but how about the patch > below: > > Regards, > Sudeep > > -- > diff --git i/drivers/cpufreq/cpufreq.c w/drivers/cpufreq/cpufreq.c > index e35a886e00bc..7d8b0b99f91d 100644 > --- i/drivers/cpufreq/cpufreq.c > +++ w/drivers/cpufreq/cpufreq.c > @@ -1241,7 +1241,8 @@ static int cpufreq_online(unsigned int cpu) > policy->max = policy->user_policy.max; > } > > - if (cpufreq_driver->get && !cpufreq_driver->setpolicy) { > + if (cpufreq_driver->get && !cpufreq_driver->setpolicy && > + !cpufreq_suspended) { > policy->cur = cpufreq_driver->get(policy->cpu); > if (!policy->cur) { > pr_err("%s: ->get() failed\n", __func__); It looks like we need to skip the "initial freq check" block below. Also this doesn't really help the case when the driver ->init() messes up with things. > @@ -1702,6 +1703,11 @@ void cpufreq_resume(void) > pr_err("%s: Failed to start governor for policy: %p\n", > __func__, policy); > } > + policy->cur = cpufreq_driver->get(policy->cpu); > + if (!policy->cur) { > + pr_err("%s: ->get() failed\n", __func__); > + goto out_destroy_policy; > + } > } > } >
Hi Sudeep, On 2019-02-08 12:51, Sudeep Holla wrote: > On Fri, Feb 08, 2019 at 12:47:06PM +0100, Marek Szyprowski wrote: >> On 2019-02-08 12:00, Sudeep Holla wrote: >>> On Thu, Feb 07, 2019 at 01:22:25PM +0100, Marek Szyprowski wrote: >>>> Dear All, >>>> >>>> This is a scenario that triggers the above issue: >>> [...] >>>> 1. system disables non-boot cpu's at the end of system suspend procedure, >>>> 2. this in turn deinitializes cpufreq drivers for the disabled cpus, >>>> 3. early in the system resume procedure all cpus are got back to online >>>> state, >>>> 4. this in turn causes cpufreq to be initialized for the newly onlined >>>> cpus, >>>> 5. cpufreq-dt acquires all its resources (clocks, regulators) during >>>> ->init() callback, >>> This is strictly not just restricted to cpufreq-dt, but to any driver >>> supporting multiple policies. So we need a generic fix not just >>> cpufreq-dt specific. >> Could you point which other driver needs similar fix? Here in cpufreq-dt >> the problem was caused by using regulator api (indirectly) from >> ->init(). All other drivers, which have regulators support, are for old, >> obsolete, uni-processor systems, which don't have the problem of >> secondary cpu suspend during system suspend/resume cycle. >> > scmi_cpufreq for instance. We can fix that in driver my moving to polling > to get cpufreq_get_rate, but we support both polling and interrupt based. > We may wait for remote processor interrupt in get_rate. Frankly, I don't feel I know enough to touch this driver and I don't think that this can even be fixed in a generic way in the cpufreq core. Maybe a comment somewhere is needed that ->init() might be called during early system resume with irqs off and driver is responsible for handling such case until the proper resume? Best regards
On Fri, Feb 08, 2019 at 01:03:10PM +0100, Rafael J. Wysocki wrote: > On Fri, Feb 8, 2019 at 12:39 PM Sudeep Holla <sudeep.holla@arm.com> wrote: > > > > On Fri, Feb 08, 2019 at 11:42:20AM +0100, Rafael J. Wysocki wrote: > > > On Fri, Feb 8, 2019 at 11:31 AM Viresh Kumar <viresh.kumar@linaro.org> wrote: > > > > > > > > On 08-02-19, 11:22, Rafael J. Wysocki wrote: > > > > > There are cpufreq driver suspend and resume callbacks, maybe use them? > > > > > > > > > > The driver could do the I2C transactions in its suspend/resume > > > > > callbacks and do nothing in online/offline if those are part of > > > > > system-wide suspend/resume. > > > > > > > > These are per-policy things that we need to do, not sure if driver > > > > suspend/resume is a good place for that. It is more for a case where > > > > CPU 0-3 are in one policy and 4-7 in another. Now 1-7 are > > > > hot-unplugged during system suspend and hotplugged later on. This is > > > > more like complete removal/addition of devices instead of > > > > suspend/resume. > > > > > > No, it isn't. We don't remove devices on offline. We migrate stuff > > > away from them and (opportunistically) power them down. > > > > > > If this is system suspend, the driver kind of knows that offline will > > > take place, so it can prepare for it. Likewise, when online takes > > > place during system-wide resume, it generally is known that this is > > > system-wide resume (there is a flag to indicate that in CPU hotplug), > > > it can be "smart" and avoid accessing suspended devices. Deferring > > > the frequency set up until the driver resume time should do the trick > > > I suppose. > > > > I agree. The reason we don't see this generally on boot is because all > > the CPUs are brought online before CPUfreq is initialised. While during > > system suspend, we call cpufreq_online which in turn calls ->init in > > the hotplug state machine. > > > > So as Rafael suggests we need to do some trick, but can it be done in > > the core itself ? I may be missing something, but how about the patch > > below: > > > > Regards, > > Sudeep > > > > -- > > diff --git i/drivers/cpufreq/cpufreq.c w/drivers/cpufreq/cpufreq.c > > index e35a886e00bc..7d8b0b99f91d 100644 > > --- i/drivers/cpufreq/cpufreq.c > > +++ w/drivers/cpufreq/cpufreq.c > > @@ -1241,7 +1241,8 @@ static int cpufreq_online(unsigned int cpu) > > policy->max = policy->user_policy.max; > > } > > > > - if (cpufreq_driver->get && !cpufreq_driver->setpolicy) { > > + if (cpufreq_driver->get && !cpufreq_driver->setpolicy && > > + !cpufreq_suspended) { > > policy->cur = cpufreq_driver->get(policy->cpu); > > if (!policy->cur) { > > pr_err("%s: ->get() failed\n", __func__); > > It looks like we need to skip the "initial freq check" block below. > Indeed, copy pasted an earlier version of diff. I found that I even used a goto label wrong which I fixed along with the additional check in "initial freq check" when I tried to compile :). > Also this doesn't really help the case when the driver ->init() messes > up with things. > Yes, in that case additional logic in the driver also needed. I am fine if we enforce driver to deal with this issue, but was thinking if we can make it generic. Also I was just trying to avoid adding _suspend/resume to driver just to avoid this issue. -- Regards, Sudeep
On Fri, Feb 8, 2019 at 1:04 PM Marek Szyprowski <m.szyprowski@samsung.com> wrote: > > Hi Sudeep, > > On 2019-02-08 12:51, Sudeep Holla wrote: > > On Fri, Feb 08, 2019 at 12:47:06PM +0100, Marek Szyprowski wrote: > >> On 2019-02-08 12:00, Sudeep Holla wrote: > >>> On Thu, Feb 07, 2019 at 01:22:25PM +0100, Marek Szyprowski wrote: > >>>> Dear All, > >>>> > >>>> This is a scenario that triggers the above issue: > >>> [...] > >>>> 1. system disables non-boot cpu's at the end of system suspend procedure, > >>>> 2. this in turn deinitializes cpufreq drivers for the disabled cpus, > >>>> 3. early in the system resume procedure all cpus are got back to online > >>>> state, > >>>> 4. this in turn causes cpufreq to be initialized for the newly onlined > >>>> cpus, > >>>> 5. cpufreq-dt acquires all its resources (clocks, regulators) during > >>>> ->init() callback, > >>> This is strictly not just restricted to cpufreq-dt, but to any driver > >>> supporting multiple policies. So we need a generic fix not just > >>> cpufreq-dt specific. > >> Could you point which other driver needs similar fix? Here in cpufreq-dt > >> the problem was caused by using regulator api (indirectly) from > >> ->init(). All other drivers, which have regulators support, are for old, > >> obsolete, uni-processor systems, which don't have the problem of > >> secondary cpu suspend during system suspend/resume cycle. > >> > > scmi_cpufreq for instance. We can fix that in driver my moving to polling > > to get cpufreq_get_rate, but we support both polling and interrupt based. > > We may wait for remote processor interrupt in get_rate. > > Frankly, I don't feel I know enough to touch this driver and I don't > think that this can even be fixed in a generic way in the cpufreq core. > Maybe a comment somewhere is needed that ->init() might be called during > early system resume with irqs off and driver is responsible for handling > such case until the proper resume? Well, adding a comment to that effect certainly won't hurt as that's how things work now. However, it looks like something needs to be done in addition to that.
On Fri, Feb 08, 2019 at 01:04:18PM +0100, Marek Szyprowski wrote: > Hi Sudeep, > > On 2019-02-08 12:51, Sudeep Holla wrote: > > On Fri, Feb 08, 2019 at 12:47:06PM +0100, Marek Szyprowski wrote: > >> On 2019-02-08 12:00, Sudeep Holla wrote: > >>> On Thu, Feb 07, 2019 at 01:22:25PM +0100, Marek Szyprowski wrote: > >>>> Dear All, > >>>> > >>>> This is a scenario that triggers the above issue: > >>> [...] > >>>> 1. system disables non-boot cpu's at the end of system suspend procedure, > >>>> 2. this in turn deinitializes cpufreq drivers for the disabled cpus, > >>>> 3. early in the system resume procedure all cpus are got back to online > >>>> state, > >>>> 4. this in turn causes cpufreq to be initialized for the newly onlined > >>>> cpus, > >>>> 5. cpufreq-dt acquires all its resources (clocks, regulators) during > >>>> ->init() callback, > >>> This is strictly not just restricted to cpufreq-dt, but to any driver > >>> supporting multiple policies. So we need a generic fix not just > >>> cpufreq-dt specific. > >> Could you point which other driver needs similar fix? Here in cpufreq-dt > >> the problem was caused by using regulator api (indirectly) from > >> ->init(). All other drivers, which have regulators support, are for old, > >> obsolete, uni-processor systems, which don't have the problem of > >> secondary cpu suspend during system suspend/resume cycle. > >> > > scmi_cpufreq for instance. We can fix that in driver my moving to polling > > to get cpufreq_get_rate, but we support both polling and interrupt based. > > We may wait for remote processor interrupt in get_rate. > > Frankly, I don't feel I know enough to touch this driver and I don't > think that this can even be fixed in a generic way in the cpufreq core. Why not ? IIUC it's only to get/set the frequency you would need to talk to remote processor or external chip over I2C which can be deferred until resume. Rafael has valid concerns on messed up init implementations, still wondering if there's any chance to solve this in the core. > Maybe a comment somewhere is needed that ->init() might be called during > early system resume with irqs off and driver is responsible for handling > such case until the proper resume? > I agree and +1 for comment, but every driver needs to deal with that, which is fine. Just trying to see if we can avoid it. -- Regards, Sudeep
On Fri, Feb 8, 2019 at 1:09 PM Sudeep Holla <sudeep.holla@arm.com> wrote: > > On Fri, Feb 08, 2019 at 01:03:10PM +0100, Rafael J. Wysocki wrote: > > On Fri, Feb 8, 2019 at 12:39 PM Sudeep Holla <sudeep.holla@arm.com> wrote: > > > > > > On Fri, Feb 08, 2019 at 11:42:20AM +0100, Rafael J. Wysocki wrote: > > > > On Fri, Feb 8, 2019 at 11:31 AM Viresh Kumar <viresh.kumar@linaro.org> wrote: > > > > > > > > > > On 08-02-19, 11:22, Rafael J. Wysocki wrote: > > > > > > There are cpufreq driver suspend and resume callbacks, maybe use them? > > > > > > > > > > > > The driver could do the I2C transactions in its suspend/resume > > > > > > callbacks and do nothing in online/offline if those are part of > > > > > > system-wide suspend/resume. > > > > > > > > > > These are per-policy things that we need to do, not sure if driver > > > > > suspend/resume is a good place for that. It is more for a case where > > > > > CPU 0-3 are in one policy and 4-7 in another. Now 1-7 are > > > > > hot-unplugged during system suspend and hotplugged later on. This is > > > > > more like complete removal/addition of devices instead of > > > > > suspend/resume. > > > > > > > > No, it isn't. We don't remove devices on offline. We migrate stuff > > > > away from them and (opportunistically) power them down. > > > > > > > > If this is system suspend, the driver kind of knows that offline will > > > > take place, so it can prepare for it. Likewise, when online takes > > > > place during system-wide resume, it generally is known that this is > > > > system-wide resume (there is a flag to indicate that in CPU hotplug), > > > > it can be "smart" and avoid accessing suspended devices. Deferring > > > > the frequency set up until the driver resume time should do the trick > > > > I suppose. > > > > > > I agree. The reason we don't see this generally on boot is because all > > > the CPUs are brought online before CPUfreq is initialised. While during > > > system suspend, we call cpufreq_online which in turn calls ->init in > > > the hotplug state machine. > > > > > > So as Rafael suggests we need to do some trick, but can it be done in > > > the core itself ? I may be missing something, but how about the patch > > > below: > > > > > > Regards, > > > Sudeep > > > > > > -- > > > diff --git i/drivers/cpufreq/cpufreq.c w/drivers/cpufreq/cpufreq.c > > > index e35a886e00bc..7d8b0b99f91d 100644 > > > --- i/drivers/cpufreq/cpufreq.c > > > +++ w/drivers/cpufreq/cpufreq.c > > > @@ -1241,7 +1241,8 @@ static int cpufreq_online(unsigned int cpu) > > > policy->max = policy->user_policy.max; > > > } > > > > > > - if (cpufreq_driver->get && !cpufreq_driver->setpolicy) { > > > + if (cpufreq_driver->get && !cpufreq_driver->setpolicy && > > > + !cpufreq_suspended) { > > > policy->cur = cpufreq_driver->get(policy->cpu); > > > if (!policy->cur) { > > > pr_err("%s: ->get() failed\n", __func__); > > > > It looks like we need to skip the "initial freq check" block below. > > > > Indeed, copy pasted an earlier version of diff. I found that I even > used a goto label wrong which I fixed along with the additional check > in "initial freq check" when I tried to compile :). > > > Also this doesn't really help the case when the driver ->init() messes > > up with things. > > > > Yes, in that case additional logic in the driver also needed. I am fine > if we enforce driver to deal with this issue, but was thinking if we can > make it generic. Also I was just trying to avoid adding _suspend/resume > to driver just to avoid this issue. I was wondering if cpufreq_offline()/online() could be invoked from cpufreq_suspend()/resume() for the nonboot CPUs - if the driver needs it (there could be a driver flag to indicate that). If they are made exit immediately when cpufreq_suspended is set (and the requisite driver flag is set too), that might work AFAICS.
On Fri, Feb 08, 2019 at 01:23:37PM +0100, Rafael J. Wysocki wrote: > On Fri, Feb 8, 2019 at 1:09 PM Sudeep Holla <sudeep.holla@arm.com> wrote: > > [...] > > Yes, in that case additional logic in the driver also needed. I am fine > > if we enforce driver to deal with this issue, but was thinking if we can > > make it generic. Also I was just trying to avoid adding _suspend/resume > > to driver just to avoid this issue. > > I was wondering if cpufreq_offline()/online() could be invoked from > cpufreq_suspend()/resume() for the nonboot CPUs - if the driver needs > it (there could be a driver flag to indicate that). > > If they are made exit immediately when cpufreq_suspended is set (and > the requisite driver flag is set too), that might work AFAICS. Yes that sounds feasible. It should be fine to assume it's safe to call cpufreq_online on a CPU even for CPU that might have failed to come online or didn't reached a state in CPUHP from where CPUFreq callback is executed or am I missing something ? -- Regards, Sudeep
On Fri, Feb 08, 2019 at 01:04:18PM +0100, Marek Szyprowski wrote: > Hi Sudeep, > > On 2019-02-08 12:51, Sudeep Holla wrote: > > On Fri, Feb 08, 2019 at 12:47:06PM +0100, Marek Szyprowski wrote: > >> On 2019-02-08 12:00, Sudeep Holla wrote: > >>> On Thu, Feb 07, 2019 at 01:22:25PM +0100, Marek Szyprowski wrote: > >>>> Dear All, > >>>> > >>>> This is a scenario that triggers the above issue: > >>> [...] > >>>> 1. system disables non-boot cpu's at the end of system suspend procedure, > >>>> 2. this in turn deinitializes cpufreq drivers for the disabled cpus, > >>>> 3. early in the system resume procedure all cpus are got back to online > >>>> state, > >>>> 4. this in turn causes cpufreq to be initialized for the newly onlined > >>>> cpus, > >>>> 5. cpufreq-dt acquires all its resources (clocks, regulators) during > >>>> ->init() callback, > >>> This is strictly not just restricted to cpufreq-dt, but to any driver > >>> supporting multiple policies. So we need a generic fix not just > >>> cpufreq-dt specific. > >> Could you point which other driver needs similar fix? Here in cpufreq-dt > >> the problem was caused by using regulator api (indirectly) from > >> ->init(). All other drivers, which have regulators support, are for old, > >> obsolete, uni-processor systems, which don't have the problem of > >> secondary cpu suspend during system suspend/resume cycle. > >> > > scmi_cpufreq for instance. We can fix that in driver my moving to polling > > to get cpufreq_get_rate, but we support both polling and interrupt based. > > We may wait for remote processor interrupt in get_rate. > > Frankly, I don't feel I know enough to touch this driver and I don't > think that this can even be fixed in a generic way in the cpufreq core. Based on Rafael's suggestion, I cooked up something. See if this helps ? The policy to cpu dance can be removed and we can just run through the online cpumask I think. Regards, Sudeep -->8 diff --git i/drivers/cpufreq/cpufreq.c w/drivers/cpufreq/cpufreq.c index e35a886e00bc..03d65a02a542 100644 --- i/drivers/cpufreq/cpufreq.c +++ w/drivers/cpufreq/cpufreq.c @@ -1640,6 +1640,7 @@ EXPORT_SYMBOL(cpufreq_generic_suspend); void cpufreq_suspend(void) { struct cpufreq_policy *policy; + int cpu; if (!cpufreq_driver) return; @@ -1662,6 +1663,11 @@ void cpufreq_suspend(void) } suspend: + if (cpufreq_driver->flags & CPUFREQ_DEFER_INIT_DURING_RESUME) + for_each_active_policy(policy) + for_each_cpu(cpu, policy->cpus) + cpufreq_offline(cpu); + cpufreq_suspended = true; } @@ -1674,7 +1680,7 @@ void cpufreq_suspend(void) void cpufreq_resume(void) { struct cpufreq_policy *policy; - int ret; + int ret, cpu; if (!cpufreq_driver) return; @@ -1682,6 +1688,11 @@ void cpufreq_resume(void) if (unlikely(!cpufreq_suspended)) return; + if (cpufreq_driver->flags & CPUFREQ_DEFER_INIT_DURING_RESUME) + for_each_active_policy(policy) + for_each_cpu(cpu, policy->cpus) + cpufreq_online(cpu); + cpufreq_suspended = false; if (!has_target() && !cpufreq_driver->resume) @@ -2444,14 +2455,16 @@ static enum cpuhp_state hp_online; static int cpuhp_cpufreq_online(unsigned int cpu) { - cpufreq_online(cpu); + if (!(cpufreq_driver->flags & CPUFREQ_DEFER_INIT_DURING_RESUME)) + cpufreq_online(cpu); return 0; } static int cpuhp_cpufreq_offline(unsigned int cpu) { - cpufreq_offline(cpu); + if (!(cpufreq_driver->flags & CPUFREQ_DEFER_INIT_DURING_RESUME)) + cpufreq_offline(cpu); return 0; } diff --git i/drivers/cpufreq/scmi-cpufreq.c w/drivers/cpufreq/scmi-cpufreq.c index 01871418ffde..0bfc96102739 100644 --- i/drivers/cpufreq/scmi-cpufreq.c +++ w/drivers/cpufreq/scmi-cpufreq.c @@ -203,7 +203,8 @@ static void scmi_cpufreq_ready(struct cpufreq_policy *policy) static struct cpufreq_driver scmi_cpufreq_driver = { .name = "scmi", .flags = CPUFREQ_STICKY | CPUFREQ_HAVE_GOVERNOR_PER_POLICY | - CPUFREQ_NEED_INITIAL_FREQ_CHECK, + CPUFREQ_NEED_INITIAL_FREQ_CHECK | + CPUFREQ_DEFER_INIT_DURING_RESUME, .verify = cpufreq_generic_frequency_table_verify, .attr = cpufreq_generic_attr, .target_index = scmi_cpufreq_set_target, diff --git i/include/linux/cpufreq.h w/include/linux/cpufreq.h index c86d6d8bdfed..9cf6b3ce063a 100644 --- i/include/linux/cpufreq.h +++ w/include/linux/cpufreq.h @@ -385,6 +385,12 @@ struct cpufreq_driver { */ #define CPUFREQ_NO_AUTO_DYNAMIC_SWITCHING (1 << 6) +/* + * Set by drivers to advance/defer the cpufreq online/offline operation during + * system suspend/resume. + */ +#define CPUFREQ_DEFER_INIT_DURING_RESUME (1 << 7) + int cpufreq_register_driver(struct cpufreq_driver *driver_data); int cpufreq_unregister_driver(struct cpufreq_driver *driver_data);
On 07-02-19, 13:22, Marek Szyprowski wrote: > Dear All, > > Recent commit 9ac6cb5fbb17 ("i2c: add suspended flag and accessors for > i2c adapters") added a visible warning for an attempt to do i2c transfer > over a suspended i2c bus. This revealed a long standing issue in the > cpufreq-dt driver, which gives a following warning during system > suspend/resume cycle: Marek, I have sent a patchset which is not directly related to the problem you are facing, but it may solve your problem as a side effect. Can you please see if that works to solve this issue as well ? https://lore.kernel.org/lkml/cover.1549874368.git.viresh.kumar@linaro.org/T/#u Thanks.
On 08-02-19, 17:41, Sudeep Holla wrote: > Based on Rafael's suggestion, I cooked up something. See if this helps ? > The policy to cpu dance can be removed and we can just run through the > online cpumask I think. > > Regards, > Sudeep > > -->8 > > diff --git i/drivers/cpufreq/cpufreq.c w/drivers/cpufreq/cpufreq.c > index e35a886e00bc..03d65a02a542 100644 > --- i/drivers/cpufreq/cpufreq.c > +++ w/drivers/cpufreq/cpufreq.c > @@ -1640,6 +1640,7 @@ EXPORT_SYMBOL(cpufreq_generic_suspend); > void cpufreq_suspend(void) > { > struct cpufreq_policy *policy; > + int cpu; > > if (!cpufreq_driver) > return; > @@ -1662,6 +1663,11 @@ void cpufreq_suspend(void) > } > > suspend: > + if (cpufreq_driver->flags & CPUFREQ_DEFER_INIT_DURING_RESUME) > + for_each_active_policy(policy) > + for_each_cpu(cpu, policy->cpus) > + cpufreq_offline(cpu); You will offline boot-cpu as well :) > + > cpufreq_suspended = true; > } > > @@ -1674,7 +1680,7 @@ void cpufreq_suspend(void) > void cpufreq_resume(void) > { > struct cpufreq_policy *policy; > - int ret; > + int ret, cpu; > > if (!cpufreq_driver) > return; > @@ -1682,6 +1688,11 @@ void cpufreq_resume(void) > if (unlikely(!cpufreq_suspended)) > return; > > + if (cpufreq_driver->flags & CPUFREQ_DEFER_INIT_DURING_RESUME) > + for_each_active_policy(policy) > + for_each_cpu(cpu, policy->cpus) > + cpufreq_online(cpu); > + > cpufreq_suspended = false; > > if (!has_target() && !cpufreq_driver->resume) > @@ -2444,14 +2455,16 @@ static enum cpuhp_state hp_online; > > static int cpuhp_cpufreq_online(unsigned int cpu) > { > - cpufreq_online(cpu); > + if (!(cpufreq_driver->flags & CPUFREQ_DEFER_INIT_DURING_RESUME)) > + cpufreq_online(cpu); This isn't correct as we can offline the CPUs without suspend as well and cpufreq_online/offline should always be called in such cases. Anyways, I have cc'd you on another series which may end up fixing this problem as well.
Hi Viresh, On 2019-02-11 09:44, Viresh Kumar wrote: > On 07-02-19, 13:22, Marek Szyprowski wrote: >> Dear All, >> >> Recent commit 9ac6cb5fbb17 ("i2c: add suspended flag and accessors for >> i2c adapters") added a visible warning for an attempt to do i2c transfer >> over a suspended i2c bus. This revealed a long standing issue in the >> cpufreq-dt driver, which gives a following warning during system >> suspend/resume cycle: > Marek, > > I have sent a patchset which is not directly related to the problem > you are facing, but it may solve your problem as a side effect. Can > you please see if that works to solve this issue as well ? > > https://lore.kernel.org/lkml/cover.1549874368.git.viresh.kumar@linaro.org/T/#u Thanks for the patch. It indeed solves the problem of the i2c transfer in cpu hotplug procedure during system resume, although my resources management rewrite is still valid as a way to fix remaining 'todos' in the cpufreq-dt driver. Best regards
On 11-02-19, 10:52, Marek Szyprowski wrote: > Hi Viresh, > > On 2019-02-11 09:44, Viresh Kumar wrote: > > On 07-02-19, 13:22, Marek Szyprowski wrote: > >> Dear All, > >> > >> Recent commit 9ac6cb5fbb17 ("i2c: add suspended flag and accessors for > >> i2c adapters") added a visible warning for an attempt to do i2c transfer > >> over a suspended i2c bus. This revealed a long standing issue in the > >> cpufreq-dt driver, which gives a following warning during system > >> suspend/resume cycle: > > Marek, > > > > I have sent a patchset which is not directly related to the problem > > you are facing, but it may solve your problem as a side effect. Can > > you please see if that works to solve this issue as well ? > > > > https://lore.kernel.org/lkml/cover.1549874368.git.viresh.kumar@linaro.org/T/#u > > Thanks for the patch. It indeed solves the problem of the i2c transfer > in cpu hotplug procedure during system resume, although my resources > management rewrite is still valid as a way to fix remaining 'todos' in > the cpufreq-dt driver. Which remaining todos are you talking about ? Sorry I am not able to recall :(
Hi Viresh, On 2019-02-11 10:55, Viresh Kumar wrote: > On 11-02-19, 10:52, Marek Szyprowski wrote: >> On 2019-02-11 09:44, Viresh Kumar wrote: >>> On 07-02-19, 13:22, Marek Szyprowski wrote: >>>> Dear All, >>>> >>>> Recent commit 9ac6cb5fbb17 ("i2c: add suspended flag and accessors for >>>> i2c adapters") added a visible warning for an attempt to do i2c transfer >>>> over a suspended i2c bus. This revealed a long standing issue in the >>>> cpufreq-dt driver, which gives a following warning during system >>>> suspend/resume cycle: >>> Marek, >>> >>> I have sent a patchset which is not directly related to the problem >>> you are facing, but it may solve your problem as a side effect. Can >>> you please see if that works to solve this issue as well ? >>> >>> https://lore.kernel.org/lkml/cover.1549874368.git.viresh.kumar@linaro.org/T/#u >> Thanks for the patch. It indeed solves the problem of the i2c transfer >> in cpu hotplug procedure during system resume, although my resources >> management rewrite is still valid as a way to fix remaining 'todos' in >> the cpufreq-dt driver. > Which remaining todos are you talking about ? Sorry I am not able to > recall :( https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/cpufreq/cpufreq-dt.c#n347 Best regards
On Mon, Feb 11, 2019 at 02:17:14PM +0530, Viresh Kumar wrote: > On 08-02-19, 17:41, Sudeep Holla wrote: > > Based on Rafael's suggestion, I cooked up something. See if this helps ? > > The policy to cpu dance can be removed and we can just run through the > > online cpumask I think. > > > > Regards, > > Sudeep > > > > -->8 > > > > diff --git i/drivers/cpufreq/cpufreq.c w/drivers/cpufreq/cpufreq.c > > index e35a886e00bc..03d65a02a542 100644 > > --- i/drivers/cpufreq/cpufreq.c > > +++ w/drivers/cpufreq/cpufreq.c > > @@ -1640,6 +1640,7 @@ EXPORT_SYMBOL(cpufreq_generic_suspend); > > void cpufreq_suspend(void) > > { > > struct cpufreq_policy *policy; > > + int cpu; > > > > if (!cpufreq_driver) > > return; > > @@ -1662,6 +1663,11 @@ void cpufreq_suspend(void) > > } > > > > suspend: > > + if (cpufreq_driver->flags & CPUFREQ_DEFER_INIT_DURING_RESUME) > > + for_each_active_policy(policy) > > + for_each_cpu(cpu, policy->cpus) > > + cpufreq_offline(cpu); > > You will offline boot-cpu as well :) > Indeed, I was just trying to check the idea of flags and clearly missed the boot cpu :( [..] > > @@ -2444,14 +2455,16 @@ static enum cpuhp_state hp_online; > > > > static int cpuhp_cpufreq_online(unsigned int cpu) > > { > > - cpufreq_online(cpu); > > + if (!(cpufreq_driver->flags & CPUFREQ_DEFER_INIT_DURING_RESUME)) > > + cpufreq_online(cpu); > > This isn't correct as we can offline the CPUs without suspend as well > and cpufreq_online/offline should always be called in such cases. > Understood > Anyways, I have cc'd you on another series which may end up fixing > this problem as well. Sure, will have a look. -- Regards, Sudeep
On 11-02-19, 13:22, Marek Szyprowski wrote: > Hi Viresh, > > On 2019-02-11 10:55, Viresh Kumar wrote: > > On 11-02-19, 10:52, Marek Szyprowski wrote: > >> On 2019-02-11 09:44, Viresh Kumar wrote: > >>> On 07-02-19, 13:22, Marek Szyprowski wrote: > >>>> Dear All, > >>>> > >>>> Recent commit 9ac6cb5fbb17 ("i2c: add suspended flag and accessors for > >>>> i2c adapters") added a visible warning for an attempt to do i2c transfer > >>>> over a suspended i2c bus. This revealed a long standing issue in the > >>>> cpufreq-dt driver, which gives a following warning during system > >>>> suspend/resume cycle: > >>> Marek, > >>> > >>> I have sent a patchset which is not directly related to the problem > >>> you are facing, but it may solve your problem as a side effect. Can > >>> you please see if that works to solve this issue as well ? > >>> > >>> https://lore.kernel.org/lkml/cover.1549874368.git.viresh.kumar@linaro.org/T/#u > >> Thanks for the patch. It indeed solves the problem of the i2c transfer > >> in cpu hotplug procedure during system resume, although my resources > >> management rewrite is still valid as a way to fix remaining 'todos' in > >> the cpufreq-dt driver. > > Which remaining todos are you talking about ? Sorry I am not able to > > recall :( > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/cpufreq/cpufreq-dt.c#n347 Ah, okay. Thanks for the pointer. Ideally we shouldn't be doing anything in probe, like resources_available(), but we are forced to do it as subsys_interface_register() doesn't return the errors returned from cpufreq_add_dev() currently. I tried to fix it once but there were some complications I believe and then left it. The main problem is that if cpufreq_online() doesn't find the required resources and returns -EPROBE_DEFER, it never comes back to the probe() routine which registers the cpufreq driver. And so we have to add the duplicate checks in probe() itself before it registers the cpufreq driver. Now all we need to do there is to guarantee that the resources are available when the cpufreq driver registers and so we do it only for CPU0 currently. Logically speaking, if the resources are available for CPU0, it will normally be available for any other CPU as well and so there never was a requirement to test it for other CPUs. And so I left a FIXME there, so that we know what's going on there in case a platform comes up for which it doesn't work. I won't fix it with something like what this patch series tried to do, rather I would try to make sure that EPROBE_DEFER gets returned from cpufreq_driver_register() instead. Thanks.