Message ID | 20200804061210.5415-1-m.szyprowski@samsung.com (mailing list archive) |
---|---|
State | Rejected |
Headers | show |
Series | memory: samsung: exynos5422-dmc: propagate error from exynos5_counters_get() | expand |
On Tue, Aug 04, 2020 at 08:12:10AM +0200, Marek Szyprowski wrote: > exynos5_counters_get() might fail with -EPROBE_DEFER if the driver for > devfreq event counter is not yet probed. Propagate that error value to > the caller to ensure that the exynos5422-dmc driver will be probed again > when devfreq event contuner is available. > > This fixes boot hang if both exynos5422-dmc and exynos-ppmu drivers are > compiled as modules. > > Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> > --- > drivers/memory/samsung/exynos5422-dmc.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) Thanks, looks good. I'll apply it to fixes after merge window. Best regards, Krzysztof
Hi Marek, On 8/4/20 7:12 AM, Marek Szyprowski wrote: > exynos5_counters_get() might fail with -EPROBE_DEFER if the driver for > devfreq event counter is not yet probed. Propagate that error value to > the caller to ensure that the exynos5422-dmc driver will be probed again > when devfreq event contuner is available. > > This fixes boot hang if both exynos5422-dmc and exynos-ppmu drivers are > compiled as modules. > > Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> > --- > drivers/memory/samsung/exynos5422-dmc.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/memory/samsung/exynos5422-dmc.c b/drivers/memory/samsung/exynos5422-dmc.c > index b9c7956e5031..639811a3eecb 100644 > --- a/drivers/memory/samsung/exynos5422-dmc.c > +++ b/drivers/memory/samsung/exynos5422-dmc.c > @@ -914,7 +914,7 @@ static int exynos5_dmc_get_status(struct device *dev, > } else { > ret = exynos5_counters_get(dmc, &load, &total); > if (ret < 0) > - return -EINVAL; > + return ret; > > /* To protect from overflow, divide by 1024 */ > stat->busy_time = load >> 10; > Thank you for the patch, LGTM. Some questions are still there, though. The function exynos5_performance_counters_init() should capture that the counters couldn't be enabled or set. So the functions: exynos5_counters_enable_edev() and exynos5_counters_set_event() must pass gently because devfreq device is registered. Then devfreq checks device status, and reaches the state when counters 'get' function returns that they are not ready... If that is a kind of 2-stage initialization, maybe we should add another 'check' in the exynos5_performance_counters_init() and call the devfreq_event_get_event() to make sure that we are ready to go, otherwise return ret from that function (which is probably EPROBE_DEFER) and not register the devfreq device. Marek do want to submit such patch or I should bake it and submit on top of this patch? Could you tell me how I can reproduce this? Do you simply load one module after another (exynos-ppmu than exynos5422-dmc) or in parallel? Regards, Lukasz
Hi Lukasz, On 04.08.2020 11:11, Lukasz Luba wrote: > Hi Marek, > > On 8/4/20 7:12 AM, Marek Szyprowski wrote: >> exynos5_counters_get() might fail with -EPROBE_DEFER if the driver for >> devfreq event counter is not yet probed. Propagate that error value to >> the caller to ensure that the exynos5422-dmc driver will be probed again >> when devfreq event contuner is available. >> >> This fixes boot hang if both exynos5422-dmc and exynos-ppmu drivers are >> compiled as modules. >> >> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> >> --- >> drivers/memory/samsung/exynos5422-dmc.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/drivers/memory/samsung/exynos5422-dmc.c >> b/drivers/memory/samsung/exynos5422-dmc.c >> index b9c7956e5031..639811a3eecb 100644 >> --- a/drivers/memory/samsung/exynos5422-dmc.c >> +++ b/drivers/memory/samsung/exynos5422-dmc.c >> @@ -914,7 +914,7 @@ static int exynos5_dmc_get_status(struct device >> *dev, >> } else { >> ret = exynos5_counters_get(dmc, &load, &total); >> if (ret < 0) >> - return -EINVAL; >> + return ret; >> /* To protect from overflow, divide by 1024 */ >> stat->busy_time = load >> 10; >> > > Thank you for the patch, LGTM. > Some questions are still there, though. The function > exynos5_performance_counters_init() should capture that the counters > couldn't be enabled or set. So the functions: > exynos5_counters_enable_edev() and exynos5_counters_set_event() > must pass gently because devfreq device is registered. > Then devfreq checks device status, and reaches the state when > counters 'get' function returns that they are not ready... > > If that is a kind of 2-stage initialization, maybe we should add > another 'check' in the exynos5_performance_counters_init() and call > the devfreq_event_get_event() to make sure that we are ready to go, > otherwise return ret from that function (which is probably EPROBE_DEFER) > and not register the devfreq device. I've finally investigated this further and it turned out that the issue is elsewhere. The $subject patch can be discarded, as it doesn't fix anything. The -EPROBE_DEFER is properly returned by exynos5_performance_counters_init(), which redirects exynos5_dmc_probe() to remove_clocks label. This causes disabling mout_bpll/fout_bpll clocks what in turn *sometimes* causes boot hang. This random behavior mislead me that the $subject patch fixes the issue, but then longer tests revealed that it didn't change anything. It looks that the proper fix would be to keep fout_bpll enabled all the time. > > Marek do want to submit such patch or I should bake it and submit on top > of this patch? > > Could you tell me how I can reproduce this? Do you simply load one > module after another (exynos-ppmu than exynos5422-dmc) or in parallel? I've just boot zImage built from multi_v7_defconfig with modules installed. Modules are automatically loaded by udev during boot. Best regards
On 8/4/20 1:19 PM, Marek Szyprowski wrote: > Hi Lukasz, > > On 04.08.2020 11:11, Lukasz Luba wrote: >> Hi Marek, >> >> On 8/4/20 7:12 AM, Marek Szyprowski wrote: >>> exynos5_counters_get() might fail with -EPROBE_DEFER if the driver for >>> devfreq event counter is not yet probed. Propagate that error value to >>> the caller to ensure that the exynos5422-dmc driver will be probed again >>> when devfreq event contuner is available. >>> >>> This fixes boot hang if both exynos5422-dmc and exynos-ppmu drivers are >>> compiled as modules. >>> >>> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> >>> --- >>> drivers/memory/samsung/exynos5422-dmc.c | 2 +- >>> 1 file changed, 1 insertion(+), 1 deletion(-) >>> >>> diff --git a/drivers/memory/samsung/exynos5422-dmc.c >>> b/drivers/memory/samsung/exynos5422-dmc.c >>> index b9c7956e5031..639811a3eecb 100644 >>> --- a/drivers/memory/samsung/exynos5422-dmc.c >>> +++ b/drivers/memory/samsung/exynos5422-dmc.c >>> @@ -914,7 +914,7 @@ static int exynos5_dmc_get_status(struct device >>> *dev, >>> } else { >>> ret = exynos5_counters_get(dmc, &load, &total); >>> if (ret < 0) >>> - return -EINVAL; >>> + return ret; >>> /* To protect from overflow, divide by 1024 */ >>> stat->busy_time = load >> 10; >>> >> >> Thank you for the patch, LGTM. >> Some questions are still there, though. The function >> exynos5_performance_counters_init() should capture that the counters >> couldn't be enabled or set. So the functions: >> exynos5_counters_enable_edev() and exynos5_counters_set_event() >> must pass gently because devfreq device is registered. >> Then devfreq checks device status, and reaches the state when >> counters 'get' function returns that they are not ready... >> >> If that is a kind of 2-stage initialization, maybe we should add >> another 'check' in the exynos5_performance_counters_init() and call >> the devfreq_event_get_event() to make sure that we are ready to go, >> otherwise return ret from that function (which is probably EPROBE_DEFER) >> and not register the devfreq device. > > I've finally investigated this further and it turned out that the issue > is elsewhere. The $subject patch can be discarded, as it doesn't fix > anything. The -EPROBE_DEFER is properly returned by > exynos5_performance_counters_init(), which redirects exynos5_dmc_probe() > to remove_clocks label. This causes disabling mout_bpll/fout_bpll clocks > what in turn *sometimes* causes boot hang. This random behavior mislead > me that the $subject patch fixes the issue, but then longer tests > revealed that it didn't change anything. Really good investigation, great work Marek! > > It looks that the proper fix would be to keep fout_bpll enabled all the > time. Yes, I agree. I am looking for your next patch to test it then. > >> >> Marek do want to submit such patch or I should bake it and submit on top >> of this patch? >> >> Could you tell me how I can reproduce this? Do you simply load one >> module after another (exynos-ppmu than exynos5422-dmc) or in parallel? > > I've just boot zImage built from multi_v7_defconfig with modules > installed. Modules are automatically loaded by udev during boot. Thank you sharing this test procedure. Regards, Lukasz
On Tue, Aug 04, 2020 at 01:38:11PM +0100, Lukasz Luba wrote: > > > On 8/4/20 1:19 PM, Marek Szyprowski wrote: > > Hi Lukasz, > > > > On 04.08.2020 11:11, Lukasz Luba wrote: > > > Hi Marek, > > > > > > On 8/4/20 7:12 AM, Marek Szyprowski wrote: > > > > exynos5_counters_get() might fail with -EPROBE_DEFER if the driver for > > > > devfreq event counter is not yet probed. Propagate that error value to > > > > the caller to ensure that the exynos5422-dmc driver will be probed again > > > > when devfreq event contuner is available. > > > > > > > > This fixes boot hang if both exynos5422-dmc and exynos-ppmu drivers are > > > > compiled as modules. > > > > > > > > Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> > > > > --- > > > > drivers/memory/samsung/exynos5422-dmc.c | 2 +- > > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > > > diff --git a/drivers/memory/samsung/exynos5422-dmc.c > > > > b/drivers/memory/samsung/exynos5422-dmc.c > > > > index b9c7956e5031..639811a3eecb 100644 > > > > --- a/drivers/memory/samsung/exynos5422-dmc.c > > > > +++ b/drivers/memory/samsung/exynos5422-dmc.c > > > > @@ -914,7 +914,7 @@ static int exynos5_dmc_get_status(struct device > > > > *dev, > > > > } else { > > > > ret = exynos5_counters_get(dmc, &load, &total); > > > > if (ret < 0) > > > > - return -EINVAL; > > > > + return ret; > > > > /* To protect from overflow, divide by 1024 */ > > > > stat->busy_time = load >> 10; > > > > > > > > > > Thank you for the patch, LGTM. > > > Some questions are still there, though. The function > > > exynos5_performance_counters_init() should capture that the counters > > > couldn't be enabled or set. So the functions: > > > exynos5_counters_enable_edev() and exynos5_counters_set_event() > > > must pass gently because devfreq device is registered. > > > Then devfreq checks device status, and reaches the state when > > > counters 'get' function returns that they are not ready... > > > > > > If that is a kind of 2-stage initialization, maybe we should add > > > another 'check' in the exynos5_performance_counters_init() and call > > > the devfreq_event_get_event() to make sure that we are ready to go, > > > otherwise return ret from that function (which is probably EPROBE_DEFER) > > > and not register the devfreq device. > > > > I've finally investigated this further and it turned out that the issue > > is elsewhere. The $subject patch can be discarded, as it doesn't fix > > anything. The -EPROBE_DEFER is properly returned by > > exynos5_performance_counters_init(), which redirects exynos5_dmc_probe() > > to remove_clocks label. This causes disabling mout_bpll/fout_bpll clocks > > what in turn *sometimes* causes boot hang. This random behavior mislead > > me that the $subject patch fixes the issue, but then longer tests > > revealed that it didn't change anything. > > Really good investigation, great work Marek! > > > > > It looks that the proper fix would be to keep fout_bpll enabled all the > > time. > > Yes, I agree. I am looking for your next patch to test it then. Hi all, Is the patch still useful then? Shall I apply it? Best regards, Krzysztof
On 8/17/20 1:07 PM, Krzysztof Kozlowski wrote: > On Tue, Aug 04, 2020 at 01:38:11PM +0100, Lukasz Luba wrote: >> >> >> On 8/4/20 1:19 PM, Marek Szyprowski wrote: >>> Hi Lukasz, >>> >>> On 04.08.2020 11:11, Lukasz Luba wrote: >>>> Hi Marek, >>>> >>>> On 8/4/20 7:12 AM, Marek Szyprowski wrote: >>>>> exynos5_counters_get() might fail with -EPROBE_DEFER if the driver for >>>>> devfreq event counter is not yet probed. Propagate that error value to >>>>> the caller to ensure that the exynos5422-dmc driver will be probed again >>>>> when devfreq event contuner is available. >>>>> >>>>> This fixes boot hang if both exynos5422-dmc and exynos-ppmu drivers are >>>>> compiled as modules. >>>>> >>>>> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> >>>>> --- >>>>> drivers/memory/samsung/exynos5422-dmc.c | 2 +- >>>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>>> >>>>> diff --git a/drivers/memory/samsung/exynos5422-dmc.c >>>>> b/drivers/memory/samsung/exynos5422-dmc.c >>>>> index b9c7956e5031..639811a3eecb 100644 >>>>> --- a/drivers/memory/samsung/exynos5422-dmc.c >>>>> +++ b/drivers/memory/samsung/exynos5422-dmc.c >>>>> @@ -914,7 +914,7 @@ static int exynos5_dmc_get_status(struct device >>>>> *dev, >>>>> } else { >>>>> ret = exynos5_counters_get(dmc, &load, &total); >>>>> if (ret < 0) >>>>> - return -EINVAL; >>>>> + return ret; >>>>> /* To protect from overflow, divide by 1024 */ >>>>> stat->busy_time = load >> 10; >>>>> >>>> >>>> Thank you for the patch, LGTM. >>>> Some questions are still there, though. The function >>>> exynos5_performance_counters_init() should capture that the counters >>>> couldn't be enabled or set. So the functions: >>>> exynos5_counters_enable_edev() and exynos5_counters_set_event() >>>> must pass gently because devfreq device is registered. >>>> Then devfreq checks device status, and reaches the state when >>>> counters 'get' function returns that they are not ready... >>>> >>>> If that is a kind of 2-stage initialization, maybe we should add >>>> another 'check' in the exynos5_performance_counters_init() and call >>>> the devfreq_event_get_event() to make sure that we are ready to go, >>>> otherwise return ret from that function (which is probably EPROBE_DEFER) >>>> and not register the devfreq device. >>> >>> I've finally investigated this further and it turned out that the issue >>> is elsewhere. The $subject patch can be discarded, as it doesn't fix >>> anything. The -EPROBE_DEFER is properly returned by >>> exynos5_performance_counters_init(), which redirects exynos5_dmc_probe() >>> to remove_clocks label. This causes disabling mout_bpll/fout_bpll clocks >>> what in turn *sometimes* causes boot hang. This random behavior mislead >>> me that the $subject patch fixes the issue, but then longer tests >>> revealed that it didn't change anything. >> >> Really good investigation, great work Marek! >> >>> >>> It looks that the proper fix would be to keep fout_bpll enabled all the >>> time. >> >> Yes, I agree. I am looking for your next patch to test it then. > > Hi all, > > Is the patch still useful then? Shall I apply it? Marek has created different patch for it, which fixes the clock: https://lore.kernel.org/linux-clk/20200807133143.22748-1-m.szyprowski@samsung.com/ So you don't have to apply this one IMO. Regards, Lukasz
Hi, On 17.08.2020 14:27, Lukasz Luba wrote: > On 8/17/20 1:07 PM, Krzysztof Kozlowski wrote: >> On Tue, Aug 04, 2020 at 01:38:11PM +0100, Lukasz Luba wrote: >>> On 8/4/20 1:19 PM, Marek Szyprowski wrote: >>>> On 04.08.2020 11:11, Lukasz Luba wrote: >>>>> On 8/4/20 7:12 AM, Marek Szyprowski wrote: >>>>>> exynos5_counters_get() might fail with -EPROBE_DEFER if the >>>>>> driver for >>>>>> devfreq event counter is not yet probed. Propagate that error >>>>>> value to >>>>>> the caller to ensure that the exynos5422-dmc driver will be >>>>>> probed again >>>>>> when devfreq event contuner is available. >>>>>> >>>>>> This fixes boot hang if both exynos5422-dmc and exynos-ppmu >>>>>> drivers are >>>>>> compiled as modules. >>>>>> >>>>>> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> >>>>>> --- >>>>>> drivers/memory/samsung/exynos5422-dmc.c | 2 +- >>>>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>>>> >>>>>> diff --git a/drivers/memory/samsung/exynos5422-dmc.c >>>>>> b/drivers/memory/samsung/exynos5422-dmc.c >>>>>> index b9c7956e5031..639811a3eecb 100644 >>>>>> --- a/drivers/memory/samsung/exynos5422-dmc.c >>>>>> +++ b/drivers/memory/samsung/exynos5422-dmc.c >>>>>> @@ -914,7 +914,7 @@ static int exynos5_dmc_get_status(struct device >>>>>> *dev, >>>>>> } else { >>>>>> ret = exynos5_counters_get(dmc, &load, &total); >>>>>> if (ret < 0) >>>>>> - return -EINVAL; >>>>>> + return ret; >>>>>> /* To protect from overflow, divide by 1024 */ >>>>>> stat->busy_time = load >> 10; >>>>>> >>>>> >>>>> Thank you for the patch, LGTM. >>>>> Some questions are still there, though. The function >>>>> exynos5_performance_counters_init() should capture that the counters >>>>> couldn't be enabled or set. So the functions: >>>>> exynos5_counters_enable_edev() and exynos5_counters_set_event() >>>>> must pass gently because devfreq device is registered. >>>>> Then devfreq checks device status, and reaches the state when >>>>> counters 'get' function returns that they are not ready... >>>>> >>>>> If that is a kind of 2-stage initialization, maybe we should add >>>>> another 'check' in the exynos5_performance_counters_init() and call >>>>> the devfreq_event_get_event() to make sure that we are ready to go, >>>>> otherwise return ret from that function (which is probably >>>>> EPROBE_DEFER) >>>>> and not register the devfreq device. >>>> >>>> I've finally investigated this further and it turned out that the >>>> issue >>>> is elsewhere. The $subject patch can be discarded, as it doesn't fix >>>> anything. The -EPROBE_DEFER is properly returned by >>>> exynos5_performance_counters_init(), which redirects >>>> exynos5_dmc_probe() >>>> to remove_clocks label. This causes disabling mout_bpll/fout_bpll >>>> clocks >>>> what in turn *sometimes* causes boot hang. This random behavior >>>> mislead >>>> me that the $subject patch fixes the issue, but then longer tests >>>> revealed that it didn't change anything. >>> >>> Really good investigation, great work Marek! >>> >>>> >>>> It looks that the proper fix would be to keep fout_bpll enabled all >>>> the >>>> time. >>> >>> Yes, I agree. I am looking for your next patch to test it then. >> >> Hi all, >> >> Is the patch still useful then? Shall I apply it? > > Marek has created different patch for it, which fixes the clock: > https://lore.kernel.org/linux-clk/20200807133143.22748-1-m.szyprowski@samsung.com/ > > > So you don't have to apply this one IMO. Indeed, you can drop this one. Best regards
diff --git a/drivers/memory/samsung/exynos5422-dmc.c b/drivers/memory/samsung/exynos5422-dmc.c index b9c7956e5031..639811a3eecb 100644 --- a/drivers/memory/samsung/exynos5422-dmc.c +++ b/drivers/memory/samsung/exynos5422-dmc.c @@ -914,7 +914,7 @@ static int exynos5_dmc_get_status(struct device *dev, } else { ret = exynos5_counters_get(dmc, &load, &total); if (ret < 0) - return -EINVAL; + return ret; /* To protect from overflow, divide by 1024 */ stat->busy_time = load >> 10;
exynos5_counters_get() might fail with -EPROBE_DEFER if the driver for devfreq event counter is not yet probed. Propagate that error value to the caller to ensure that the exynos5422-dmc driver will be probed again when devfreq event contuner is available. This fixes boot hang if both exynos5422-dmc and exynos-ppmu drivers are compiled as modules. Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> --- drivers/memory/samsung/exynos5422-dmc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)