diff mbox series

clk: zynqmp: pll: Fix divider calculation to avoid out-of-range rate

Message ID 20220928201656.30318-1-laurent.pinchart@ideasonboard.com (mailing list archive)
State New, archived
Headers show
Series clk: zynqmp: pll: Fix divider calculation to avoid out-of-range rate | expand

Commit Message

Laurent Pinchart Sept. 28, 2022, 8:16 p.m. UTC
When calculating the divider in zynqmp_pll_round_rate() and
zynqmp_pll_set_rate(), the division rounding error may result in an
output frequency that is slightly outside of the PLL output range if the
requested range is close to the low or high limit. As a result, the
limits check in clk_calc_new_rates() would fail, and clk_set_rate()
would return an error, as seen in the zynqmp-dpsub driver:

[   13.672309] zynqmp-dpsub fd4a0000.display: failed to set pixel clock rate to 297000000 (-22)

Fix this by adjusting the divider. The rate calculation then succeeds
for zynqmp-dpsub:

[   13.554849] zynqmp-dpsub fd4a0000.display: requested pixel rate: 297000000 actual rate: 255555553

The resulting PLL configuration, however, is not optimal, as the rate
error is 14%. The hardware can do much better, but CCF doesn't attempt
to find the best overall configuration for cascaded clocks. That's a
candidate for a separate fix.

Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
---
 drivers/clk/zynqmp/pll.c | 8 ++++++++
 1 file changed, 8 insertions(+)


base-commit: 1c23f9e627a7b412978b4e852793c5e3c3efc555

Comments

Quanyang Wang Sept. 29, 2022, 1:05 a.m. UTC | #1
Hi Laurent,

I have sent a patch as below to fix this issue which set rate failed and 
it's in linux-next repo now.

https://lore.kernel.org/linux-arm-kernel/20220826142030.213805-1-quanyang.wang@windriver.com/T/


As for the frequency gap between the requested rate and the actual, it's 
because of the commit:

commit 948fb0969eae8
Author: Maxime Ripard <maxime@cerno.tech>
Date:   Fri Feb 25 15:35:26 2022 +0100

     clk: Always clamp the rounded rate

And I haven't figured out how to fix it.

Thanks,

Quanyang

On 9/29/22 04:16, Laurent Pinchart wrote:
> When calculating the divider in zynqmp_pll_round_rate() and
> zynqmp_pll_set_rate(), the division rounding error may result in an
> output frequency that is slightly outside of the PLL output range if the
> requested range is close to the low or high limit. As a result, the
> limits check in clk_calc_new_rates() would fail, and clk_set_rate()
> would return an error, as seen in the zynqmp-dpsub driver:
>
> [   13.672309] zynqmp-dpsub fd4a0000.display: failed to set pixel clock rate to 297000000 (-22)
>
> Fix this by adjusting the divider. The rate calculation then succeeds
> for zynqmp-dpsub:
>
> [   13.554849] zynqmp-dpsub fd4a0000.display: requested pixel rate: 297000000 actual rate: 255555553
>
> The resulting PLL configuration, however, is not optimal, as the rate
> error is 14%. The hardware can do much better, but CCF doesn't attempt
> to find the best overall configuration for cascaded clocks. That's a
> candidate for a separate fix.
>
> Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> ---
>   drivers/clk/zynqmp/pll.c | 8 ++++++++
>   1 file changed, 8 insertions(+)
>
> diff --git a/drivers/clk/zynqmp/pll.c b/drivers/clk/zynqmp/pll.c
> index 91a6b4cc910e..be6fa44a21e0 100644
> --- a/drivers/clk/zynqmp/pll.c
> +++ b/drivers/clk/zynqmp/pll.c
> @@ -120,6 +120,10 @@ static long zynqmp_pll_round_rate(struct clk_hw *hw, unsigned long rate,
>   	}
>   
>   	fbdiv = DIV_ROUND_CLOSEST(rate, *prate);
> +	if (*prate * fbdiv < PS_PLL_VCO_MIN)
> +		fbdiv++;
> +	if (*prate * fbdiv > PS_PLL_VCO_MAX)
> +		fbdiv--;
>   	fbdiv = clamp_t(u32, fbdiv, PLL_FBDIV_MIN, PLL_FBDIV_MAX);
>   	return *prate * fbdiv;
>   }
> @@ -208,6 +212,10 @@ static int zynqmp_pll_set_rate(struct clk_hw *hw, unsigned long rate,
>   	}
>   
>   	fbdiv = DIV_ROUND_CLOSEST(rate, parent_rate);
> +	if (parent_rate * fbdiv < PS_PLL_VCO_MIN)
> +		fbdiv++;
> +	else if (parent_rate * fbdiv > PS_PLL_VCO_MAX)
> +		fbdiv--;
>   	fbdiv = clamp_t(u32, fbdiv, PLL_FBDIV_MIN, PLL_FBDIV_MAX);
>   	ret = zynqmp_pm_clock_setdivider(clk_id, fbdiv);
>   	if (ret)
>
> base-commit: 1c23f9e627a7b412978b4e852793c5e3c3efc555
Stephen Boyd Oct. 1, 2022, 12:05 a.m. UTC | #2
+Maxime

Quoting Quanyang Wang (2022-09-28 18:05:10)
> Hi Laurent,
> 
> I have sent a patch as below to fix this issue which set rate failed and 
> it's in linux-next repo now.
> 
> https://lore.kernel.org/linux-arm-kernel/20220826142030.213805-1-quanyang.wang@windriver.com/T/
> 
> 
> As for the frequency gap between the requested rate and the actual, it's 
> because of the commit:
> 
> commit 948fb0969eae8
> Author: Maxime Ripard <maxime@cerno.tech>
> Date:   Fri Feb 25 15:35:26 2022 +0100
> 
>      clk: Always clamp the rounded rate
> 
> And I haven't figured out how to fix it.
> 

Maxime has some more patches to fix this and they're in linux-next.
Maybe those fix this problem?
Maxime Ripard Oct. 1, 2022, 10:40 a.m. UTC | #3
Hi

On Fri, Sep 30, 2022 at 05:05:01PM -0700, Stephen Boyd wrote:
> +Maxime
> 
> Quoting Quanyang Wang (2022-09-28 18:05:10)
> > Hi Laurent,
> > 
> > I have sent a patch as below to fix this issue which set rate failed and 
> > it's in linux-next repo now.
> > 
> > https://lore.kernel.org/linux-arm-kernel/20220826142030.213805-1-quanyang.wang@windriver.com/T/
> > 

It looks to me that the fundamental issue is that, in some situations,
the round_rate implementation can return a rate outside of the
boundaries enforced on a clock.

I think that's the current behaviour (that was there prior to my
patches) to reject any rate outside of the boundaries in
clk_calc_new_rates() makes it clear that it's not something we should
allow.

I'm a bit two-minded on this though. All the failures of that test I've
seen actually turned out to be bugs, so I guess it's useful, but it's
also true that for rounding errors it's a bit overkill. We could also
relax that check and warn instead of failing.

> > As for the frequency gap between the requested rate and the actual, it's 
> > because of the commit:
> > 
> > commit 948fb0969eae8
> > Author: Maxime Ripard <maxime@cerno.tech>
> > Date:   Fri Feb 25 15:35:26 2022 +0100
> > 
> >      clk: Always clamp the rounded rate
> > 
> > And I haven't figured out how to fix it.

Again, it boils down on whether or not we should allow a rate outside of
boundaries. If we don't and if the clock can't do better, then yeah, the
rate difference is fairly big but we can't do better.

> Maxime has some more patches to fix this and they're in linux-next.
> Maybe those fix this problem?

I don't think they will fix it. However, depending on the outcome of
that discussion I can send more fixes your way :)

Maxime
Quanyang Wang Oct. 2, 2022, 2:17 a.m. UTC | #4
Hi Maxime,

On 10/1/22 18:40, Maxime Ripard wrote:
> Hi
> 
> On Fri, Sep 30, 2022 at 05:05:01PM -0700, Stephen Boyd wrote:
>> +Maxime
>>
>> Quoting Quanyang Wang (2022-09-28 18:05:10)
>>> Hi Laurent,
>>>
>>> I have sent a patch as below to fix this issue which set rate failed and
>>> it's in linux-next repo now.
>>>
>>> https://lore.kernel.org/linux-arm-kernel/20220826142030.213805-1-quanyang.wang@windriver.com/T/
>>>
> 
> It looks to me that the fundamental issue is that, in some situations,
> the round_rate implementation can return a rate outside of the
> boundaries enforced on a clock.
In my limited view, the round_rate callbacks should return a rate within 
boundaries as output, but can take a rate outside of boundaries as input.

Take Xilinx Zynqmp for instance, VPLL's rate range is 1.5GHz~3GHz. A 
consumer dp_video_ref wants a 200MHz rate, its request walks upward 
through multiplexers and dividers then reaches to VPLL, VPLL receives 
this 200MHz request and call  zynqmp_pll_round_rate to "round" this 
out-of-range rate 200MHz to 1600MHz via multiplying by 8. 
zynqmp_pll_round_rate returns 1600MHz and clk subsystem will call 
determine callbacks to configure dividers correctly to make sure that 
dp_video_ref can get an exact rate 200MHz.

But the commit 948fb0969eae8 ("clk: Always clamp the rounded rate") adds

req->rate = clamp(req->rate, req->min_rate, req->max_rate);

before

rate = core->ops->round_rate(core->hw, req->rate,&req->best_parent_rate);

This results that .round_rate callbacks lose functionality since they 
have no chance to pick up a precise rate but only a boundary rate.
Still for Xilinx Zynqmp, the 200MHz rate request to PLL will be set to 
1500MHz by clamp function and then zynqmp_pll_round_rate does nothing, 
dp_video_ref will finally receive a rate which is 1500MHz/8 = 187.5MHz.
The rate gap (200MHz-187.5MHz) happens.

There is no doubt that round_rate should return a valid rate as output. 
But is it necessary that forces the input of round_rate callbacks to be 
within boundaries?

Thanks,
Quanyang
> 
> I think that's the current behaviour (that was there prior to my
> patches) to reject any rate outside of the boundaries in
> clk_calc_new_rates() makes it clear that it's not something we should
> allow.
> 
> I'm a bit two-minded on this though. All the failures of that test I've
> seen actually turned out to be bugs, so I guess it's useful, but it's
> also true that for rounding errors it's a bit overkill. We could also
> relax that check and warn instead of failing.
> 
>>> As for the frequency gap between the requested rate and the actual, it's
>>> because of the commit:
>>>
>>> commit 948fb0969eae8
>>> Author: Maxime Ripard <maxime@cerno.tech>
>>> Date:   Fri Feb 25 15:35:26 2022 +0100
>>>
>>>       clk: Always clamp the rounded rate
>>>
>>> And I haven't figured out how to fix it.
> 
> Again, it boils down on whether or not we should allow a rate outside of
> boundaries. If we don't and if the clock can't do better, then yeah, the
> rate difference is fairly big but we can't do better.
> 
>> Maxime has some more patches to fix this and they're in linux-next.
>> Maybe those fix this problem?
> 
> I don't think they will fix it. However, depending on the outcome of
> that discussion I can send more fixes your way :)
> 
> Maxime
Laurent Pinchart Oct. 2, 2022, 11:45 p.m. UTC | #5
Hi Quanyang,

On Thu, Sep 29, 2022 at 09:05:10AM +0800, Quanyang Wang wrote:
> Hi Laurent,
> 
> I have sent a patch as below to fix this issue which set rate failed and 
> it's in linux-next repo now.
> 
> https://lore.kernel.org/linux-arm-kernel/20220826142030.213805-1-quanyang.wang@windriver.com/T/

I should have searched the mailing list better before sending a patch,
sorry.

I've tested your patch, and it fixes the problem too. The resulting
pixel frequency is even more off though:

[   66.741024] zynqmp-dpsub fd4a0000.display: requested pixel rate: 297000000 actual rate: 249999998

But that's a separate issue.

> As for the frequency gap between the requested rate and the actual, it's 
> because of the commit:
> 
> commit 948fb0969eae8
> Author: Maxime Ripard <maxime@cerno.tech>
> Date:   Fri Feb 25 15:35:26 2022 +0100
> 
>      clk: Always clamp the rounded rate
> 
> And I haven't figured out how to fix it.
> 
> Thanks,
> 
> Quanyang
> 
> On 9/29/22 04:16, Laurent Pinchart wrote:
> > When calculating the divider in zynqmp_pll_round_rate() and
> > zynqmp_pll_set_rate(), the division rounding error may result in an
> > output frequency that is slightly outside of the PLL output range if the
> > requested range is close to the low or high limit. As a result, the
> > limits check in clk_calc_new_rates() would fail, and clk_set_rate()
> > would return an error, as seen in the zynqmp-dpsub driver:
> >
> > [   13.672309] zynqmp-dpsub fd4a0000.display: failed to set pixel clock rate to 297000000 (-22)
> >
> > Fix this by adjusting the divider. The rate calculation then succeeds
> > for zynqmp-dpsub:
> >
> > [   13.554849] zynqmp-dpsub fd4a0000.display: requested pixel rate: 297000000 actual rate: 255555553
> >
> > The resulting PLL configuration, however, is not optimal, as the rate
> > error is 14%. The hardware can do much better, but CCF doesn't attempt
> > to find the best overall configuration for cascaded clocks. That's a
> > candidate for a separate fix.
> >
> > Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> > ---
> >   drivers/clk/zynqmp/pll.c | 8 ++++++++
> >   1 file changed, 8 insertions(+)
> >
> > diff --git a/drivers/clk/zynqmp/pll.c b/drivers/clk/zynqmp/pll.c
> > index 91a6b4cc910e..be6fa44a21e0 100644
> > --- a/drivers/clk/zynqmp/pll.c
> > +++ b/drivers/clk/zynqmp/pll.c
> > @@ -120,6 +120,10 @@ static long zynqmp_pll_round_rate(struct clk_hw *hw, unsigned long rate,
> >   	}
> >   
> >   	fbdiv = DIV_ROUND_CLOSEST(rate, *prate);
> > +	if (*prate * fbdiv < PS_PLL_VCO_MIN)
> > +		fbdiv++;
> > +	if (*prate * fbdiv > PS_PLL_VCO_MAX)
> > +		fbdiv--;
> >   	fbdiv = clamp_t(u32, fbdiv, PLL_FBDIV_MIN, PLL_FBDIV_MAX);
> >   	return *prate * fbdiv;
> >   }
> > @@ -208,6 +212,10 @@ static int zynqmp_pll_set_rate(struct clk_hw *hw, unsigned long rate,
> >   	}
> >   
> >   	fbdiv = DIV_ROUND_CLOSEST(rate, parent_rate);
> > +	if (parent_rate * fbdiv < PS_PLL_VCO_MIN)
> > +		fbdiv++;
> > +	else if (parent_rate * fbdiv > PS_PLL_VCO_MAX)
> > +		fbdiv--;
> >   	fbdiv = clamp_t(u32, fbdiv, PLL_FBDIV_MIN, PLL_FBDIV_MAX);
> >   	ret = zynqmp_pm_clock_setdivider(clk_id, fbdiv);
> >   	if (ret)
> >
> > base-commit: 1c23f9e627a7b412978b4e852793c5e3c3efc555
Laurent Pinchart Oct. 3, 2022, 12:06 a.m. UTC | #6
Hello,

On Sun, Oct 02, 2022 at 10:17:24AM +0800, Quanyang Wang wrote:
> On 10/1/22 18:40, Maxime Ripard wrote:
> > On Fri, Sep 30, 2022 at 05:05:01PM -0700, Stephen Boyd wrote:
> >> +Maxime
> >>
> >> Quoting Quanyang Wang (2022-09-28 18:05:10)
> >>> Hi Laurent,
> >>>
> >>> I have sent a patch as below to fix this issue which set rate failed and
> >>> it's in linux-next repo now.
> >>>
> >>> https://lore.kernel.org/linux-arm-kernel/20220826142030.213805-1-quanyang.wang@windriver.com/T/
> >>>
> > 
> > It looks to me that the fundamental issue is that, in some situations,
> > the round_rate implementation can return a rate outside of the
> > boundaries enforced on a clock.
>
> In my limited view, the round_rate callbacks should return a rate within 
> boundaries as output, but can take a rate outside of boundaries as input.
> 
> Take Xilinx Zynqmp for instance, VPLL's rate range is 1.5GHz~3GHz. A 
> consumer dp_video_ref wants a 200MHz rate, its request walks upward 
> through multiplexers and dividers then reaches to VPLL, VPLL receives 
> this 200MHz request and call  zynqmp_pll_round_rate to "round" this 
> out-of-range rate 200MHz to 1600MHz via multiplying by 8. 

I don't think that's the right way to look at it. Making the VPLL driver
multiply the requested 200MHz rate by 8 assumes that it knows that the
next block in the clock tree is a divider capable of dividing by 8.
That's not something the VPLL driver should know about. The next block
may be able to divide by 10 and up only, which means that the VPLL
should then be configured to output 2000 MHz.

The problem that needs to be solved is to find a global optimal
configuration for cascaded PLLs and dividers. 

> zynqmp_pll_round_rate returns 1600MHz and clk subsystem will call 
> determine callbacks to configure dividers correctly to make sure that 
> dp_video_ref can get an exact rate 200MHz.
> 
> But the commit 948fb0969eae8 ("clk: Always clamp the rounded rate") adds
> 
> req->rate = clamp(req->rate, req->min_rate, req->max_rate);
> 
> before
> 
> rate = core->ops->round_rate(core->hw, req->rate,&req->best_parent_rate);
> 
> This results that .round_rate callbacks lose functionality since they 
> have no chance to pick up a precise rate but only a boundary rate.
> Still for Xilinx Zynqmp, the 200MHz rate request to PLL will be set to 
> 1500MHz by clamp function and then zynqmp_pll_round_rate does nothing, 
> dp_video_ref will finally receive a rate which is 1500MHz/8 = 187.5MHz.
> The rate gap (200MHz-187.5MHz) happens.
> 
> There is no doubt that round_rate should return a valid rate as output. 
> But is it necessary that forces the input of round_rate callbacks to be 
> within boundaries?
> 
> > I think that's the current behaviour (that was there prior to my
> > patches) to reject any rate outside of the boundaries in
> > clk_calc_new_rates() makes it clear that it's not something we should
> > allow.
> > 
> > I'm a bit two-minded on this though. All the failures of that test I've
> > seen actually turned out to be bugs, so I guess it's useful, but it's
> > also true that for rounding errors it's a bit overkill. We could also
> > relax that check and warn instead of failing.
> > 
> >>> As for the frequency gap between the requested rate and the actual, it's
> >>> because of the commit:
> >>>
> >>> commit 948fb0969eae8
> >>> Author: Maxime Ripard <maxime@cerno.tech>
> >>> Date:   Fri Feb 25 15:35:26 2022 +0100
> >>>
> >>>       clk: Always clamp the rounded rate
> >>>
> >>> And I haven't figured out how to fix it.
> > 
> > Again, it boils down on whether or not we should allow a rate outside of
> > boundaries. If we don't and if the clock can't do better, then yeah, the
> > rate difference is fairly big but we can't do better.
> > 
> >> Maxime has some more patches to fix this and they're in linux-next.
> >> Maybe those fix this problem?
> > 
> > I don't think they will fix it. However, depending on the outcome of
> > that discussion I can send more fixes your way :)
> > 
> > Maxime
Maxime Ripard Oct. 10, 2022, 8:49 a.m. UTC | #7
Hi,

On Sun, Oct 02, 2022 at 10:17:24AM +0800, Quanyang Wang wrote:
> On 10/1/22 18:40, Maxime Ripard wrote:
> > Hi
> > 
> > On Fri, Sep 30, 2022 at 05:05:01PM -0700, Stephen Boyd wrote:
> > > +Maxime
> > > 
> > > Quoting Quanyang Wang (2022-09-28 18:05:10)
> > > > Hi Laurent,
> > > > 
> > > > I have sent a patch as below to fix this issue which set rate failed and
> > > > it's in linux-next repo now.
> > > > 
> > > > https://lore.kernel.org/linux-arm-kernel/20220826142030.213805-1-quanyang.wang@windriver.com/T/
> > > > 
> > 
> > It looks to me that the fundamental issue is that, in some situations,
> > the round_rate implementation can return a rate outside of the
> > boundaries enforced on a clock.
>
> In my limited view, the round_rate callbacks should return a rate within
> boundaries as output,

I guess it would be s/should/must/, but yeah, we agree.

> but can take a rate outside of boundaries as input.

I'm not sure what that would change though?

> Take Xilinx Zynqmp for instance, VPLL's rate range is 1.5GHz~3GHz. A
> consumer dp_video_ref wants a 200MHz rate, its request walks upward through
> multiplexers and dividers then reaches to VPLL, VPLL receives this 200MHz
> request and call  zynqmp_pll_round_rate to "round" this out-of-range rate
> 200MHz to 1600MHz via multiplying by 8. zynqmp_pll_round_rate returns
> 1600MHz and clk subsystem will call determine callbacks to configure
> dividers correctly to make sure that dp_video_ref can get an exact rate
> 200MHz.

Sounds good to me indeed.

> But the commit 948fb0969eae8 ("clk: Always clamp the rounded rate") adds
> 
> req->rate = clamp(req->rate, req->min_rate, req->max_rate);
> 
> before
> 
> rate = core->ops->round_rate(core->hw, req->rate,&req->best_parent_rate);
> 
> This results that .round_rate callbacks lose functionality since they have
> no chance to pick up a precise rate but only a boundary rate.
> Still for Xilinx Zynqmp, the 200MHz rate request to PLL will be set to
> 1500MHz by clamp function and then zynqmp_pll_round_rate does nothing,

I'm a bit confused now.

If I understand your clock topology, you have a PLL, and then a divider
of 8, and want the final clock to be running at 200MHz?

If so, the divider should call its parent round/determine_rate with 200
* 8 MHz = 1600MHz, which is is still inside the boundaries of 1.5-3.0GHz
and won't be affected?

Why should the child be affected by the parent boundaries, or the other
way around?

Maxime
Quanyang Wang Oct. 10, 2022, 12:12 p.m. UTC | #8
Hi Maxime,

On 10/10/22 16:49, Maxime Ripard wrote:
> Hi,
> 
> On Sun, Oct 02, 2022 at 10:17:24AM +0800, Quanyang Wang wrote:
>> On 10/1/22 18:40, Maxime Ripard wrote:
>>> Hi
>>>
>>> On Fri, Sep 30, 2022 at 05:05:01PM -0700, Stephen Boyd wrote:
>>>> +Maxime
>>>>
>>>> Quoting Quanyang Wang (2022-09-28 18:05:10)
>>>>> Hi Laurent,
>>>>>
>>>>> I have sent a patch as below to fix this issue which set rate failed and
>>>>> it's in linux-next repo now.
>>>>>
>>>>> https://lore.kernel.org/linux-arm-kernel/20220826142030.213805-1-quanyang.wang@windriver.com/T/
>>>>>
>>>
>>> It looks to me that the fundamental issue is that, in some situations,
>>> the round_rate implementation can return a rate outside of the
>>> boundaries enforced on a clock.
>>
>> In my limited view, the round_rate callbacks should return a rate within
>> boundaries as output,
> 
> I guess it would be s/should/must/, but yeah, we agree.
> 
>> but can take a rate outside of boundaries as input.
> 
> I'm not sure what that would change though?
> 
>> Take Xilinx Zynqmp for instance, VPLL's rate range is 1.5GHz~3GHz. A
>> consumer dp_video_ref wants a 200MHz rate, its request walks upward through
>> multiplexers and dividers then reaches to VPLL, VPLL receives this 200MHz
>> request and call  zynqmp_pll_round_rate to "round" this out-of-range rate
>> 200MHz to 1600MHz via multiplying by 8. zynqmp_pll_round_rate returns
>> 1600MHz and clk subsystem will call determine callbacks to configure
>> dividers correctly to make sure that dp_video_ref can get an exact rate
>> 200MHz.
> 
> Sounds good to me indeed.
> 
>> But the commit 948fb0969eae8 ("clk: Always clamp the rounded rate") adds
>>
>> req->rate = clamp(req->rate, req->min_rate, req->max_rate);
>>
>> before
>>
>> rate = core->ops->round_rate(core->hw, req->rate,&req->best_parent_rate);
>>
>> This results that .round_rate callbacks lose functionality since they have
>> no chance to pick up a precise rate but only a boundary rate.
>> Still for Xilinx Zynqmp, the 200MHz rate request to PLL will be set to
>> 1500MHz by clamp function and then zynqmp_pll_round_rate does nothing,
> 
> I'm a bit confused now.
> 
> If I understand your clock topology, you have a PLL, and then a divider
> of 8, and want the final clock to be running at 200MHz?
> 
> If so, the divider should call its parent round/determine_rate with 200
> * 8 MHz = 1600MHz, which is is still inside the boundaries of 1.5-3.0GHz
> and won't be affected?
> 
> Why should the child be affected by the parent boundaries, or the other
> way aroundSorry, I didn't explain the problem clearly.

As below is the vpll clk topology in /sys/kernel/debug/clk/clk_summary 
when reverted "clk: Always clamp the rounded rate".
  clk_name				MHz
  pss_ref_clk                          33333333
     vpll_post_src                     33333333
     vpll_pre_src                      33333333
        vpll_int                       1599999984
           vpll_half                   799999992
              vpll_int_mux             799999992
                 vpll                  799999992
                    dp_video_ref_mux    799999992
                       dp_video_ref_div1    99999999
                          dp_video_ref_div2  99999999
                             dp_video_ref    99999999

When call clk_set_rate(dp_video_ref, 100MHz), there is a 
clk_calc_new_rates request passed from bottom (dp_video_ref) to top 
(vpll_int), every clk will calculate its clk_rate and its 
best_parent_rate. vpll_half will calculate its clk rate is 100MHz and 
its parent clk vpll_int should be 200MHz since vpll_half is a half 
divider. But vpll_int ranges from 1.5GHz~3GHz, so it call 
zynqmp_pll_round_rate to calculate a new rate 1.6GHz for itself via 
200MHz * 8 because 8 is the smallest integer:

zynqmp_pll_round_rate:
        if (200MHz < 1.5GHz) {
                mult = DIV_ROUND_UP(1.5GHz, 200MHz);
                rate = rate * mult;

The question is that implementation of round_rate callbacks for some clk 
drivers have abilities to round a out-of-range rate req to a 
inside-range rate:

clk_core_determine_round_nolock:
         rate(1.6GHz) = core->ops->round_rate("vpll_int", req->rate(200MHz),
                           &req->best_parent_rate);

But the commit 948fb096 ("clk: Always clamp the rounded rate") breaks 
this like:
clk_core_determine_round_nolock:
	req->rate (1.5GHz) = clamp(req->rate (200MHz), req->min_rate (1.5GHz), 
req->max_rate (3GHz));
	......
	rate(1.5GHz) = core->ops->round_rate("vpll_int", req->rate(1.5GHz),
                           &req->best_parent_rate);


Thanks,
Quanyang
> Maxime
Maxime Ripard Oct. 10, 2022, 12:49 p.m. UTC | #9
On Mon, Oct 10, 2022 at 08:12:08PM +0800, Quanyang Wang wrote:
> Hi Maxime,
> 
> On 10/10/22 16:49, Maxime Ripard wrote:
> > Hi,
> > 
> > On Sun, Oct 02, 2022 at 10:17:24AM +0800, Quanyang Wang wrote:
> > > On 10/1/22 18:40, Maxime Ripard wrote:
> > > > Hi
> > > > 
> > > > On Fri, Sep 30, 2022 at 05:05:01PM -0700, Stephen Boyd wrote:
> > > > > +Maxime
> > > > > 
> > > > > Quoting Quanyang Wang (2022-09-28 18:05:10)
> > > > > > Hi Laurent,
> > > > > > 
> > > > > > I have sent a patch as below to fix this issue which set rate failed and
> > > > > > it's in linux-next repo now.
> > > > > > 
> > > > > > https://lore.kernel.org/linux-arm-kernel/20220826142030.213805-1-quanyang.wang@windriver.com/T/
> > > > > > 
> > > > 
> > > > It looks to me that the fundamental issue is that, in some situations,
> > > > the round_rate implementation can return a rate outside of the
> > > > boundaries enforced on a clock.
> > > 
> > > In my limited view, the round_rate callbacks should return a rate within
> > > boundaries as output,
> > 
> > I guess it would be s/should/must/, but yeah, we agree.
> > 
> > > but can take a rate outside of boundaries as input.
> > 
> > I'm not sure what that would change though?
> > 
> > > Take Xilinx Zynqmp for instance, VPLL's rate range is 1.5GHz~3GHz. A
> > > consumer dp_video_ref wants a 200MHz rate, its request walks upward through
> > > multiplexers and dividers then reaches to VPLL, VPLL receives this 200MHz
> > > request and call  zynqmp_pll_round_rate to "round" this out-of-range rate
> > > 200MHz to 1600MHz via multiplying by 8. zynqmp_pll_round_rate returns
> > > 1600MHz and clk subsystem will call determine callbacks to configure
> > > dividers correctly to make sure that dp_video_ref can get an exact rate
> > > 200MHz.
> > 
> > Sounds good to me indeed.
> > 
> > > But the commit 948fb0969eae8 ("clk: Always clamp the rounded rate") adds
> > > 
> > > req->rate = clamp(req->rate, req->min_rate, req->max_rate);
> > > 
> > > before
> > > 
> > > rate = core->ops->round_rate(core->hw, req->rate,&req->best_parent_rate);
> > > 
> > > This results that .round_rate callbacks lose functionality since they have
> > > no chance to pick up a precise rate but only a boundary rate.
> > > Still for Xilinx Zynqmp, the 200MHz rate request to PLL will be set to
> > > 1500MHz by clamp function and then zynqmp_pll_round_rate does nothing,
> > 
> > I'm a bit confused now.
> > 
> > If I understand your clock topology, you have a PLL, and then a divider
> > of 8, and want the final clock to be running at 200MHz?
> > 
> > If so, the divider should call its parent round/determine_rate with 200
> > * 8 MHz = 1600MHz, which is is still inside the boundaries of 1.5-3.0GHz
> > and won't be affected?
> > 
> > Why should the child be affected by the parent boundaries, or the other
> > way around
>
> Sorry, I didn't explain the problem clearly.
> 
> As below is the vpll clk topology in /sys/kernel/debug/clk/clk_summary when
> reverted "clk: Always clamp the rounded rate".
>  clk_name				MHz
>  pss_ref_clk                          33333333
>     vpll_post_src                     33333333
>     vpll_pre_src                      33333333
>        vpll_int                       1599999984
>           vpll_half                   799999992
>              vpll_int_mux             799999992
>                 vpll                  799999992
>                    dp_video_ref_mux    799999992
>                       dp_video_ref_div1    99999999
>                          dp_video_ref_div2  99999999
>                             dp_video_ref    99999999

I couldn't find any of these clocks by grepping in the kernel code, are
they upstream?

> When call clk_set_rate(dp_video_ref, 100MHz), there is a clk_calc_new_rates
> request passed from bottom (dp_video_ref) to top (vpll_int), every clk will
> calculate its clk_rate and its best_parent_rate. vpll_half will calculate
> its clk rate is 100MHz and its parent clk vpll_int should be 200MHz since
> vpll_half is a half divider. But vpll_int ranges from 1.5GHz~3GHz

Still, I'm not entirely sure what's going on. If the only divider we
have is vpll_half which halves the rate, and we want 100MHz on
dp_video_ref, then vpll_int should provide 200MHz? Why would we increase
it to 1.6GHz? I get that the range of operating frequencies for vpll_int
is 1.5-3GHz, but I don't understand how we could end up with 100MHz on
dp_video_ref with 1.6GHz for that PLL. Or the other way around, why we
want that * 8 in the first place for vpll_int.

Maxime
Quanyang Wang Oct. 11, 2022, 3:11 a.m. UTC | #10
Hi Maxime,

On 10/10/22 20:49, Maxime Ripard wrote:
> On Mon, Oct 10, 2022 at 08:12:08PM +0800, Quanyang Wang wrote:
>> Hi Maxime,
>>
>> On 10/10/22 16:49, Maxime Ripard wrote:
>>> Hi,
>>>
>>> On Sun, Oct 02, 2022 at 10:17:24AM +0800, Quanyang Wang wrote:
>>>> On 10/1/22 18:40, Maxime Ripard wrote:
>>>>> Hi
>>>>>
>>>>> On Fri, Sep 30, 2022 at 05:05:01PM -0700, Stephen Boyd wrote:
>>>>>> +Maxime
>>>>>>
>>>>>> Quoting Quanyang Wang (2022-09-28 18:05:10)
>>>>>>> Hi Laurent,
>>>>>>>
>>>>>>> I have sent a patch as below to fix this issue which set rate failed and
>>>>>>> it's in linux-next repo now.
>>>>>>>
>>>>>>> https://lore.kernel.org/linux-arm-kernel/20220826142030.213805-1-quanyang.wang@windriver.com/T/
>>>>>>>
>>>>>
>>>>> It looks to me that the fundamental issue is that, in some situations,
>>>>> the round_rate implementation can return a rate outside of the
>>>>> boundaries enforced on a clock.
>>>>
>>>> In my limited view, the round_rate callbacks should return a rate within
>>>> boundaries as output,
>>>
>>> I guess it would be s/should/must/, but yeah, we agree.
>>>
>>>> but can take a rate outside of boundaries as input.
>>>
>>> I'm not sure what that would change though?
>>>
>>>> Take Xilinx Zynqmp for instance, VPLL's rate range is 1.5GHz~3GHz. A
>>>> consumer dp_video_ref wants a 200MHz rate, its request walks upward through
>>>> multiplexers and dividers then reaches to VPLL, VPLL receives this 200MHz
>>>> request and call  zynqmp_pll_round_rate to "round" this out-of-range rate
>>>> 200MHz to 1600MHz via multiplying by 8. zynqmp_pll_round_rate returns
>>>> 1600MHz and clk subsystem will call determine callbacks to configure
>>>> dividers correctly to make sure that dp_video_ref can get an exact rate
>>>> 200MHz.
>>>
>>> Sounds good to me indeed.
>>>
>>>> But the commit 948fb0969eae8 ("clk: Always clamp the rounded rate") adds
>>>>
>>>> req->rate = clamp(req->rate, req->min_rate, req->max_rate);
>>>>
>>>> before
>>>>
>>>> rate = core->ops->round_rate(core->hw, req->rate,&req->best_parent_rate);
>>>>
>>>> This results that .round_rate callbacks lose functionality since they have
>>>> no chance to pick up a precise rate but only a boundary rate.
>>>> Still for Xilinx Zynqmp, the 200MHz rate request to PLL will be set to
>>>> 1500MHz by clamp function and then zynqmp_pll_round_rate does nothing,
>>>
>>> I'm a bit confused now.
>>>
>>> If I understand your clock topology, you have a PLL, and then a divider
>>> of 8, and want the final clock to be running at 200MHz?
>>>
>>> If so, the divider should call its parent round/determine_rate with 200
>>> * 8 MHz = 1600MHz, which is is still inside the boundaries of 1.5-3.0GHz
>>> and won't be affected?
>>>
>>> Why should the child be affected by the parent boundaries, or the other
>>> way around
>>
>> Sorry, I didn't explain the problem clearly.
>>
>> As below is the vpll clk topology in /sys/kernel/debug/clk/clk_summary when
>> reverted "clk: Always clamp the rounded rate".
>>   clk_name				MHz
>>   pss_ref_clk                          33333333
>>      vpll_post_src                     33333333
>>      vpll_pre_src                      33333333
>>         vpll_int                       1599999984
>>            vpll_half                   799999992
>>               vpll_int_mux             799999992
>>                  vpll                  799999992
>>                     dp_video_ref_mux    799999992
>>                        dp_video_ref_div1    99999999
>>                           dp_video_ref_div2  99999999
>>                              dp_video_ref    99999999
> 
> I couldn't find any of these clocks by grepping in the kernel code, are
> they upstream?
> 
>> When call clk_set_rate(dp_video_ref, 100MHz), there is a clk_calc_new_rates
>> request passed from bottom (dp_video_ref) to top (vpll_int), every clk will
>> calculate its clk_rate and its best_parent_rate. vpll_half will calculate
>> its clk rate is 100MHz and its parent clk vpll_int should be 200MHz since
>> vpll_half is a half divider. But vpll_int ranges from 1.5GHz~3GHz
> 
> Still, I'm not entirely sure what's going on. If the only divider we
> have is vpll_half which halves the rate, and we want 100MHz on
> dp_video_ref, then vpll_int should provide 200MHz? Why would we increase
> it to 1.6GHz? I get that the range of operating frequencies for vpll_int
> is 1.5-3GHz, but I don't understand how we could end up with 100MHz on
> dp_video_ref with 1.6GHz for that PLL. Or the other way around, why we
> want that * 8 in the first place for vpll_int.
Oh, I think I see what's wrong. It's because the children clocks of 
vpll_int have the wrong rate range. The commit 948fb0969eae8 makes sense 
and my understanding was wrong.

The clk vpll_int sets the rate range by the function:
	clk_hw_set_rate_range(hw, 1.5GHz, 3.0GHz);
But the function clk_hw_set_rate_range just set the rate range for 
vpll_int, not cascade the limitation for its children clks. This results 
that the its children clks still ranges 0~ULONG_MAX. When a 100MHz rate 
request is raised from the bottom "dp_video_ref", every clk in the tree 
doesn't modify it and just pass it to its parent clk since 100MHz is in 
its rate range 0~ULONG_MAX. Then vpll_int receive a 200MHz rate request 
from vpll_half which will set itself to be 100MHz. But in fact, 
vpll_half rate range should be 1.5GHz/2 ~ 3GHz/2. The same to the clk 
dp_video_ref_mux. And the divider dp_video_ref_div1 should range from 
1.5GHz/2/63 ~ 3GHz/2/1.

In my limited view, there is a relation between clocks that is 
muxes/dividers should inherit the limitation from its parent clk. But 
now it seems that the rate ranges are isolating between clks. Is there a 
way that can re-set the rate range for children clks when its parent clk 
re-set the rate range?

Thanks,
Quanyang
> 
> Maxime
Maxime Ripard Oct. 11, 2022, 12:27 p.m. UTC | #11
Hi,

On Tue, Oct 11, 2022 at 11:11:31AM +0800, Quanyang Wang wrote:
> On 10/10/22 20:49, Maxime Ripard wrote:
> > On Mon, Oct 10, 2022 at 08:12:08PM +0800, Quanyang Wang wrote:
> > > Hi Maxime,
> > > 
> > > On 10/10/22 16:49, Maxime Ripard wrote:
> > > > Hi,
> > > > 
> > > > On Sun, Oct 02, 2022 at 10:17:24AM +0800, Quanyang Wang wrote:
> > > > > On 10/1/22 18:40, Maxime Ripard wrote:
> > > > > > Hi
> > > > > > 
> > > > > > On Fri, Sep 30, 2022 at 05:05:01PM -0700, Stephen Boyd wrote:
> > > > > > > +Maxime
> > > > > > > 
> > > > > > > Quoting Quanyang Wang (2022-09-28 18:05:10)
> > > > > > > > Hi Laurent,
> > > > > > > > 
> > > > > > > > I have sent a patch as below to fix this issue which set rate failed and
> > > > > > > > it's in linux-next repo now.
> > > > > > > > 
> > > > > > > > https://lore.kernel.org/linux-arm-kernel/20220826142030.213805-1-quanyang.wang@windriver.com/T/
> > > > > > > > 
> > > > > > 
> > > > > > It looks to me that the fundamental issue is that, in some situations,
> > > > > > the round_rate implementation can return a rate outside of the
> > > > > > boundaries enforced on a clock.
> > > > > 
> > > > > In my limited view, the round_rate callbacks should return a rate within
> > > > > boundaries as output,
> > > > 
> > > > I guess it would be s/should/must/, but yeah, we agree.
> > > > 
> > > > > but can take a rate outside of boundaries as input.
> > > > 
> > > > I'm not sure what that would change though?
> > > > 
> > > > > Take Xilinx Zynqmp for instance, VPLL's rate range is 1.5GHz~3GHz. A
> > > > > consumer dp_video_ref wants a 200MHz rate, its request walks upward through
> > > > > multiplexers and dividers then reaches to VPLL, VPLL receives this 200MHz
> > > > > request and call  zynqmp_pll_round_rate to "round" this out-of-range rate
> > > > > 200MHz to 1600MHz via multiplying by 8. zynqmp_pll_round_rate returns
> > > > > 1600MHz and clk subsystem will call determine callbacks to configure
> > > > > dividers correctly to make sure that dp_video_ref can get an exact rate
> > > > > 200MHz.
> > > > 
> > > > Sounds good to me indeed.
> > > > 
> > > > > But the commit 948fb0969eae8 ("clk: Always clamp the rounded rate") adds
> > > > > 
> > > > > req->rate = clamp(req->rate, req->min_rate, req->max_rate);
> > > > > 
> > > > > before
> > > > > 
> > > > > rate = core->ops->round_rate(core->hw, req->rate,&req->best_parent_rate);
> > > > > 
> > > > > This results that .round_rate callbacks lose functionality since they have
> > > > > no chance to pick up a precise rate but only a boundary rate.
> > > > > Still for Xilinx Zynqmp, the 200MHz rate request to PLL will be set to
> > > > > 1500MHz by clamp function and then zynqmp_pll_round_rate does nothing,
> > > > 
> > > > I'm a bit confused now.
> > > > 
> > > > If I understand your clock topology, you have a PLL, and then a divider
> > > > of 8, and want the final clock to be running at 200MHz?
> > > > 
> > > > If so, the divider should call its parent round/determine_rate with 200
> > > > * 8 MHz = 1600MHz, which is is still inside the boundaries of 1.5-3.0GHz
> > > > and won't be affected?
> > > > 
> > > > Why should the child be affected by the parent boundaries, or the other
> > > > way around
> > > 
> > > Sorry, I didn't explain the problem clearly.
> > > 
> > > As below is the vpll clk topology in /sys/kernel/debug/clk/clk_summary when
> > > reverted "clk: Always clamp the rounded rate".
> > >   clk_name				MHz
> > >   pss_ref_clk                          33333333
> > >      vpll_post_src                     33333333
> > >      vpll_pre_src                      33333333
> > >         vpll_int                       1599999984
> > >            vpll_half                   799999992
> > >               vpll_int_mux             799999992
> > >                  vpll                  799999992
> > >                     dp_video_ref_mux    799999992
> > >                        dp_video_ref_div1    99999999
> > >                           dp_video_ref_div2  99999999
> > >                              dp_video_ref    99999999
> > 
> > I couldn't find any of these clocks by grepping in the kernel code, are
> > they upstream?
> > 
> > > When call clk_set_rate(dp_video_ref, 100MHz), there is a clk_calc_new_rates
> > > request passed from bottom (dp_video_ref) to top (vpll_int), every clk will
> > > calculate its clk_rate and its best_parent_rate. vpll_half will calculate
> > > its clk rate is 100MHz and its parent clk vpll_int should be 200MHz since
> > > vpll_half is a half divider. But vpll_int ranges from 1.5GHz~3GHz
> > 
> > Still, I'm not entirely sure what's going on. If the only divider we
> > have is vpll_half which halves the rate, and we want 100MHz on
> > dp_video_ref, then vpll_int should provide 200MHz? Why would we increase
> > it to 1.6GHz? I get that the range of operating frequencies for vpll_int
> > is 1.5-3GHz, but I don't understand how we could end up with 100MHz on
> > dp_video_ref with 1.6GHz for that PLL. Or the other way around, why we
> > want that * 8 in the first place for vpll_int.
>
> Oh, I think I see what's wrong. It's because the children clocks of vpll_int
> have the wrong rate range. The commit 948fb0969eae8 makes sense and my
> understanding was wrong.
> 
> The clk vpll_int sets the rate range by the function:
> 	clk_hw_set_rate_range(hw, 1.5GHz, 3.0GHz);
> But the function clk_hw_set_rate_range just set the rate range for vpll_int,
> not cascade the limitation for its children clks. This results that the its
> children clks still ranges 0~ULONG_MAX.

Yeah, indeed the range isn't propagated to the children yet. This wasn't
the case before my series either, so I wouldn't consider it a
regression.

> When a 100MHz rate request is raised from the bottom "dp_video_ref",
> every clk in the tree doesn't modify it and just pass it to its parent
> clk since 100MHz is in its rate range 0~ULONG_MAX.

From a current-state point of view, that makes sense.

> Then vpll_int receive a 200MHz rate request from vpll_half which will
> set itself to be 100MHz.

Still makes sense :)

> But in fact, vpll_half rate range should be 1.5GHz/2 ~ 3GHz/2. The
> same to the clk dp_video_ref_mux. And the divider dp_video_ref_div1
> should range from 1.5GHz/2/63 ~ 3GHz/2/1.

I still don't get where exactly the 8 multiplier is enforced? Is it in
vpll_int? If so, then why is it reporting a rate of 1.5-3.0 GHz, whereas
its child clock that is supposed to halve the rate has 100MHz?

> In my limited view, there is a relation between clocks that is
> muxes/dividers should inherit the limitation from its parent clk. But now it
> seems that the rate ranges are isolating between clks. Is there a way that
> can re-set the rate range for children clks when its parent clk re-set the
> rate range?

We can't really do that in the framework, because we can't really know
the capabilities of the clocks at a generic level using any of the
clocks operations. Indeed some clocks won't affect the rate ("pure"
gate, muxes), and some will.

Figuring out the rate range for those that don't affect it is trivial,
but for those who do you can't know how it affects the rate. Plus, PLL
will typically have a narrower range of operation than the extremum you
can achieve by using the larger divider and smaller multiplier for
example.

So I think you can't really expect an obvious relationship between the
parent range and the child range. So it should be reported by the
driver, like you did, by calling clk_hw_set_rate_range().

Maxime
diff mbox series

Patch

diff --git a/drivers/clk/zynqmp/pll.c b/drivers/clk/zynqmp/pll.c
index 91a6b4cc910e..be6fa44a21e0 100644
--- a/drivers/clk/zynqmp/pll.c
+++ b/drivers/clk/zynqmp/pll.c
@@ -120,6 +120,10 @@  static long zynqmp_pll_round_rate(struct clk_hw *hw, unsigned long rate,
 	}
 
 	fbdiv = DIV_ROUND_CLOSEST(rate, *prate);
+	if (*prate * fbdiv < PS_PLL_VCO_MIN)
+		fbdiv++;
+	if (*prate * fbdiv > PS_PLL_VCO_MAX)
+		fbdiv--;
 	fbdiv = clamp_t(u32, fbdiv, PLL_FBDIV_MIN, PLL_FBDIV_MAX);
 	return *prate * fbdiv;
 }
@@ -208,6 +212,10 @@  static int zynqmp_pll_set_rate(struct clk_hw *hw, unsigned long rate,
 	}
 
 	fbdiv = DIV_ROUND_CLOSEST(rate, parent_rate);
+	if (parent_rate * fbdiv < PS_PLL_VCO_MIN)
+		fbdiv++;
+	else if (parent_rate * fbdiv > PS_PLL_VCO_MAX)
+		fbdiv--;
 	fbdiv = clamp_t(u32, fbdiv, PLL_FBDIV_MIN, PLL_FBDIV_MAX);
 	ret = zynqmp_pm_clock_setdivider(clk_id, fbdiv);
 	if (ret)