Message ID | 1654021066-13341-1-git-send-email-quic_vnivarth@quicinc.com (mailing list archive) |
---|---|
State | Not Applicable, archived |
Headers | show |
Series | tty: serial: qcom-geni-serial: minor fixes to get_clk_div_rate() | expand |
On Tue, May 31, 2022 at 11:47:46PM +0530, Vijaya Krishna Nivarthi wrote: > Add missing initialisation and correct type casting > > Signed-off-by: Vijaya Krishna Nivarthi <quic_vnivarth@quicinc.com> > --- > drivers/tty/serial/qcom_geni_serial.c | 8 ++++---- > 1 file changed, 4 insertions(+), 4 deletions(-) Hi, This is the friendly patch-bot of Greg Kroah-Hartman. You have sent him a patch that has triggered this response. He used to manually respond to these common problems, but in order to save his sanity (he kept writing the same thing over and over, yet to different people), I was created. Hopefully you will not take offence and will fix the problem in your patch and resubmit it so that it can be accepted into the Linux kernel tree. You are receiving this message because of the following common error(s) as indicated below: - You did not specify a description of why the patch is needed, or possibly, any description at all, in the email body. Please read the section entitled "The canonical patch format" in the kernel file, Documentation/SubmittingPatches for what is needed in order to properly describe the change. - You did not write a descriptive Subject: for the patch, allowing Greg, and everyone else, to know what this patch is all about. Please read the section entitled "The canonical patch format" in the kernel file, Documentation/SubmittingPatches for what a proper Subject: line should look like. If you wish to discuss this problem further, or you have questions about how to resolve this issue, please feel free to respond to this email and Greg will reply once he has dug out from the pending patches received from other developers. thanks, greg k-h's patch email bot
Hi, On Tue, May 31, 2022 at 11:18 AM Vijaya Krishna Nivarthi <quic_vnivarth@quicinc.com> wrote: > > Add missing initialisation and correct type casting > > Signed-off-by: Vijaya Krishna Nivarthi <quic_vnivarth@quicinc.com> > --- > drivers/tty/serial/qcom_geni_serial.c | 8 ++++---- > 1 file changed, 4 insertions(+), 4 deletions(-) > > diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c > index 4733a23..08f3ad4 100644 > --- a/drivers/tty/serial/qcom_geni_serial.c > +++ b/drivers/tty/serial/qcom_geni_serial.c > @@ -943,11 +943,11 @@ static int qcom_geni_serial_startup(struct uart_port *uport) > static unsigned long get_clk_div_rate(struct clk *clk, unsigned int baud, > unsigned int sampling_rate, unsigned int *clk_div) > { > - unsigned long ser_clk; > + unsigned long ser_clk = 0; In this patch it's not at all obvious why you'd need to init to 0. I think the "for loop" is guaranteed to run at least once because "max_div" is known at compile time. ...and currently each time through the "for" loop you'll always set "ser_clk". I think in a future patch you'll want to _remove_ this from the for loop: if (!prev) ser_clk = freq; ...and _that's_ when you should init "ser_clk" to 0. Until then I'd leave it as uninitialized... Honestly, I'd throw all the fixes into one series, too. > unsigned long desired_clk; > unsigned long freq, prev; > unsigned long div, maxdiv; > - int64_t mult; > + unsigned long long mult; > > desired_clk = baud * sampling_rate; > if (!desired_clk) { > @@ -959,8 +959,8 @@ static unsigned long get_clk_div_rate(struct clk *clk, unsigned int baud, > prev = 0; > > for (div = 1; div <= maxdiv; div++) { > - mult = div * desired_clk; > - if (mult > ULONG_MAX) > + mult = (unsigned long long)div * (unsigned long long)desired_clk; I think you only need to cast one of the two. The other will be up-cast automatically. > + if (mult > (unsigned long long)ULONG_MAX) I don't think you need this cast. As far as I know the C language will "upcast" to the larger of the two types. -Doug
Hi, On 6/1/2022 12:58 AM, Doug Anderson wrote: > Hi, > > On Tue, May 31, 2022 at 11:18 AM Vijaya Krishna Nivarthi > <quic_vnivarth@quicinc.com> wrote: >> Add missing initialisation and correct type casting >> >> Signed-off-by: Vijaya Krishna Nivarthi <quic_vnivarth@quicinc.com> >> --- >> drivers/tty/serial/qcom_geni_serial.c | 8 ++++---- >> 1 file changed, 4 insertions(+), 4 deletions(-) >> >> diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c >> index 4733a23..08f3ad4 100644 >> --- a/drivers/tty/serial/qcom_geni_serial.c >> +++ b/drivers/tty/serial/qcom_geni_serial.c >> @@ -943,11 +943,11 @@ static int qcom_geni_serial_startup(struct uart_port *uport) >> static unsigned long get_clk_div_rate(struct clk *clk, unsigned int baud, >> unsigned int sampling_rate, unsigned int *clk_div) >> { >> - unsigned long ser_clk; >> + unsigned long ser_clk = 0; > In this patch it's not at all obvious why you'd need to init to 0. I > think the "for loop" is guaranteed to run at least once because > "max_div" is known at compile time. ...and currently each time through > the "for" loop you'll always set "ser_clk". Ok, I realised we will never break out of for loop exceeding ULONG_MAX in 1st pass, so yes ser_clk will always be set. > I think in a future patch you'll want to _remove_ this from the for loop: > > if (!prev) > ser_clk = freq; Intent is to save (and use) 1st freq if we cannot find an exact divider. Isn't it ok? For example please find debug output for a required frequency of 51.2MHz. We try dividers 1, 2, 3 and end up with 52.1MHz the first result. [ 18.815432] 20220509 get_clk_div_rate desired_clk:51200000 [ 18.821081] 20220509 get_clk_div_rate maxdiv:4095 [ 18.825924] 20220509 get_clk_div_rate div:1 [ 18.830239] 20220509 get_clk_div_rate freq:52174000 [ 18.835288] 20220509 get_clk_div_rate div:2 [ 18.839628] 20220509 get_clk_div_rate freq:100000000 [ 18.844794] 20220509 get_clk_div_rate div:3 [ 18.849119] 20220509 get_clk_div_rate freq:100000000 [ 18.854254] 20220509 get_clk_div_rate reached max frequency breaking... [ 18.861072] 20220509 get_clk_div_rate clk_div=1, ser_clk=52174000 The behaviour was same earlier too when root_freq table was present. The table did contain 51.2MHz and we would exit with same but on call to clk_set_rate(51.2MHz) we were ending up with 52.1MHz > > ...and _that's_ when you should init "ser_clk" to 0. Until then I'd > leave it as uninitialized... > > Honestly, I'd throw all the fixes into one series, too. My concern was if there would be a requirement to split the changes. Will put in all in 1 series with Fixes tag. > > >> unsigned long desired_clk; >> unsigned long freq, prev; >> unsigned long div, maxdiv; >> - int64_t mult; >> + unsigned long long mult; >> >> desired_clk = baud * sampling_rate; >> if (!desired_clk) { >> @@ -959,8 +959,8 @@ static unsigned long get_clk_div_rate(struct clk *clk, unsigned int baud, >> prev = 0; >> >> for (div = 1; div <= maxdiv; div++) { >> - mult = div * desired_clk; >> - if (mult > ULONG_MAX) >> + mult = (unsigned long long)div * (unsigned long long)desired_clk; > I think you only need to cast one of the two. The other will be > up-cast automatically. Will change. > > >> + if (mult > (unsigned long long)ULONG_MAX) > I don't think you need this cast. As far as I know the C language will > "upcast" to the larger of the two types. Will change. > > > -Doug Thank you. -Vijay/
Hi, On Wed, Jun 1, 2022 at 3:46 AM Vijaya Krishna Nivarthi <quic_vnivarth@quicinc.com> wrote: > > Hi, > > On 6/1/2022 12:58 AM, Doug Anderson wrote: > > Hi, > > > > On Tue, May 31, 2022 at 11:18 AM Vijaya Krishna Nivarthi > > <quic_vnivarth@quicinc.com> wrote: > >> Add missing initialisation and correct type casting > >> > >> Signed-off-by: Vijaya Krishna Nivarthi <quic_vnivarth@quicinc.com> > >> --- > >> drivers/tty/serial/qcom_geni_serial.c | 8 ++++---- > >> 1 file changed, 4 insertions(+), 4 deletions(-) > >> > >> diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c > >> index 4733a23..08f3ad4 100644 > >> --- a/drivers/tty/serial/qcom_geni_serial.c > >> +++ b/drivers/tty/serial/qcom_geni_serial.c > >> @@ -943,11 +943,11 @@ static int qcom_geni_serial_startup(struct uart_port *uport) > >> static unsigned long get_clk_div_rate(struct clk *clk, unsigned int baud, > >> unsigned int sampling_rate, unsigned int *clk_div) > >> { > >> - unsigned long ser_clk; > >> + unsigned long ser_clk = 0; > > In this patch it's not at all obvious why you'd need to init to 0. I > > think the "for loop" is guaranteed to run at least once because > > "max_div" is known at compile time. ...and currently each time through > > the "for" loop you'll always set "ser_clk". > > Ok, I realised we will never break out of for loop exceeding ULONG_MAX > in 1st pass, so yes ser_clk will always be set. > > > I think in a future patch you'll want to _remove_ this from the for loop: > > > > if (!prev) > > ser_clk = freq; > > Intent is to save (and use) 1st freq if we cannot find an exact divider. > > Isn't it ok? > > For example please find debug output for a required frequency of 51.2MHz. > > We try dividers 1, 2, 3 and end up with 52.1MHz the first result. > > [ 18.815432] 20220509 get_clk_div_rate desired_clk:51200000 > [ 18.821081] 20220509 get_clk_div_rate maxdiv:4095 > [ 18.825924] 20220509 get_clk_div_rate div:1 > [ 18.830239] 20220509 get_clk_div_rate freq:52174000 > [ 18.835288] 20220509 get_clk_div_rate div:2 > [ 18.839628] 20220509 get_clk_div_rate freq:100000000 > [ 18.844794] 20220509 get_clk_div_rate div:3 > [ 18.849119] 20220509 get_clk_div_rate freq:100000000 > [ 18.854254] 20220509 get_clk_div_rate reached max frequency breaking... > [ 18.861072] 20220509 get_clk_div_rate clk_div=1, ser_clk=52174000 > > The behaviour was same earlier too when root_freq table was present. Are you certain about the behavior being the same earlier? Before commit c2194bc999d4 ("tty: serial: qcom-geni-serial: Remove uart frequency table..."), the behavior was that get_clk_cfg() would return 0 if there was no exact match. Then get_clk_div_rate() would see this 0 and print an error and return. Then the rest of qcom_geni_serial_set_termios() would do nothing at all. Ah, or I guess what you're saying is that the table historically contained "rounded" rates but that clk_round_rate() isn't returning nice round rates. OK, but if we truly want to support an inexact match, you'd want to pick the rate that reduces the error, not just pick the first one. In other words, something like this (untested): freq = clk_round_rate(clk, mult); diff = abs(((long)mult - freq) / div); if (diff < best_diff) { best_diff = diff; ser_clk = freq; best_div = div; } Why do you need this? Imagine that the desired rate was 50000001 or 49999999. The closest match would be to use the rate 100000000 and divide it by 2. ...but your existing algorithm would just arbitrarily pick the first rate returned. NOTE also that you could end up with a slightly higher or slightly lower clock than requested, right? So it's important to: * Do signed math when comparing. * Save the "div" instead of trying to recompute it at the end. > The table did contain 51.2MHz and we would exit with same but on call to > clk_set_rate(51.2MHz) we were ending up with 52.1MHz > > > > > ...and _that's_ when you should init "ser_clk" to 0. Until then I'd > > leave it as uninitialized... > > > > Honestly, I'd throw all the fixes into one series, too. > > My concern was if there would be a requirement to split the changes. > > Will put in all in 1 series with Fixes tag. > > > > > > >> unsigned long desired_clk; > >> unsigned long freq, prev; > >> unsigned long div, maxdiv; > >> - int64_t mult; > >> + unsigned long long mult; > >> > >> desired_clk = baud * sampling_rate; > >> if (!desired_clk) { > >> @@ -959,8 +959,8 @@ static unsigned long get_clk_div_rate(struct clk *clk, unsigned int baud, > >> prev = 0; > >> > >> for (div = 1; div <= maxdiv; div++) { > >> - mult = div * desired_clk; > >> - if (mult > ULONG_MAX) > >> + mult = (unsigned long long)div * (unsigned long long)desired_clk; > > I think you only need to cast one of the two. The other will be > > up-cast automatically. > Will change. > > > > > >> + if (mult > (unsigned long long)ULONG_MAX) > > I don't think you need this cast. As far as I know the C language will > > "upcast" to the larger of the two types. > Will change. > > > > > > -Doug > > Thank you. > > -Vijay/ >
Hi, On 6/1/2022 9:03 PM, Doug Anderson wrote: > Hi, > > On Wed, Jun 1, 2022 at 3:46 AM Vijaya Krishna Nivarthi > <quic_vnivarth@quicinc.com> wrote: >> Hi, >> >> On 6/1/2022 12:58 AM, Doug Anderson wrote: >>> Hi, >>> >>> On Tue, May 31, 2022 at 11:18 AM Vijaya Krishna Nivarthi >>> <quic_vnivarth@quicinc.com> wrote: >>>> Add missing initialisation and correct type casting >>>> >>>> Signed-off-by: Vijaya Krishna Nivarthi <quic_vnivarth@quicinc.com> >>>> --- >>>> drivers/tty/serial/qcom_geni_serial.c | 8 ++++---- >>>> 1 file changed, 4 insertions(+), 4 deletions(-) >>>> >>>> diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c >>>> index 4733a23..08f3ad4 100644 >>>> --- a/drivers/tty/serial/qcom_geni_serial.c >>>> +++ b/drivers/tty/serial/qcom_geni_serial.c >>>> @@ -943,11 +943,11 @@ static int qcom_geni_serial_startup(struct uart_port *uport) >>>> static unsigned long get_clk_div_rate(struct clk *clk, unsigned int baud, >>>> unsigned int sampling_rate, unsigned int *clk_div) >>>> { >>>> - unsigned long ser_clk; >>>> + unsigned long ser_clk = 0; >>> In this patch it's not at all obvious why you'd need to init to 0. I >>> think the "for loop" is guaranteed to run at least once because >>> "max_div" is known at compile time. ...and currently each time through >>> the "for" loop you'll always set "ser_clk". >> Ok, I realised we will never break out of for loop exceeding ULONG_MAX >> in 1st pass, so yes ser_clk will always be set. >> >>> I think in a future patch you'll want to _remove_ this from the for loop: >>> >>> if (!prev) >>> ser_clk = freq; >> Intent is to save (and use) 1st freq if we cannot find an exact divider. >> >> Isn't it ok? >> >> For example please find debug output for a required frequency of 51.2MHz. >> >> We try dividers 1, 2, 3 and end up with 52.1MHz the first result. >> >> [ 18.815432] 20220509 get_clk_div_rate desired_clk:51200000 >> [ 18.821081] 20220509 get_clk_div_rate maxdiv:4095 >> [ 18.825924] 20220509 get_clk_div_rate div:1 >> [ 18.830239] 20220509 get_clk_div_rate freq:52174000 >> [ 18.835288] 20220509 get_clk_div_rate div:2 >> [ 18.839628] 20220509 get_clk_div_rate freq:100000000 >> [ 18.844794] 20220509 get_clk_div_rate div:3 >> [ 18.849119] 20220509 get_clk_div_rate freq:100000000 >> [ 18.854254] 20220509 get_clk_div_rate reached max frequency breaking... >> [ 18.861072] 20220509 get_clk_div_rate clk_div=1, ser_clk=52174000 >> >> The behaviour was same earlier too when root_freq table was present. > Are you certain about the behavior being the same earlier? Before > commit c2194bc999d4 ("tty: serial: qcom-geni-serial: Remove uart > frequency table..."), the behavior was that get_clk_cfg() would return > 0 if there was no exact match. Then get_clk_div_rate() would see this > 0 and print an error and return. Then the rest of > qcom_geni_serial_set_termios() would do nothing at all. > > Ah, or I guess what you're saying is that the table historically > contained "rounded" rates but that clk_round_rate() isn't returning > nice round rates. OK, but if we truly want to support an inexact > match, you'd want to pick the rate that reduces the error, not just > pick the first one. In other words, something like this (untested): > > freq = clk_round_rate(clk, mult); > diff = abs(((long)mult - freq) / div); > if (diff < best_diff) { > best_diff = diff; > ser_clk = freq; > best_div = div; > } I am not sure if its required that freq is a multiple of best_div now that we don't have a multiple of desired_clk anyway. If it is indeed required, with above patch its not guaranteed and finding best_div gets little more complicated? We may have to loop through all available frequencies and dividers? PFB, a proposed implementation with a 2nd loop. Its tested but I haven't been able to optimise it further because it misses corner theoretical cases when I try maxdiv = CLK_DIV_MSK >> CLK_DIV_SHFT; prev = 0; /* run through quicker loop anticipating to find an exact match */ for (div = 1; div <= maxdiv; div++) { mult = (unsigned long long)div * desired_clk; if (mult > ULONG_MAX) break; freq = clk_round_rate(clk, max((unsigned long)mult, prev+1)); if (!(freq % desired_clk)) { *clk_div = freq / desired_clk; return freq; } if (prev && prev == freq) break; prev = freq; } pr_warn("Can't find exact match frequency and divider\n"); /* * this scenario ideally should be a rare occurrence * run through all frequencies and find closest match * note that it cannot get better than a difference of 1 */ freq = 0; best_diff = ULONG_MAX; while (true) { prev = freq; freq = clk_round_rate(clk, freq+1); if (freq == prev) break; for (div = 1; div <= maxdiv; div++) { if (!(freq % div)) { diff = abs((long)(freq/div) - desired_clk); if (diff < best_diff) { best_diff = diff; ser_clk = freq; *clk_div = div; if (diff == 1) break; } } } } return ser_clk; } > > Why do you need this? Imagine that the desired rate was 50000001 or > 49999999. The closest match would be to use the rate 100000000 and > divide it by 2. ...but your existing algorithm would just arbitrarily > pick the first rate returned. > > NOTE also that you could end up with a slightly higher or slightly > lower clock than requested, right? So it's important to: > * Do signed math when comparing. > * Save the "div" instead of trying to recompute it at the end. > > >> The table did contain 51.2MHz and we would exit with same but on call to >> clk_set_rate(51.2MHz) we were ending up with 52.1MHz >> >>> ...and _that's_ when you should init "ser_clk" to 0. Until then I'd >>> leave it as uninitialized... >>> >>> Honestly, I'd throw all the fixes into one series, too. >> My concern was if there would be a requirement to split the changes. >> >> Will put in all in 1 series with Fixes tag. >> >>> >>>> unsigned long desired_clk; >>>> unsigned long freq, prev; >>>> unsigned long div, maxdiv; >>>> - int64_t mult; >>>> + unsigned long long mult; >>>> >>>> desired_clk = baud * sampling_rate; >>>> if (!desired_clk) { >>>> @@ -959,8 +959,8 @@ static unsigned long get_clk_div_rate(struct clk *clk, unsigned int baud, >>>> prev = 0; >>>> >>>> for (div = 1; div <= maxdiv; div++) { >>>> - mult = div * desired_clk; >>>> - if (mult > ULONG_MAX) >>>> + mult = (unsigned long long)div * (unsigned long long)desired_clk; >>> I think you only need to cast one of the two. The other will be >>> up-cast automatically. >> Will change. >>> >>>> + if (mult > (unsigned long long)ULONG_MAX) >>> I don't think you need this cast. As far as I know the C language will >>> "upcast" to the larger of the two types. >> Will change. >>> >>> -Doug >> Thank you. >> >> -Vijay/ >>
Hi, On Fri, Jun 3, 2022 at 10:43 AM Vijaya Krishna Nivarthi <quic_vnivarth@quicinc.com> wrote: > > Hi, > > > On 6/1/2022 9:03 PM, Doug Anderson wrote: > > Hi, > > > > On Wed, Jun 1, 2022 at 3:46 AM Vijaya Krishna Nivarthi > > <quic_vnivarth@quicinc.com> wrote: > >> Hi, > >> > >> On 6/1/2022 12:58 AM, Doug Anderson wrote: > >>> Hi, > >>> > >>> On Tue, May 31, 2022 at 11:18 AM Vijaya Krishna Nivarthi > >>> <quic_vnivarth@quicinc.com> wrote: > >>>> Add missing initialisation and correct type casting > >>>> > >>>> Signed-off-by: Vijaya Krishna Nivarthi <quic_vnivarth@quicinc.com> > >>>> --- > >>>> drivers/tty/serial/qcom_geni_serial.c | 8 ++++---- > >>>> 1 file changed, 4 insertions(+), 4 deletions(-) > >>>> > >>>> diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c > >>>> index 4733a23..08f3ad4 100644 > >>>> --- a/drivers/tty/serial/qcom_geni_serial.c > >>>> +++ b/drivers/tty/serial/qcom_geni_serial.c > >>>> @@ -943,11 +943,11 @@ static int qcom_geni_serial_startup(struct uart_port *uport) > >>>> static unsigned long get_clk_div_rate(struct clk *clk, unsigned int baud, > >>>> unsigned int sampling_rate, unsigned int *clk_div) > >>>> { > >>>> - unsigned long ser_clk; > >>>> + unsigned long ser_clk = 0; > >>> In this patch it's not at all obvious why you'd need to init to 0. I > >>> think the "for loop" is guaranteed to run at least once because > >>> "max_div" is known at compile time. ...and currently each time through > >>> the "for" loop you'll always set "ser_clk". > >> Ok, I realised we will never break out of for loop exceeding ULONG_MAX > >> in 1st pass, so yes ser_clk will always be set. > >> > >>> I think in a future patch you'll want to _remove_ this from the for loop: > >>> > >>> if (!prev) > >>> ser_clk = freq; > >> Intent is to save (and use) 1st freq if we cannot find an exact divider. > >> > >> Isn't it ok? > >> > >> For example please find debug output for a required frequency of 51.2MHz. > >> > >> We try dividers 1, 2, 3 and end up with 52.1MHz the first result. > >> > >> [ 18.815432] 20220509 get_clk_div_rate desired_clk:51200000 > >> [ 18.821081] 20220509 get_clk_div_rate maxdiv:4095 > >> [ 18.825924] 20220509 get_clk_div_rate div:1 > >> [ 18.830239] 20220509 get_clk_div_rate freq:52174000 > >> [ 18.835288] 20220509 get_clk_div_rate div:2 > >> [ 18.839628] 20220509 get_clk_div_rate freq:100000000 > >> [ 18.844794] 20220509 get_clk_div_rate div:3 > >> [ 18.849119] 20220509 get_clk_div_rate freq:100000000 > >> [ 18.854254] 20220509 get_clk_div_rate reached max frequency breaking... > >> [ 18.861072] 20220509 get_clk_div_rate clk_div=1, ser_clk=52174000 > >> > >> The behaviour was same earlier too when root_freq table was present. > > Are you certain about the behavior being the same earlier? Before > > commit c2194bc999d4 ("tty: serial: qcom-geni-serial: Remove uart > > frequency table..."), the behavior was that get_clk_cfg() would return > > 0 if there was no exact match. Then get_clk_div_rate() would see this > > 0 and print an error and return. Then the rest of > > qcom_geni_serial_set_termios() would do nothing at all. > > > > Ah, or I guess what you're saying is that the table historically > > contained "rounded" rates but that clk_round_rate() isn't returning > > nice round rates. OK, but if we truly want to support an inexact > > match, you'd want to pick the rate that reduces the error, not just > > pick the first one. In other words, something like this (untested): > > > > freq = clk_round_rate(clk, mult); > > diff = abs(((long)mult - freq) / div); > > if (diff < best_diff) { > > best_diff = diff; > > ser_clk = freq; > > best_div = div; > > } > I am not sure if its required that freq is a multiple of best_div now > that we don't have a multiple of desired_clk anyway. How about just this (untested): freq = clk_round_rate(clk, mult); candidate_div = max(1, DIV_ROUND_CLOSEST(freq, desired_clk)); candidate_freq = freq / candidate_div; diff = abs((long)desired_clk - candidate_freq); if (diff < best_diff) { best_diff = diff; ser_clk = freq; best_div = candidate_div; } Here: freq: a freq we can definitely make candidate_div: the best number to divide freq by to get the desired clock. candidate_freq: the frequency we'll end up if we divide freq by candidate_div. We want this to be close to desired_clk. diff: how far away the candidate_freq is away from what we want. best_diff: how far away the best candidate was from what we wanted. ser_clk: What we should pass to clk_set_rate() to get the best candidate. best_div: What we should use as a divider to get the best candidate. -Doug
Hi, On 6/4/2022 12:10 AM, Doug Anderson wrote: > Hi, > > On Fri, Jun 3, 2022 at 10:43 AM Vijaya Krishna Nivarthi > <quic_vnivarth@quicinc.com> wrote: >> >> Ah, or I guess what you're saying is that the table historically >> contained "rounded" rates but that clk_round_rate() isn't returning >> nice round rates. OK, but if we truly want to support an inexact >> match, you'd want to pick the rate that reduces the error, not just >> pick the first one. In other words, something like this (untested): >> >> freq = clk_round_rate(clk, mult); >> diff = abs(((long)mult - freq) / div); >> if (diff < best_diff) { >> best_diff = diff; >> ser_clk = freq; >> best_div = div; >> } >> I am not sure if its required that freq is a multiple of best_div now >> that we don't have a multiple of desired_clk anyway. > How about just this (untested): > > freq = clk_round_rate(clk, mult); > candidate_div = max(1, DIV_ROUND_CLOSEST(freq, desired_clk)); > candidate_freq = freq / candidate_div; > diff = abs((long)desired_clk - candidate_freq); > if (diff < best_diff) { > best_diff = diff; > ser_clk = freq; > best_div = candidate_div; > } I am afraid this still doesn't guarantee that ser_clk is a multiple of best_div I tested it with a function simulates clk_round_rate. static unsigned long clk_round_rate_test(struct clk *clk, unsigned long in_freq) { unsigned long root_freq[6] = {105, 204, 303, 402, 501, 602}; int i; for (i = 0; i < 6; i++) { if (root_freq[i] >= in_freq) return root_freq[i]; } return root_freq[6]; } { unsigned long ser_clk; unsigned long desired_clk; unsigned long freq; int div_round_closest; unsigned long div; unsigned long mult; unsigned long candidate_div, candidate_freq; unsigned long diff, best_diff, best_div; unsigned long one; desired_clk = 100; one = 1; best_diff = ULONG_MAX; pr_err("\ndesired_clk-%d\n", desired_clk); for (div = 1; div <= 10; div++) { mult = div * desired_clk; freq = clk_round_rate_test(clk, mult); div_round_closest = DIV_ROUND_CLOSEST(freq, desired_clk); candidate_div = max(one, (unsigned long)div_round_closest); candidate_freq = freq / candidate_div; diff = abs((long)desired_clk - candidate_freq); pr_err("div-%d, mult-%d, freq-%d, div_round_closest-%d, candidate_div-%d, candidate_freq-%d, diff-%d\n", div, mult, freq, div_round_closest, candidate_div, candidate_freq, diff); if (diff < best_diff) { pr_err("This is best so far\n"); best_diff = diff; ser_clk = freq; best_div = candidate_div; } } pr_err("\nbest_diff-%d, ser_clk-%d, best_div-%d\n", best_diff, ser_clk, best_div); } And here is the output [ 17.835167] desired_clk-100 [ 17.839567] div-1, mult-100, freq-105, div_round_closest-1, candidate_div-1, candidate_freq-105, diff-5 [ 17.849220] This is best so far [ 17.852458] div-2, mult-200, freq-204, div_round_closest-2, candidate_div-2, candidate_freq-102, diff-2 [ 17.862104] This is best so far [ 17.865345] div-3, mult-300, freq-303, div_round_closest-3, candidate_div-3, candidate_freq-101, diff-1 [ 17.874995] This is best so far [ 17.878237] div-4, mult-400, freq-402, div_round_closest-4, candidate_div-4, candidate_freq-100, diff-0 [ 17.887882] This is best so far [ 17.891118] div-5, mult-500, freq-501, div_round_closest-5, candidate_div-5, candidate_freq-100, diff-0 [ 17.900770] div-6, mult-600, freq-602, div_round_closest-6, candidate_div-6, candidate_freq-100, diff-0 [ 17.910415] div-7, mult-700, freq-602, div_round_closest-6, candidate_div-6, candidate_freq-100, diff-0 [ 17.920057] div-8, mult-800, freq-602, div_round_closest-6, candidate_div-6, candidate_freq-100, diff-0 [ 17.929703] div-9, mult-900, freq-602, div_round_closest-6, candidate_div-6, candidate_freq-100, diff-0 [ 17.939353] div-10, mult-1000, freq-602, div_round_closest-6, candidate_div-6, candidate_freq-100, diff-0 [ 17.949181] [ 17.949181] best_diff-0, ser_clk-402, best_div-4 Please note that we go past cases when we have an divider that can exactly divide the frequency(105/1, 204/2, 303/3) and end up with one that doesn't. It seems like we need to run through all dividers to find one that can divide output freq from clk_round_rate and among those choose smallest delta. In regular bootup 2nd loop was required only for 51200000. For others 1843200, 153600, 1st loop worked fine. Thank you. > > Here: > > freq: a freq we can definitely make > > candidate_div: the best number to divide freq by to get the desired clock. > > candidate_freq: the frequency we'll end up if we divide freq by > candidate_div. We want this to be close to desired_clk. > > diff: how far away the candidate_freq is away from what we want. > > best_diff: how far away the best candidate was from what we wanted. > > ser_clk: What we should pass to clk_set_rate() to get the best candidate. > > best_div: What we should use as a divider to get the best candidate. > > > -Doug
Hi, On Mon, Jun 6, 2022 at 11:19 AM Vijaya Krishna Nivarthi <quic_vnivarth@quicinc.com> wrote: > > Hi, > > > On 6/4/2022 12:10 AM, Doug Anderson wrote: > > Hi, > > > > On Fri, Jun 3, 2022 at 10:43 AM Vijaya Krishna Nivarthi > > <quic_vnivarth@quicinc.com> wrote: > >> > >> Ah, or I guess what you're saying is that the table historically > >> contained "rounded" rates but that clk_round_rate() isn't returning > >> nice round rates. OK, but if we truly want to support an inexact > >> match, you'd want to pick the rate that reduces the error, not just > >> pick the first one. In other words, something like this (untested): > >> > >> freq = clk_round_rate(clk, mult); > >> diff = abs(((long)mult - freq) / div); > >> if (diff < best_diff) { > >> best_diff = diff; > >> ser_clk = freq; > >> best_div = div; > >> } > >> I am not sure if its required that freq is a multiple of best_div now > >> that we don't have a multiple of desired_clk anyway. > > How about just this (untested): > > > > freq = clk_round_rate(clk, mult); > > candidate_div = max(1, DIV_ROUND_CLOSEST(freq, desired_clk)); > > candidate_freq = freq / candidate_div; > > diff = abs((long)desired_clk - candidate_freq); > > if (diff < best_diff) { > > best_diff = diff; > > ser_clk = freq; > > best_div = candidate_div; > > } > > I am afraid this still doesn't guarantee that ser_clk is a multiple of > best_div OK. ...I guess my question would be: does it matter for some reason? "ser_clk" is just a local variable in this function. Who cares if it's not a multiple of best_div? This is why we're keeping track of "best_div" in the first place, so that later in the function instead of: *clk_div = ser_clk / desired_clk; if (!(*clk_div)) *clk_div = 1; You just do: *clk_div = best_div; > I tested it with a function simulates clk_round_rate. > > static unsigned long clk_round_rate_test(struct clk *clk, unsigned long > in_freq) > { > unsigned long root_freq[6] = {105, 204, 303, 402, 501, 602}; > int i; > > for (i = 0; i < 6; i++) { > if (root_freq[i] >= in_freq) > return root_freq[i]; > } > return root_freq[6]; > } > > { > unsigned long ser_clk; > unsigned long desired_clk; > unsigned long freq; > int div_round_closest; > unsigned long div; > unsigned long mult; > unsigned long candidate_div, candidate_freq; > > unsigned long diff, best_diff, best_div; > unsigned long one; > > desired_clk = 100; > one = 1; > best_diff = ULONG_MAX; > pr_err("\ndesired_clk-%d\n", desired_clk); > for (div = 1; div <= 10; div++) { > mult = div * desired_clk; > > freq = clk_round_rate_test(clk, mult); > div_round_closest = DIV_ROUND_CLOSEST(freq, desired_clk); > candidate_div = max(one, (unsigned long)div_round_closest); > candidate_freq = freq / candidate_div; > diff = abs((long)desired_clk - candidate_freq); > pr_err("div-%d, mult-%d, freq-%d, div_round_closest-%d, > candidate_div-%d, candidate_freq-%d, diff-%d\n", > div, mult, freq, div_round_closest, candidate_div, > candidate_freq, diff); > if (diff < best_diff) { > pr_err("This is best so far\n"); > best_diff = diff; > ser_clk = freq; > best_div = candidate_div; > } > } > pr_err("\nbest_diff-%d, ser_clk-%d, best_div-%d\n", > best_diff, ser_clk, best_div); > } > > And here is the output > > [ 17.835167] desired_clk-100 > [ 17.839567] div-1, mult-100, freq-105, div_round_closest-1, > candidate_div-1, candidate_freq-105, diff-5 > [ 17.849220] This is best so far > [ 17.852458] div-2, mult-200, freq-204, div_round_closest-2, > candidate_div-2, candidate_freq-102, diff-2 > [ 17.862104] This is best so far > [ 17.865345] div-3, mult-300, freq-303, div_round_closest-3, > candidate_div-3, candidate_freq-101, diff-1 > [ 17.874995] This is best so far > [ 17.878237] div-4, mult-400, freq-402, div_round_closest-4, > candidate_div-4, candidate_freq-100, diff-0 > [ 17.887882] This is best so far > [ 17.891118] div-5, mult-500, freq-501, div_round_closest-5, > candidate_div-5, candidate_freq-100, diff-0 > [ 17.900770] div-6, mult-600, freq-602, div_round_closest-6, > candidate_div-6, candidate_freq-100, diff-0 > [ 17.910415] div-7, mult-700, freq-602, div_round_closest-6, > candidate_div-6, candidate_freq-100, diff-0 > [ 17.920057] div-8, mult-800, freq-602, div_round_closest-6, > candidate_div-6, candidate_freq-100, diff-0 > [ 17.929703] div-9, mult-900, freq-602, div_round_closest-6, > candidate_div-6, candidate_freq-100, diff-0 > [ 17.939353] div-10, mult-1000, freq-602, div_round_closest-6, > candidate_div-6, candidate_freq-100, diff-0 > [ 17.949181] > [ 17.949181] best_diff-0, ser_clk-402, best_div-4 That doesn't look like a terrible result. I guess nominally 602 is a better approximation, but if we're accepting that we're not going to have an exact rate anyway then maybe being off by that tiny amount doesn't matter and we'd do better with the slow clock (maybe saves power?) > Please note that we go past cases when we have an divider that can > exactly divide the frequency(105/1, 204/2, 303/3) and end up with one > that doesn't. Ah, good point. Luckily that's a 1-line fix, right? -Doug
Hi, On 6/7/2022 1:29 AM, Doug Anderson wrote: > Hi, > > On Mon, Jun 6, 2022 at 11:19 AM Vijaya Krishna Nivarthi > <quic_vnivarth@quicinc.com> wrote: >> Hi, >> >> >> On 6/4/2022 12:10 AM, Doug Anderson wrote: >>> Hi, >>> >>> On Fri, Jun 3, 2022 at 10:43 AM Vijaya Krishna Nivarthi >>> <quic_vnivarth@quicinc.com> wrote: >>>> Ah, or I guess what you're saying is that the table historically >>>> contained "rounded" rates but that clk_round_rate() isn't returning >>>> nice round rates. OK, but if we truly want to support an inexact >>>> match, you'd want to pick the rate that reduces the error, not just >>>> pick the first one. In other words, something like this (untested): >>>> >>>> freq = clk_round_rate(clk, mult); >>>> diff = abs(((long)mult - freq) / div); >>>> if (diff < best_diff) { >>>> best_diff = diff; >>>> ser_clk = freq; >>>> best_div = div; >>>> } >>>> I am not sure if its required that freq is a multiple of best_div now >>>> that we don't have a multiple of desired_clk anyway. >>> How about just this (untested): >>> >>> freq = clk_round_rate(clk, mult); >>> candidate_div = max(1, DIV_ROUND_CLOSEST(freq, desired_clk)); >>> candidate_freq = freq / candidate_div; >>> diff = abs((long)desired_clk - candidate_freq); >>> if (diff < best_diff) { >>> best_diff = diff; >>> ser_clk = freq; >>> best_div = candidate_div; >>> } >> I am afraid this still doesn't guarantee that ser_clk is a multiple of >> best_div > OK. ...I guess my question would be: does it matter for some reason? > "ser_clk" is just a local variable in this function. Who cares if it's > not a multiple of best_div? This is why we're keeping track of > "best_div" in the first place, so that later in the function instead > of: > > *clk_div = ser_clk / desired_clk; > if (!(*clk_div)) > *clk_div = 1; > > You just do: > > *clk_div = best_div; My only concern continues to be... Given ser_clk is the final frequency that this function is going to return and best_div is going to be the clk_divider, is it ok if the divider cant divide the frequency exactly? In other words, Can this function output combinations like (402,4) (501,5) ? If ok, then we can go ahead with this patch or even previous perhaps. > >> I tested it with a function simulates clk_round_rate. >> >> static unsigned long clk_round_rate_test(struct clk *clk, unsigned long >> in_freq) >> { >> unsigned long root_freq[6] = {105, 204, 303, 402, 501, 602}; >> int i; >> >> for (i = 0; i < 6; i++) { >> if (root_freq[i] >= in_freq) >> return root_freq[i]; >> } >> return root_freq[6]; >> } >> >> { >> unsigned long ser_clk; >> unsigned long desired_clk; >> unsigned long freq; >> int div_round_closest; >> unsigned long div; >> unsigned long mult; >> unsigned long candidate_div, candidate_freq; >> >> unsigned long diff, best_diff, best_div; >> unsigned long one; >> >> desired_clk = 100; >> one = 1; >> best_diff = ULONG_MAX; >> pr_err("\ndesired_clk-%d\n", desired_clk); >> for (div = 1; div <= 10; div++) { >> mult = div * desired_clk; >> >> freq = clk_round_rate_test(clk, mult); >> div_round_closest = DIV_ROUND_CLOSEST(freq, desired_clk); >> candidate_div = max(one, (unsigned long)div_round_closest); >> candidate_freq = freq / candidate_div; >> diff = abs((long)desired_clk - candidate_freq); >> pr_err("div-%d, mult-%d, freq-%d, div_round_closest-%d, >> candidate_div-%d, candidate_freq-%d, diff-%d\n", >> div, mult, freq, div_round_closest, candidate_div, >> candidate_freq, diff); >> if (diff < best_diff) { >> pr_err("This is best so far\n"); >> best_diff = diff; >> ser_clk = freq; >> best_div = candidate_div; >> } >> } >> pr_err("\nbest_diff-%d, ser_clk-%d, best_div-%d\n", >> best_diff, ser_clk, best_div); >> } >> >> And here is the output >> >> [ 17.835167] desired_clk-100 >> [ 17.839567] div-1, mult-100, freq-105, div_round_closest-1, >> candidate_div-1, candidate_freq-105, diff-5 >> [ 17.849220] This is best so far >> [ 17.852458] div-2, mult-200, freq-204, div_round_closest-2, >> candidate_div-2, candidate_freq-102, diff-2 >> [ 17.862104] This is best so far >> [ 17.865345] div-3, mult-300, freq-303, div_round_closest-3, >> candidate_div-3, candidate_freq-101, diff-1 >> [ 17.874995] This is best so far >> [ 17.878237] div-4, mult-400, freq-402, div_round_closest-4, >> candidate_div-4, candidate_freq-100, diff-0 >> [ 17.887882] This is best so far >> [ 17.891118] div-5, mult-500, freq-501, div_round_closest-5, >> candidate_div-5, candidate_freq-100, diff-0 >> [ 17.900770] div-6, mult-600, freq-602, div_round_closest-6, >> candidate_div-6, candidate_freq-100, diff-0 >> [ 17.910415] div-7, mult-700, freq-602, div_round_closest-6, >> candidate_div-6, candidate_freq-100, diff-0 >> [ 17.920057] div-8, mult-800, freq-602, div_round_closest-6, >> candidate_div-6, candidate_freq-100, diff-0 >> [ 17.929703] div-9, mult-900, freq-602, div_round_closest-6, >> candidate_div-6, candidate_freq-100, diff-0 >> [ 17.939353] div-10, mult-1000, freq-602, div_round_closest-6, >> candidate_div-6, candidate_freq-100, diff-0 >> [ 17.949181] >> [ 17.949181] best_diff-0, ser_clk-402, best_div-4 > That doesn't look like a terrible result. I guess nominally 602 is a > better approximation, but if we're accepting that we're not going to > have an exact rate anyway then maybe being off by that tiny amount > doesn't matter and we'd do better with the slow clock (maybe saves > power?) Actually power saving was the anticipation behind returning first frequency in original patch, when we cant find exact frequency. > >> Please note that we go past cases when we have an divider that can >> exactly divide the frequency(105/1, 204/2, 303/3) and end up with one >> that doesn't. > Ah, good point. Luckily that's a 1-line fix, right? Apologies, I could not figure out how. Thank you. > > > -Doug
Hi, On Tue, Jun 7, 2022 at 10:40 AM Vijaya Krishna Nivarthi <quic_vnivarth@quicinc.com> wrote: > > Hi, > > On 6/7/2022 1:29 AM, Doug Anderson wrote: > > Hi, > > > > On Mon, Jun 6, 2022 at 11:19 AM Vijaya Krishna Nivarthi > > <quic_vnivarth@quicinc.com> wrote: > >> Hi, > >> > >> > >> On 6/4/2022 12:10 AM, Doug Anderson wrote: > >>> Hi, > >>> > >>> On Fri, Jun 3, 2022 at 10:43 AM Vijaya Krishna Nivarthi > >>> <quic_vnivarth@quicinc.com> wrote: > >>>> Ah, or I guess what you're saying is that the table historically > >>>> contained "rounded" rates but that clk_round_rate() isn't returning > >>>> nice round rates. OK, but if we truly want to support an inexact > >>>> match, you'd want to pick the rate that reduces the error, not just > >>>> pick the first one. In other words, something like this (untested): > >>>> > >>>> freq = clk_round_rate(clk, mult); > >>>> diff = abs(((long)mult - freq) / div); > >>>> if (diff < best_diff) { > >>>> best_diff = diff; > >>>> ser_clk = freq; > >>>> best_div = div; > >>>> } > >>>> I am not sure if its required that freq is a multiple of best_div now > >>>> that we don't have a multiple of desired_clk anyway. > >>> How about just this (untested): > >>> > >>> freq = clk_round_rate(clk, mult); > >>> candidate_div = max(1, DIV_ROUND_CLOSEST(freq, desired_clk)); > >>> candidate_freq = freq / candidate_div; > >>> diff = abs((long)desired_clk - candidate_freq); > >>> if (diff < best_diff) { > >>> best_diff = diff; > >>> ser_clk = freq; > >>> best_div = candidate_div; > >>> } > >> I am afraid this still doesn't guarantee that ser_clk is a multiple of > >> best_div > > OK. ...I guess my question would be: does it matter for some reason? > > "ser_clk" is just a local variable in this function. Who cares if it's > > not a multiple of best_div? This is why we're keeping track of > > "best_div" in the first place, so that later in the function instead > > of: > > > > *clk_div = ser_clk / desired_clk; > > if (!(*clk_div)) > > *clk_div = 1; > > > > You just do: > > > > *clk_div = best_div; > > My only concern continues to be... > > Given ser_clk is the final frequency that this function is going to > return and best_div is going to be the clk_divider, is it ok if the > divider cant divide the frequency exactly? > > In other words, Can this function output combinations like (402,4) > (501,5) ? > > If ok, then we can go ahead with this patch or even previous perhaps. I don't see why not. You're basically just getting a resulting clock that's not an integral "Hz", right? So if "baud" is 9600 and sampling_rate is 16 then desired_clk is (9600 * 16) = 153600 Let's imagine that we do all the math and we finally decide that our best bet is with the rate 922000 and a divider of 6. That means that the actual clock we'll make is 153666.67 when we _wanted_ 153600. There's no reason it needs to be integral, though, and 153666.67 would still be better than making 160000. > >> I tested it with a function simulates clk_round_rate. > >> > >> static unsigned long clk_round_rate_test(struct clk *clk, unsigned long > >> in_freq) > >> { > >> unsigned long root_freq[6] = {105, 204, 303, 402, 501, 602}; > >> int i; > >> > >> for (i = 0; i < 6; i++) { > >> if (root_freq[i] >= in_freq) > >> return root_freq[i]; > >> } > >> return root_freq[6]; > >> } > >> > >> { > >> unsigned long ser_clk; > >> unsigned long desired_clk; > >> unsigned long freq; > >> int div_round_closest; > >> unsigned long div; > >> unsigned long mult; > >> unsigned long candidate_div, candidate_freq; > >> > >> unsigned long diff, best_diff, best_div; > >> unsigned long one; > >> > >> desired_clk = 100; > >> one = 1; > >> best_diff = ULONG_MAX; > >> pr_err("\ndesired_clk-%d\n", desired_clk); > >> for (div = 1; div <= 10; div++) { > >> mult = div * desired_clk; > >> > >> freq = clk_round_rate_test(clk, mult); > >> div_round_closest = DIV_ROUND_CLOSEST(freq, desired_clk); > >> candidate_div = max(one, (unsigned long)div_round_closest); > >> candidate_freq = freq / candidate_div; > >> diff = abs((long)desired_clk - candidate_freq); > >> pr_err("div-%d, mult-%d, freq-%d, div_round_closest-%d, > >> candidate_div-%d, candidate_freq-%d, diff-%d\n", > >> div, mult, freq, div_round_closest, candidate_div, > >> candidate_freq, diff); > >> if (diff < best_diff) { > >> pr_err("This is best so far\n"); > >> best_diff = diff; > >> ser_clk = freq; > >> best_div = candidate_div; > >> } > >> } > >> pr_err("\nbest_diff-%d, ser_clk-%d, best_div-%d\n", > >> best_diff, ser_clk, best_div); > >> } > >> > >> And here is the output > >> > >> [ 17.835167] desired_clk-100 > >> [ 17.839567] div-1, mult-100, freq-105, div_round_closest-1, > >> candidate_div-1, candidate_freq-105, diff-5 > >> [ 17.849220] This is best so far > >> [ 17.852458] div-2, mult-200, freq-204, div_round_closest-2, > >> candidate_div-2, candidate_freq-102, diff-2 > >> [ 17.862104] This is best so far > >> [ 17.865345] div-3, mult-300, freq-303, div_round_closest-3, > >> candidate_div-3, candidate_freq-101, diff-1 > >> [ 17.874995] This is best so far > >> [ 17.878237] div-4, mult-400, freq-402, div_round_closest-4, > >> candidate_div-4, candidate_freq-100, diff-0 > >> [ 17.887882] This is best so far > >> [ 17.891118] div-5, mult-500, freq-501, div_round_closest-5, > >> candidate_div-5, candidate_freq-100, diff-0 > >> [ 17.900770] div-6, mult-600, freq-602, div_round_closest-6, > >> candidate_div-6, candidate_freq-100, diff-0 > >> [ 17.910415] div-7, mult-700, freq-602, div_round_closest-6, > >> candidate_div-6, candidate_freq-100, diff-0 > >> [ 17.920057] div-8, mult-800, freq-602, div_round_closest-6, > >> candidate_div-6, candidate_freq-100, diff-0 > >> [ 17.929703] div-9, mult-900, freq-602, div_round_closest-6, > >> candidate_div-6, candidate_freq-100, diff-0 > >> [ 17.939353] div-10, mult-1000, freq-602, div_round_closest-6, > >> candidate_div-6, candidate_freq-100, diff-0 > >> [ 17.949181] > >> [ 17.949181] best_diff-0, ser_clk-402, best_div-4 > > That doesn't look like a terrible result. I guess nominally 602 is a > > better approximation, but if we're accepting that we're not going to > > have an exact rate anyway then maybe being off by that tiny amount > > doesn't matter and we'd do better with the slow clock (maybe saves > > power?) > Actually power saving was the anticipation behind returning first > frequency in original patch, when we cant find exact frequency. Right, except that if you just pick the first clock you find it would be _wildly_ off. I guess if you really want to do this the right way, you need to set a maximum tolerance and pick the first rate you find that meets that tolerance. Random web search for "uart baud rate tolerance" makes me believe that +/- 5% deviation is OK, but to be safe you probably want something lower. Maybe 2%? So if the desired clock is within 2% of a clock you can make, can you just pick that one? > >> Please note that we go past cases when we have an divider that can > >> exactly divide the frequency(105/1, 204/2, 303/3) and end up with one > >> that doesn't. > > Ah, good point. Luckily that's a 1-line fix, right? > > Apologies, I could not figure out how. Ah, sorry. Not quite 1 line, but this (untested) freq = clk_round_rate(clk, mult); if (freq % desired_clk == 0) { ser_clk = freq; best_div = freq / desired_clk; break; } candidate_div = max(1, DIV_ROUND_CLOSEST(freq, desired_clk)); candidate_freq = freq / candidate_div; diff = abs((long)desired_clk - candidate_freq); if (diff < best_diff) { best_diff = diff; ser_clk = freq; best_div = candidate_div; }
Hi, On 6/8/2022 12:55 AM, Doug Anderson wrote: > Hi, > > On Tue, Jun 7, 2022 at 10:40 AM Vijaya Krishna Nivarthi > <quic_vnivarth@quicinc.com> wrote: >> Hi, >> >> On 6/7/2022 1:29 AM, Doug Anderson wrote: >> >> My only concern continues to be... >> >> Given ser_clk is the final frequency that this function is going to >> return and best_div is going to be the clk_divider, is it ok if the >> divider cant divide the frequency exactly? >> >> In other words, Can this function output combinations like (402,4) >> (501,5) ? >> >> If ok, then we can go ahead with this patch or even previous perhaps. > I don't see why not. You're basically just getting a resulting clock > that's not an integral "Hz", right? > > So if "baud" is 9600 and sampling_rate is 16 then desired_clk is (9600 > * 16) = 153600 > > Let's imagine that we do all the math and we finally decide that our > best bet is with the rate 922000 and a divider of 6. That means that > the actual clock we'll make is 153666.67 when we _wanted_ 153600. > There's no reason it needs to be integral, though, and 153666.67 would > still be better than making 160000. > Thank you for clarification. >>> power?) >> Actually power saving was the anticipation behind returning first >> frequency in original patch, when we cant find exact frequency. > Right, except that if you just pick the first clock you find it would > be _wildly_ off. I guess if you really want to do this the right way, > you need to set a maximum tolerance and pick the first rate you find > that meets that tolerance. Random web search for "uart baud rate > tolerance" makes me believe that +/- 5% deviation is OK, but to be > safe you probably want something lower. Maybe 2%? So if the desired > clock is within 2% of a clock you can make, can you just pick that > one? Ok, 2% seems good. > >>>> Please note that we go past cases when we have an divider that can >>>> exactly divide the frequency(105/1, 204/2, 303/3) and end up with one >>>> that doesn't. >>> Ah, good point. Luckily that's a 1-line fix, right? >> Apologies, I could not figure out how. > Ah, sorry. Not quite 1 line, but this (untested) > > > freq = clk_round_rate(clk, mult); > > if (freq % desired_clk == 0) { > ser_clk = freq; > best_div = freq / desired_clk; > break; > } > > candidate_div = max(1, DIV_ROUND_CLOSEST(freq, desired_clk)); > candidate_freq = freq / candidate_div; > diff = abs((long)desired_clk - candidate_freq); > if (diff < best_diff) { > best_diff = diff; > ser_clk = freq; > best_div = candidate_div; > } But then once again, we would likely need 2 loops because while we are ok with giving up on search for best_div on finding something within 2% tolerance, we may not want to give up on exact match (freq % desired_clk == 0 ) So how about something like this with 2 loops (more optimised than previous version with 2 loops)? (untested) maxdiv = CLK_DIV_MSK >> CLK_DIV_SHFT; prev = 0; /* run through quicker loop anticipating to find an exact match */ for (div = 1; div <= maxdiv; div++) { mult = (unsigned long long)div * desired_clk; if (mult > ULONG_MAX) break; freq = clk_round_rate(clk, max((unsigned long)mult, prev+1)); if (!(freq % desired_clk)) { *clk_div = freq / desired_clk; return freq; } if (prev && prev == freq) break; prev = freq; } pr_warn("Can't find exact match frequency and divider\n"); freq = 0; best_diff = ULONG_MAX; prev_candidate_div = -1; while (true) { prev = freq; freq = clk_round_rate(clk, freq+1); if (freq == prev) break; /* end of table */ candidate_div = DIV_ROUND_CLOSEST(freq, desired_clk); /* * Since the frequencies are increasing, previous is better * if we have same divider, proceed to next in table */ if (prev_candidate_div == candidate_div) continue; prev_candidate_div = candidate_div; if (candidate_div) candidate_freq = freq / candidate_div; else candidate_freq = freq; diff = abs((long)desired_clk - candidate_freq); if (diff < best_diff) { best_diff = diff; ser_clk = freq; *clk_div = candidate_div; if (diff * 50 < ser_clk) { two_percent_tolerance = true; break; } } } if (!two_percent_tolerance) { pr_warn("Can't find frequency within 2 percent tolerance\n"); } return ser_clk; } Thank you.
Hi, On Wed, Jun 8, 2022 at 11:34 AM Vijaya Krishna Nivarthi <quic_vnivarth@quicinc.com> wrote: > > Hi, > > > On 6/8/2022 12:55 AM, Doug Anderson wrote: > > Hi, > > > > On Tue, Jun 7, 2022 at 10:40 AM Vijaya Krishna Nivarthi > > <quic_vnivarth@quicinc.com> wrote: > >> Hi, > >> > >> On 6/7/2022 1:29 AM, Doug Anderson wrote: > >> > >> My only concern continues to be... > >> > >> Given ser_clk is the final frequency that this function is going to > >> return and best_div is going to be the clk_divider, is it ok if the > >> divider cant divide the frequency exactly? > >> > >> In other words, Can this function output combinations like (402,4) > >> (501,5) ? > >> > >> If ok, then we can go ahead with this patch or even previous perhaps. > > I don't see why not. You're basically just getting a resulting clock > > that's not an integral "Hz", right? > > > > So if "baud" is 9600 and sampling_rate is 16 then desired_clk is (9600 > > * 16) = 153600 > > > > Let's imagine that we do all the math and we finally decide that our > > best bet is with the rate 922000 and a divider of 6. That means that > > the actual clock we'll make is 153666.67 when we _wanted_ 153600. > > There's no reason it needs to be integral, though, and 153666.67 would > > still be better than making 160000. > > > Thank you for clarification. > >>> power?) > >> Actually power saving was the anticipation behind returning first > >> frequency in original patch, when we cant find exact frequency. > > Right, except that if you just pick the first clock you find it would > > be _wildly_ off. I guess if you really want to do this the right way, > > you need to set a maximum tolerance and pick the first rate you find > > that meets that tolerance. Random web search for "uart baud rate > > tolerance" makes me believe that +/- 5% deviation is OK, but to be > > safe you probably want something lower. Maybe 2%? So if the desired > > clock is within 2% of a clock you can make, can you just pick that > > one? > Ok, 2% seems good. > > > >>>> Please note that we go past cases when we have an divider that can > >>>> exactly divide the frequency(105/1, 204/2, 303/3) and end up with one > >>>> that doesn't. > >>> Ah, good point. Luckily that's a 1-line fix, right? > >> Apologies, I could not figure out how. > > Ah, sorry. Not quite 1 line, but this (untested) > > > > > > freq = clk_round_rate(clk, mult); > > > > if (freq % desired_clk == 0) { > > ser_clk = freq; > > best_div = freq / desired_clk; > > break; > > } > > > > candidate_div = max(1, DIV_ROUND_CLOSEST(freq, desired_clk)); > > candidate_freq = freq / candidate_div; > > diff = abs((long)desired_clk - candidate_freq); > > if (diff < best_diff) { > > best_diff = diff; > > ser_clk = freq; > > best_div = candidate_div; > > } > > But then once again, we would likely need 2 loops because while we are > ok with giving up on search for best_div on finding something within 2% > tolerance, we may not want to give up on exact match (freq % desired_clk > == 0 ) Ah, it took me a while to understand why two loops. It's because in one case you're trying multiplies and in the other you're bumping up to the next closest clock rate. I don't think you really need to do that. Just test the (rate - 2%) and the rate. How about this (only lightly tested): ser_clk = 0; maxdiv = CLK_DIV_MSK >> CLK_DIV_SHFT; div = 1; while (div < maxdiv) { mult = (unsigned long long)div * desired_clk; if (mult != (unsigned long)mult) break; two_percent = mult / 50; /* * Loop requesting (freq - 2%) and possibly (freq). * * We'll keep track of the lowest freq inexact match we found * but always try to find a perfect match. NOTE: this algorithm * could miss a slightly better freq if there's more than one * freq between (freq - 2%) and (freq) but (freq) can't be made * exactly, but that's OK. * * This absolutely relies on the fact that the Qualcomm clock * driver always rounds up. */ test_freq = mult - two_percent; while (test_freq <= mult) { freq = clk_round_rate(clk, test_freq); /* * A dead-on freq is an insta-win. This implicitly * handles when "freq == mult" */ if (!(freq % desired_clk)) { *clk_div = freq / desired_clk; return freq; } /* * Only time clock framework doesn't round up is if * we're past the max clock rate. We're done searching * if that's the case. */ if (freq < test_freq) return ser_clk; /* Save the first (lowest freq) within 2% */ if (!ser_clk && freq <= mult + two_percent) { ser_clk = freq; *clk_div = div; } /* * If we already rounded up past mult then this will * cause the loop to exit. If not then this will run * the loop a second time with exactly mult. */ test_freq = max(freq + 1, mult); } /* * test_freq will always be bigger than mult by at least 1. * That means we can get the next divider with a DIV_ROUND_UP. * This has the advantage of skipping by a whole bunch of divs * If the clock framework already bypassed them. */ div = DIV_ROUND_UP(test_freq, desired_clk); } return ser_clk;
Hi, On 6/9/2022 4:07 AM, Doug Anderson wrote: > Hi, > > On Wed, Jun 8, 2022 at 11:34 AM Vijaya Krishna Nivarthi > <quic_vnivarth@quicinc.com> wrote: >> Hi, >> >> >> On 6/8/2022 12:55 AM, Doug Anderson wrote: >>> Hi, >>> >>> On Tue, Jun 7, 2022 at 10:40 AM Vijaya Krishna Nivarthi >>> <quic_vnivarth@quicinc.com> wrote: >>> Ah, sorry. Not quite 1 line, but this (untested) >>> >>> freq = clk_round_rate(clk, mult); >>> >>> if (freq % desired_clk == 0) { >>> ser_clk = freq; >>> best_div = freq / desired_clk; >>> break; >>> } >>> >>> candidate_div = max(1, DIV_ROUND_CLOSEST(freq, desired_clk)); >>> candidate_freq = freq / candidate_div; >>> diff = abs((long)desired_clk - candidate_freq); >>> if (diff < best_diff) { >>> best_diff = diff; >>> ser_clk = freq; >>> best_div = candidate_div; >>> } >> But then once again, we would likely need 2 loops because while we are >> ok with giving up on search for best_div on finding something within 2% >> tolerance, we may not want to give up on exact match (freq % desired_clk >> == 0 ) > Ah, it took me a while to understand why two loops. It's because in > one case you're trying multiplies and in the other you're bumping up > to the next closest clock rate. I don't think you really need to do > that. Just test the (rate - 2%) and the rate. How about this (only > lightly tested): > > ser_clk = 0; > maxdiv = CLK_DIV_MSK >> CLK_DIV_SHFT; > div = 1; > while (div < maxdiv) { div <= maxdiv ? > mult = (unsigned long long)div * desired_clk; > if (mult != (unsigned long)mult) > break; > > two_percent = mult / 50; > > /* > * Loop requesting (freq - 2%) and possibly (freq). > * > * We'll keep track of the lowest freq inexact match we found > * but always try to find a perfect match. NOTE: this algorithm > * could miss a slightly better freq if there's more than one > * freq between (freq - 2%) and (freq) but (freq) can't be made > * exactly, but that's OK. > * > * This absolutely relies on the fact that the Qualcomm clock > * driver always rounds up. > */ > test_freq = mult - two_percent; > while (test_freq <= mult) { > freq = clk_round_rate(clk, test_freq); > > /* > * A dead-on freq is an insta-win. This implicitly > * handles when "freq == mult" > */ > if (!(freq % desired_clk)) { > *clk_div = freq / desired_clk; > return freq; > } > > /* > * Only time clock framework doesn't round up is if > * we're past the max clock rate. We're done searching > * if that's the case. > */ > if (freq < test_freq) > return ser_clk; > > /* Save the first (lowest freq) within 2% */ > if (!ser_clk && freq <= mult + two_percent) { > ser_clk = freq; > *clk_div = div; > } My last concern is with search happening only within 2% tolerance. Do we fail otherwise? This real case has best tolerance of 1.9%. [ 17.963672] 20220530 desired_clk-51200000 [ 21.193550] 20220530 returning ser_clk-52174000, div-1, diff-974000 Seems close. Thank you. > > /* > * If we already rounded up past mult then this will > * cause the loop to exit. If not then this will run > * the loop a second time with exactly mult. > */ > test_freq = max(freq + 1, mult); > } > > /* > * test_freq will always be bigger than mult by at least 1. > * That means we can get the next divider with a DIV_ROUND_UP. > * This has the advantage of skipping by a whole bunch of divs > * If the clock framework already bypassed them. > */ > div = DIV_ROUND_UP(test_freq, desired_clk); > } > > return ser_clk;
Hi, Re-sending as my earlier message bounced. On 6/9/2022 4:07 AM, Doug Anderson wrote: > Hi, > > On Wed, Jun 8, 2022 at 11:34 AM Vijaya Krishna Nivarthi > <quic_vnivarth@quicinc.com> wrote: >> Hi, >> >> >> On 6/8/2022 12:55 AM, Doug Anderson wrote: >>> Hi, >>> >>> On Tue, Jun 7, 2022 at 10:40 AM Vijaya Krishna Nivarthi >>> <quic_vnivarth@quicinc.com> wrote: >>>> Hi, >>>> >>>> On 6/7/2022 1:29 AM, Doug Anderson wrote: >>>> >>>> My only concern continues to be... >>>> >>>> Given ser_clk is the final frequency that this function is going to >>>> return and best_div is going to be the clk_divider, is it ok if the >>>> divider cant divide the frequency exactly? >>>> >>>> In other words, Can this function output combinations like (402,4) >>>> (501,5) ? >>>> >>>> If ok, then we can go ahead with this patch or even previous perhaps. >>> I don't see why not. You're basically just getting a resulting clock >>> that's not an integral "Hz", right? >>> >>> So if "baud" is 9600 and sampling_rate is 16 then desired_clk is (9600 >>> * 16) = 153600 >>> >>> Let's imagine that we do all the math and we finally decide that our >>> best bet is with the rate 922000 and a divider of 6. That means that >>> the actual clock we'll make is 153666.67 when we _wanted_ 153600. >>> There's no reason it needs to be integral, though, and 153666.67 would >>> still be better than making 160000. >>> >> Thank you for clarification. >>>>> power?) >>>> Actually power saving was the anticipation behind returning first >>>> frequency in original patch, when we cant find exact frequency. >>> Right, except that if you just pick the first clock you find it would >>> be _wildly_ off. I guess if you really want to do this the right way, >>> you need to set a maximum tolerance and pick the first rate you find >>> that meets that tolerance. Random web search for "uart baud rate >>> tolerance" makes me believe that +/- 5% deviation is OK, but to be >>> safe you probably want something lower. Maybe 2%? So if the desired >>> clock is within 2% of a clock you can make, can you just pick that >>> one? >> Ok, 2% seems good. >>>>>> Please note that we go past cases when we have an divider that can >>>>>> exactly divide the frequency(105/1, 204/2, 303/3) and end up with one >>>>>> that doesn't. >>>>> Ah, good point. Luckily that's a 1-line fix, right? >>>> Apologies, I could not figure out how. >>> Ah, sorry. Not quite 1 line, but this (untested) >>> >>> >>> freq = clk_round_rate(clk, mult); >>> >>> if (freq % desired_clk == 0) { >>> ser_clk = freq; >>> best_div = freq / desired_clk; >>> break; >>> } >>> >>> candidate_div = max(1, DIV_ROUND_CLOSEST(freq, desired_clk)); >>> candidate_freq = freq / candidate_div; >>> diff = abs((long)desired_clk - candidate_freq); >>> if (diff < best_diff) { >>> best_diff = diff; >>> ser_clk = freq; >>> best_div = candidate_div; >>> } >> But then once again, we would likely need 2 loops because while we are >> ok with giving up on search for best_div on finding something within 2% >> tolerance, we may not want to give up on exact match (freq % desired_clk >> == 0 ) > Ah, it took me a while to understand why two loops. It's because in > one case you're trying multiplies and in the other you're bumping up > to the next closest clock rate. I don't think you really need to do > that. Just test the (rate - 2%) and the rate. How about this (only > lightly tested): > > ser_clk = 0; > maxdiv = CLK_DIV_MSK >> CLK_DIV_SHFT; > div = 1; > while (div < maxdiv) { div <= maxdiv ? > mult = (unsigned long long)div * desired_clk; > if (mult != (unsigned long)mult) > break; > > two_percent = mult / 50; > > /* > * Loop requesting (freq - 2%) and possibly (freq). > * > * We'll keep track of the lowest freq inexact match we found > * but always try to find a perfect match. NOTE: this algorithm > * could miss a slightly better freq if there's more than one > * freq between (freq - 2%) and (freq) but (freq) can't be made > * exactly, but that's OK. > * > * This absolutely relies on the fact that the Qualcomm clock > * driver always rounds up. > */ > test_freq = mult - two_percent; > while (test_freq <= mult) { > freq = clk_round_rate(clk, test_freq); > > /* > * A dead-on freq is an insta-win. This implicitly > * handles when "freq == mult" > */ > if (!(freq % desired_clk)) { > *clk_div = freq / desired_clk; > return freq; > } > > /* > * Only time clock framework doesn't round up is if > * we're past the max clock rate. We're done searching > * if that's the case. > */ > if (freq < test_freq) > return ser_clk; > > /* Save the first (lowest freq) within 2% */ > if (!ser_clk && freq <= mult + two_percent) { > ser_clk = freq; > *clk_div = div; > } My last concern is with search happening only within 2% tolerance. Do we fail otherwise? This real case has best tolerance of 1.9%. [ 17.963672] 20220530 desired_clk-51200000 [ 21.193550] 20220530 returning ser_clk-52174000, div-1, diff-974000 Seems close. Thank you. > > /* > * If we already rounded up past mult then this will > * cause the loop to exit. If not then this will run > * the loop a second time with exactly mult. > */ > test_freq = max(freq + 1, mult); > } > > /* > * test_freq will always be bigger than mult by at least 1. > * That means we can get the next divider with a DIV_ROUND_UP. > * This has the advantage of skipping by a whole bunch of divs > * If the clock framework already bypassed them. > */ > div = DIV_ROUND_UP(test_freq, desired_clk); > } > > return ser_clk;
Hi, Re-sending (2nd attempt) as emails are bouncing... > > > > But then once again, we would likely need 2 loops because while we are > > ok with giving up on search for best_div on finding something within > > 2% tolerance, we may not want to give up on exact match (freq % > > desired_clk == 0 ) > > Ah, it took me a while to understand why two loops. It's because in one case > you're trying multiplies and in the other you're bumping up to the next > closest clock rate. I don't think you really need to do that. Just test the (rate - > 2%) and the rate. How about this (only lightly tested): > > ser_clk = 0; > maxdiv = CLK_DIV_MSK >> CLK_DIV_SHFT; > div = 1; > while (div < maxdiv) { div <= maxdiv ? > mult = (unsigned long long)div * desired_clk; > if (mult != (unsigned long)mult) > break; > > two_percent = mult / 50; > > /* > * Loop requesting (freq - 2%) and possibly (freq). > * > * We'll keep track of the lowest freq inexact match we found > * but always try to find a perfect match. NOTE: this algorithm > * could miss a slightly better freq if there's more than one > * freq between (freq - 2%) and (freq) but (freq) can't be made > * exactly, but that's OK. > * > * This absolutely relies on the fact that the Qualcomm clock > * driver always rounds up. > */ > test_freq = mult - two_percent; > while (test_freq <= mult) { > freq = clk_round_rate(clk, test_freq); > > /* > * A dead-on freq is an insta-win. This implicitly > * handles when "freq == mult" > */ > if (!(freq % desired_clk)) { > *clk_div = freq / desired_clk; > return freq; > } > > /* > * Only time clock framework doesn't round up is if > * we're past the max clock rate. We're done searching > * if that's the case. > */ > if (freq < test_freq) > return ser_clk; > > /* Save the first (lowest freq) within 2% */ > if (!ser_clk && freq <= mult + two_percent) { > ser_clk = freq; > *clk_div = div; > } My last concern is with search happening only within 2% tolerance. Do we fail otherwise? This real case has best tolerance of 1.9% and seems close. [ 17.963672] 20220530 desired_clk-51200000 [ 21.193550] 20220530 returning ser_clk-52174000, div-1, diff-974000 Perhaps we can fallback on 1st clock rate? Thank you. > > /* > * If we already rounded up past mult then this will > * cause the loop to exit. If not then this will run > * the loop a second time with exactly mult. > */ > test_freq = max(freq + 1, mult); > } > > /* > * test_freq will always be bigger than mult by at least 1. > * That means we can get the next divider with a DIV_ROUND_UP. > * This has the advantage of skipping by a whole bunch of divs > * If the clock framework already bypassed them. > */ > div = DIV_ROUND_UP(test_freq, desired_clk); > } > > return ser_clk;
Hi, On Fri, Jun 10, 2022 at 2:33 AM Vijaya Krishna Nivarthi (Temp) (QUIC) <quic_vnivarth@quicinc.com> wrote: > > Hi, > > Re-sending (2nd attempt) as emails are bouncing... > > > > > > > > But then once again, we would likely need 2 loops because while we are > > > ok with giving up on search for best_div on finding something within > > > 2% tolerance, we may not want to give up on exact match (freq % > > > desired_clk == 0 ) > > > > Ah, it took me a while to understand why two loops. It's because in one case > > you're trying multiplies and in the other you're bumping up to the next > > closest clock rate. I don't think you really need to do that. Just test the (rate - > > 2%) and the rate. How about this (only lightly tested): > > > > ser_clk = 0; > > maxdiv = CLK_DIV_MSK >> CLK_DIV_SHFT; > > div = 1; > > while (div < maxdiv) { > > > div <= maxdiv ? Ah, sure. > > mult = (unsigned long long)div * desired_clk; > > if (mult != (unsigned long)mult) > > break; > > > > two_percent = mult / 50; > > > > /* > > * Loop requesting (freq - 2%) and possibly (freq). > > * > > * We'll keep track of the lowest freq inexact match we found > > * but always try to find a perfect match. NOTE: this algorithm > > * could miss a slightly better freq if there's more than one > > * freq between (freq - 2%) and (freq) but (freq) can't be made > > * exactly, but that's OK. > > * > > * This absolutely relies on the fact that the Qualcomm clock > > * driver always rounds up. > > */ > > test_freq = mult - two_percent; > > while (test_freq <= mult) { > > freq = clk_round_rate(clk, test_freq); > > > > /* > > * A dead-on freq is an insta-win. This implicitly > > * handles when "freq == mult" > > */ > > if (!(freq % desired_clk)) { > > *clk_div = freq / desired_clk; > > return freq; > > } > > > > /* > > * Only time clock framework doesn't round up is if > > * we're past the max clock rate. We're done searching > > * if that's the case. > > */ > > if (freq < test_freq) > > return ser_clk; > > > > /* Save the first (lowest freq) within 2% */ > > if (!ser_clk && freq <= mult + two_percent) { > > ser_clk = freq; > > *clk_div = div; > > } > > My last concern is with search happening only within 2% tolerance. > Do we fail otherwise? > > This real case has best tolerance of 1.9% and seems close. > > [ 17.963672] 20220530 desired_clk-51200000 > [ 21.193550] 20220530 returning ser_clk-52174000, div-1, diff-974000 > > Perhaps we can fallback on 1st clock rate? I don't feel super comfortable just blindly falling back on the 1st clock rate. It could be wildly (more than 5%) wrong, can't it? IMO: * If you're not comfortable with 2%, you could always pick 3% or 4%. As I said, my random web search seemed to indicate that up to 5% was perhaps OK. * It's probably overkill, but you could abstract the whole search out and try searching once for 2% and then try 4%? -Doug
Hi, > > > > > > ser_clk = 0; > > > maxdiv = CLK_DIV_MSK >> CLK_DIV_SHFT; > > > div = 1; > > > while (div < maxdiv) { > > > > > > div <= maxdiv ? > > Ah, sure. Thank you. > > > > > mult = (unsigned long long)div * desired_clk; > > > if (mult != (unsigned long)mult) > > > break; > > > > > > two_percent = mult / 50; > > > > > > /* > > > * Loop requesting (freq - 2%) and possibly (freq). > > > * > > > * We'll keep track of the lowest freq inexact match we found > > > * but always try to find a perfect match. NOTE: this algorithm > > > * could miss a slightly better freq if there's more than one > > > * freq between (freq - 2%) and (freq) but (freq) can't be made > > > * exactly, but that's OK. > > > * > > > * This absolutely relies on the fact that the Qualcomm clock > > > * driver always rounds up. > > > */ > > > test_freq = mult - two_percent; > > > while (test_freq <= mult) { > > > freq = clk_round_rate(clk, test_freq); > > > > > > /* > > > * A dead-on freq is an insta-win. This implicitly > > > * handles when "freq == mult" > > > */ > > > if (!(freq % desired_clk)) { > > > *clk_div = freq / desired_clk; > > > return freq; > > > } > > > > > > /* > > > * Only time clock framework doesn't round up is if > > > * we're past the max clock rate. We're done searching > > > * if that's the case. > > > */ > > > if (freq < test_freq) > > > return ser_clk; > > > > > > /* Save the first (lowest freq) within 2% */ > > > if (!ser_clk && freq <= mult + two_percent) { > > > ser_clk = freq; > > > *clk_div = div; > > > } > > > > My last concern is with search happening only within 2% tolerance. > > Do we fail otherwise? > > > > This real case has best tolerance of 1.9% and seems close. > > > > [ 17.963672] 20220530 desired_clk-51200000 > > [ 21.193550] 20220530 returning ser_clk-52174000, div-1, diff-974000 > > > > Perhaps we can fallback on 1st clock rate? > > I don't feel super comfortable just blindly falling back on the 1st clock rate. It > could be wildly (more than 5%) wrong, can't it? > > IMO: > * If you're not comfortable with 2%, you could always pick 3% or 4%. > As I said, my random web search seemed to indicate that up to 5% was > perhaps OK. > * It's probably overkill, but you could abstract the whole search out and try > searching once for 2% and then try 4%? > Ok, I will implement a function that searches within an input tolerance. And have a conditional 2nd call to same with higher tolerance of 5%. This would mean that we will still run through 2 loops in some cases, but that’s ok. Thank you. > > -Doug
Hi, For desired_clk = 100 and clock rates like 1st from below, DIV_ROUND_UP seems to cause missing candidate solutions. static unsigned long clk_round_rate_test(struct clk *clk, unsigned long in_freq) { //unsigned long root_freq[] = {301, 702, 1004}; //unsigned long root_freq[] = {301, 702, 1004, 2000, 3000}; //unsigned long root_freq[] = {50, 97, 99}; //unsigned long root_freq[] = {50, 97, 99, 200}; //unsigned long root_freq[] = {92, 110, 193, 230}; //unsigned long root_freq[] = {92, 110, 193, 230, 300, 401}; //unsigned long root_freq[] = {92, 110, 193, 230, 295, 296, 297, 401}; //unsigned long root_freq[] = {92, 110, 193, 230, 295, 296, 297, 300, 401}; //unsigned long root_freq[] = {197, 198, 199}; unsigned long root_freq[] = {197, 198, 199, 200}; int i; size_t n = sizeof root_freq / sizeof *root_freq; for (i = 0; i < n; i++) { if (root_freq[i] >= in_freq) return root_freq[i]; } return root_freq[n-1]; } I modified to handle such cases, optimised little and uploaded a patch. It seems to work for all the cases like above. Thank you, Vijay/ > > Ah, it took me a while to understand why two loops. It's because in one case > you're trying multiplies and in the other you're bumping up to the next > closest clock rate. I don't think you really need to do that. Just test the (rate - > 2%) and the rate. How about this (only lightly tested): > > ser_clk = 0; > maxdiv = CLK_DIV_MSK >> CLK_DIV_SHFT; > div = 1; > while (div < maxdiv) { > mult = (unsigned long long)div * desired_clk; > if (mult != (unsigned long)mult) > break; > > two_percent = mult / 50; > > /* > * Loop requesting (freq - 2%) and possibly (freq). > * > * We'll keep track of the lowest freq inexact match we found > * but always try to find a perfect match. NOTE: this algorithm > * could miss a slightly better freq if there's more than one > * freq between (freq - 2%) and (freq) but (freq) can't be made > * exactly, but that's OK. > * > * This absolutely relies on the fact that the Qualcomm clock > * driver always rounds up. > */ > test_freq = mult - two_percent; > while (test_freq <= mult) { > freq = clk_round_rate(clk, test_freq); > > /* > * A dead-on freq is an insta-win. This implicitly > * handles when "freq == mult" > */ > if (!(freq % desired_clk)) { > *clk_div = freq / desired_clk; > return freq; > } > > /* > * Only time clock framework doesn't round up is if > * we're past the max clock rate. We're done searching > * if that's the case. > */ > if (freq < test_freq) > return ser_clk; > > /* Save the first (lowest freq) within 2% */ > if (!ser_clk && freq <= mult + two_percent) { > ser_clk = freq; > *clk_div = div; > } > > /* > * If we already rounded up past mult then this will > * cause the loop to exit. If not then this will run > * the loop a second time with exactly mult. > */ > test_freq = max(freq + 1, mult); > } > > /* > * test_freq will always be bigger than mult by at least 1. > * That means we can get the next divider with a DIV_ROUND_UP. > * This has the advantage of skipping by a whole bunch of divs > * If the clock framework already bypassed them. > */ > div = DIV_ROUND_UP(test_freq, desired_clk); > } > > return ser_clk;
Hi, On Tue, Jun 21, 2022 at 10:58 AM Vijaya Krishna Nivarthi (Temp) <vnivarth@qti.qualcomm.com> wrote: > > Hi, > > For desired_clk = 100 and clock rates like 1st from below, DIV_ROUND_UP seems to cause missing candidate solutions. > > static unsigned long clk_round_rate_test(struct clk *clk, unsigned long in_freq) > { > //unsigned long root_freq[] = {301, 702, 1004}; > //unsigned long root_freq[] = {301, 702, 1004, 2000, 3000}; > //unsigned long root_freq[] = {50, 97, 99}; > //unsigned long root_freq[] = {50, 97, 99, 200}; > //unsigned long root_freq[] = {92, 110, 193, 230}; > //unsigned long root_freq[] = {92, 110, 193, 230, 300, 401}; > //unsigned long root_freq[] = {92, 110, 193, 230, 295, 296, 297, 401}; > //unsigned long root_freq[] = {92, 110, 193, 230, 295, 296, 297, 300, 401}; > //unsigned long root_freq[] = {197, 198, 199}; > unsigned long root_freq[] = {197, 198, 199, 200}; > int i; > size_t n = sizeof root_freq / sizeof *root_freq; > > for (i = 0; i < n; i++) { > if (root_freq[i] >= in_freq) > return root_freq[i]; > } > return root_freq[n-1]; > } > > I modified to handle such cases, optimised little and uploaded a patch. > It seems to work for all the cases like above. I think it would have been simpler to just change this little section: /* Save the first (lowest freq) within 2% */ actual_mult = DIV_ROUND_CLOSEST(freq, desired_clk) * desired_clk; if (!ser_clk && freq <= actual_mult + two_percent) { ser_clk = freq; *clk_div = div; } That was the only bug, right? Then you could keep the DIV_ROUND_UP() solution? -Doug
diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c index 4733a23..08f3ad4 100644 --- a/drivers/tty/serial/qcom_geni_serial.c +++ b/drivers/tty/serial/qcom_geni_serial.c @@ -943,11 +943,11 @@ static int qcom_geni_serial_startup(struct uart_port *uport) static unsigned long get_clk_div_rate(struct clk *clk, unsigned int baud, unsigned int sampling_rate, unsigned int *clk_div) { - unsigned long ser_clk; + unsigned long ser_clk = 0; unsigned long desired_clk; unsigned long freq, prev; unsigned long div, maxdiv; - int64_t mult; + unsigned long long mult; desired_clk = baud * sampling_rate; if (!desired_clk) { @@ -959,8 +959,8 @@ static unsigned long get_clk_div_rate(struct clk *clk, unsigned int baud, prev = 0; for (div = 1; div <= maxdiv; div++) { - mult = div * desired_clk; - if (mult > ULONG_MAX) + mult = (unsigned long long)div * (unsigned long long)desired_clk; + if (mult > (unsigned long long)ULONG_MAX) break; freq = clk_round_rate(clk, (unsigned long)mult);
Add missing initialisation and correct type casting Signed-off-by: Vijaya Krishna Nivarthi <quic_vnivarth@quicinc.com> --- drivers/tty/serial/qcom_geni_serial.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)