diff mbox series

[v4,4/4] pwm: meson: make full use of common clock framework

Message ID cb79d313-c7a2-42e9-639a-63cb5366521a@gmail.com (mailing list archive)
State Superseded
Headers show
Series pwm: meson: make full use of common clock framework | expand

Commit Message

Heiner Kallweit April 13, 2023, 5:54 a.m. UTC
Newer versions of the PWM block use a core clock with external mux,
divider, and gate. These components either don't exist any longer in
the PWM block, or they are bypassed.
To minimize needed changes for supporting the new version, the internal
divider and gate should be handled by CCF too.

I didn't see a good way to split the patch, therefore it's somewhat
bigger. What it does:

- The internal mux is handled by CCF already. Register also internal
  divider and gate with CCF, so that we have one representation of the
  input clock: [mux] parent of [divider] parent of [gate]
  
- Now that CCF selects an appropriate mux parent, we don't need the
  DT-provided default parent any longer. Accordingly we can also omit
  setting the mux parent directly in the driver.
  
- Instead of manually handling the pre-div divider value, let CCF
  set the input clock. Targeted input clock frequency is
  0xffff * 1/period for best precision.
  
- For the "inverted pwm disabled" scenario target an input clock
  frequency of 1GHz. This ensures that the remaining low pulses
  have minimum length.

I don't have hw with the old PWM block, therefore I couldn't test this
patch. With the not yet included extension for the new PWM block
(channel->clk coming directly from get_clk(external_clk)) I didn't
notice any problem. My system uses PWM for the CPU voltage regulator
and for the SDIO 32kHz clock.

Note: The clock gate in the old PWM block is permanently disabled.
This seems to indicate that it's not used by the new PWM block.

Tested-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
Changes to RFT/RFC version:
- use parent_hws instead of parent_names for div/gate clock
- use devm_clk_hw_register where the struct clk * returned by
  devm_clk_register isn't needed

v2:
- add patch 1
- add patch 3
- switch to using clk_parent_data in all relevant places
v3:
- add flag CLK_IGNORE_UNUSED
v4:
- remove variable tmp in meson_pwm_get_state
- don't use deprecated function devm_clk_register
---
 drivers/pwm/pwm-meson.c | 142 +++++++++++++++++++++++-----------------
 1 file changed, 83 insertions(+), 59 deletions(-)

Comments

Martin Blumenstingl April 14, 2023, 7:39 p.m. UTC | #1
Hello Heiner,

On Thu, Apr 13, 2023 at 7:55 AM Heiner Kallweit <hkallweit1@gmail.com> wrote:
[...]
> Tested-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Unfortunately I have some bad news and I need to take back my Tested-by :-(
Previously my test was: cycle through all available CPU frequencies
while stressing the CPU.
My assumption was: if the system doesn't lock up everything's fine
because we have a high enough voltage.

This evening however I got a memory corruption error while trying to
log in via UART - which I thought was strange.
So I connected my logic analyzer to my Odroid-C1 and did some experiments:

period = 30518, duty cycle = 15259 (typically used for the 32kHz
output to the SDIO wifi chip)
before your patches / after applying your patches:
PWM: duty cycle: 50.000000% / 50.000000%
PWM: period: 30.6 µs / 30.5 µs
Timing: Time: 15.292 µs (65.395 kHz) / 15.250 µs (65.574 kHz)
Timing: Average: 15.296 µs (65.377 kHz) / 15.264 µs (65.513 kHz)
driver debug messages with your patches applied:
fin_freq: 850000000 Hz
period=30518 cnt=25940
duty=15259 duty_cnt=12970

Then I tried period = 12218, duty cycle = 0 (typically used for the
highest CPU voltage):
before your patches / after applying your patches:
PWM: duty cycle: 0.338983% / n/a (constant low output)
PWM: period: 12.3 µs / n/a
Timing: Time: 12.250 µs (81.633 kHz) / n/a
Timing: Average: 6.148 µs (162.668 kHz) / n/a
driver debug messages with your patches applied:
fin_freq: 850000000 Hz
period=12218 cnt=10385

Finally I tried period = 12218, duty cycle = 12218 (typically used for
the lowest CPU voltage):
before your patches / after applying your patches:
PWM: duty cycle: 99.661017% / n/a (constant low output)
PWM: period: 12.3 µs / n/a
Timing: Time: 12.250 µs (81.633 kHz) / n/a
Timing: Average: 6.148 µs (162.668 kHz) / n/a
driver debug messages with your patches applied:
fin_freq: 850000000 Hz
period=12218 cnt=10385

After seeing the constant low output with period 12218 I realized that
my previous test was no good: the CPU was fed the highest possible
voltage all the time.
It's not clear to me why period 12218 would give no PWM output at all
while period 30518 works fine.
I did an experiment by removing CLK_SET_RATE_PARENT from the divider's
init.flags -> now XTAL (24MHz) is the only possible clock (it's the
hardware default). It does indeed bring back the exact same results as
before (where the XTAL clock was also used; with the changes from this
series FCLK_DIV3 is now chosen, which runs at 850MHz).

Do you have any idea what could cause this?
The FCLK_DIV3 input seems to work as otherwise period 30518 would also not work.
The calculated values also look sane, so it's not that we have some
32-bit overflow (as I'm testing on a 32-bit Meson8b SoC).


Best regards,
Martin
Heiner Kallweit April 15, 2023, 6:39 a.m. UTC | #2
On 14.04.2023 21:39, Martin Blumenstingl wrote:
> Hello Heiner,
> 
> On Thu, Apr 13, 2023 at 7:55 AM Heiner Kallweit <hkallweit1@gmail.com> wrote:
> [...]
>> Tested-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
> Unfortunately I have some bad news and I need to take back my Tested-by :-(
> Previously my test was: cycle through all available CPU frequencies
> while stressing the CPU.
> My assumption was: if the system doesn't lock up everything's fine
> because we have a high enough voltage.
> 
> This evening however I got a memory corruption error while trying to
> log in via UART - which I thought was strange.
> So I connected my logic analyzer to my Odroid-C1 and did some experiments:
> 
> period = 30518, duty cycle = 15259 (typically used for the 32kHz
> output to the SDIO wifi chip)
> before your patches / after applying your patches:
> PWM: duty cycle: 50.000000% / 50.000000%
> PWM: period: 30.6 µs / 30.5 µs
> Timing: Time: 15.292 µs (65.395 kHz) / 15.250 µs (65.574 kHz)
> Timing: Average: 15.296 µs (65.377 kHz) / 15.264 µs (65.513 kHz)
> driver debug messages with your patches applied:
> fin_freq: 850000000 Hz
> period=30518 cnt=25940
> duty=15259 duty_cnt=12970
> 
> Then I tried period = 12218, duty cycle = 0 (typically used for the
> highest CPU voltage):
> before your patches / after applying your patches:
> PWM: duty cycle: 0.338983% / n/a (constant low output)
> PWM: period: 12.3 µs / n/a
> Timing: Time: 12.250 µs (81.633 kHz) / n/a
> Timing: Average: 6.148 µs (162.668 kHz) / n/a
> driver debug messages with your patches applied:
> fin_freq: 850000000 Hz
> period=12218 cnt=10385
> 
With a 850MHz input clock we should see a 0.01% duty cycle with 1.2ns
clock pulses. Can we rule out an issue with the measuring equipment?
Is your logic analyzer able to display such short clock pulses?

> Finally I tried period = 12218, duty cycle = 12218 (typically used for
> the lowest CPU voltage):
> before your patches / after applying your patches:
> PWM: duty cycle: 99.661017% / n/a (constant low output)
> PWM: period: 12.3 µs / n/a
> Timing: Time: 12.250 µs (81.633 kHz) / n/a
> Timing: Average: 6.148 µs (162.668 kHz) / n/a
> driver debug messages with your patches applied:
> fin_freq: 850000000 Hz
> period=12218 cnt=10385
> 
Here I have no idea yet.

> After seeing the constant low output with period 12218 I realized that
> my previous test was no good: the CPU was fed the highest possible
> voltage all the time.
> It's not clear to me why period 12218 would give no PWM output at all
> while period 30518 works fine.
> I did an experiment by removing CLK_SET_RATE_PARENT from the divider's
> init.flags -> now XTAL (24MHz) is the only possible clock (it's the
> hardware default). It does indeed bring back the exact same results as
> before (where the XTAL clock was also used; with the changes from this
> series FCLK_DIV3 is now chosen, which runs at 850MHz).
> 
> Do you have any idea what could cause this?
> The FCLK_DIV3 input seems to work as otherwise period 30518 would also not work.
> The calculated values also look sane, so it's not that we have some
> 32-bit overflow (as I'm testing on a 32-bit Meson8b SoC).
> 
At first I'd like to verify that the registers have the expected values.
Can you provide the values of PWM_A/B (depending on which channel is used in your
case) and PWM_MISC_AB at the end of meson_pwm_enable()? Thanks!

> 
> Best regards,
> Martin

Heiner
Martin Blumenstingl April 16, 2023, 7:26 p.m. UTC | #3
Hi Heiner,

On Sat, Apr 15, 2023 at 8:39 AM Heiner Kallweit <hkallweit1@gmail.com> wrote:
>
> On 14.04.2023 21:39, Martin Blumenstingl wrote:
> > Hello Heiner,
> >
> > On Thu, Apr 13, 2023 at 7:55 AM Heiner Kallweit <hkallweit1@gmail.com> wrote:
> > [...]
> >> Tested-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
> > Unfortunately I have some bad news and I need to take back my Tested-by :-(
> > Previously my test was: cycle through all available CPU frequencies
> > while stressing the CPU.
> > My assumption was: if the system doesn't lock up everything's fine
> > because we have a high enough voltage.
> >
> > This evening however I got a memory corruption error while trying to
> > log in via UART - which I thought was strange.
> > So I connected my logic analyzer to my Odroid-C1 and did some experiments:
> >
> > period = 30518, duty cycle = 15259 (typically used for the 32kHz
> > output to the SDIO wifi chip)
> > before your patches / after applying your patches:
> > PWM: duty cycle: 50.000000% / 50.000000%
> > PWM: period: 30.6 µs / 30.5 µs
> > Timing: Time: 15.292 µs (65.395 kHz) / 15.250 µs (65.574 kHz)
> > Timing: Average: 15.296 µs (65.377 kHz) / 15.264 µs (65.513 kHz)
> > driver debug messages with your patches applied:
> > fin_freq: 850000000 Hz
> > period=30518 cnt=25940
> > duty=15259 duty_cnt=12970
> >
> > Then I tried period = 12218, duty cycle = 0 (typically used for the
> > highest CPU voltage):
> > before your patches / after applying your patches:
> > PWM: duty cycle: 0.338983% / n/a (constant low output)
> > PWM: period: 12.3 µs / n/a
> > Timing: Time: 12.250 µs (81.633 kHz) / n/a
> > Timing: Average: 6.148 µs (162.668 kHz) / n/a
> > driver debug messages with your patches applied:
> > fin_freq: 850000000 Hz
> > period=12218 cnt=10385
> >
> With a 850MHz input clock we should see a 0.01% duty cycle with 1.2ns
> clock pulses. Can we rule out an issue with the measuring equipment?
> Is your logic analyzer able to display such short clock pulses?
Oh, you're right: my logic analyzer maxes out at 24MHz (~42ns).
So we can ignore this case.

> > Finally I tried period = 12218, duty cycle = 12218 (typically used for
> > the lowest CPU voltage):
> > before your patches / after applying your patches:
> > PWM: duty cycle: 99.661017% / n/a (constant low output)
I have to correct myself: for this case my logic analyzer sees a:
constant high signal

> > PWM: period: 12.3 µs / n/a
> > Timing: Time: 12.250 µs (81.633 kHz) / n/a
> > Timing: Average: 6.148 µs (162.668 kHz) / n/a
> > driver debug messages with your patches applied:
> > fin_freq: 850000000 Hz
> > period=12218 cnt=10385
> >
> Here I have no idea yet.
[...]
> At first I'd like to verify that the registers have the expected values.
> Can you provide the values of PWM_A/B (depending on which channel is used in your
> case) and PWM_MISC_AB at the end of meson_pwm_enable()? Thanks!
I'm testing with PWM_B and I get:
  REG_MISC_AB = 0x008000c2, channel reg = 0x28910000

This register value looks correct to me.

This is now my last line in meson_pwm_enable() in case you want to
sanity-check what I did:
dev_err(meson->chip.dev, "REG_MISC_AB = 0x%08x, channel reg = 0x%08x",
value, readl(meson->base + channel_data->reg_offset));


Best regards,
Martin
Heiner Kallweit April 16, 2023, 9:34 p.m. UTC | #4
On 16.04.2023 21:26, Martin Blumenstingl wrote:
> Hi Heiner,
> 
> On Sat, Apr 15, 2023 at 8:39 AM Heiner Kallweit <hkallweit1@gmail.com> wrote:
>>
>> On 14.04.2023 21:39, Martin Blumenstingl wrote:
>>> Hello Heiner,
>>>
>>> On Thu, Apr 13, 2023 at 7:55 AM Heiner Kallweit <hkallweit1@gmail.com> wrote:
>>> [...]
>>>> Tested-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
>>> Unfortunately I have some bad news and I need to take back my Tested-by :-(
>>> Previously my test was: cycle through all available CPU frequencies
>>> while stressing the CPU.
>>> My assumption was: if the system doesn't lock up everything's fine
>>> because we have a high enough voltage.
>>>
>>> This evening however I got a memory corruption error while trying to
>>> log in via UART - which I thought was strange.
>>> So I connected my logic analyzer to my Odroid-C1 and did some experiments:
>>>
>>> period = 30518, duty cycle = 15259 (typically used for the 32kHz
>>> output to the SDIO wifi chip)
>>> before your patches / after applying your patches:
>>> PWM: duty cycle: 50.000000% / 50.000000%
>>> PWM: period: 30.6 µs / 30.5 µs
>>> Timing: Time: 15.292 µs (65.395 kHz) / 15.250 µs (65.574 kHz)
>>> Timing: Average: 15.296 µs (65.377 kHz) / 15.264 µs (65.513 kHz)
>>> driver debug messages with your patches applied:
>>> fin_freq: 850000000 Hz
>>> period=30518 cnt=25940
>>> duty=15259 duty_cnt=12970
>>>
>>> Then I tried period = 12218, duty cycle = 0 (typically used for the
>>> highest CPU voltage):
>>> before your patches / after applying your patches:
>>> PWM: duty cycle: 0.338983% / n/a (constant low output)
>>> PWM: period: 12.3 µs / n/a
>>> Timing: Time: 12.250 µs (81.633 kHz) / n/a
>>> Timing: Average: 6.148 µs (162.668 kHz) / n/a
>>> driver debug messages with your patches applied:
>>> fin_freq: 850000000 Hz
>>> period=12218 cnt=10385
>>>
>> With a 850MHz input clock we should see a 0.01% duty cycle with 1.2ns
>> clock pulses. Can we rule out an issue with the measuring equipment?
>> Is your logic analyzer able to display such short clock pulses?
> Oh, you're right: my logic analyzer maxes out at 24MHz (~42ns).
> So we can ignore this case.
> 
>>> Finally I tried period = 12218, duty cycle = 12218 (typically used for
>>> the lowest CPU voltage):
>>> before your patches / after applying your patches:
>>> PWM: duty cycle: 99.661017% / n/a (constant low output)
> I have to correct myself: for this case my logic analyzer sees a:
> constant high signal
> 
So conclusion is that the PWM output is as expected? If yes, then the
memory corruption you saw supposedly had another root cause?
Eventually your Tested-by could be re-instantiated?

>>> PWM: period: 12.3 µs / n/a
>>> Timing: Time: 12.250 µs (81.633 kHz) / n/a
>>> Timing: Average: 6.148 µs (162.668 kHz) / n/a
>>> driver debug messages with your patches applied:
>>> fin_freq: 850000000 Hz
>>> period=12218 cnt=10385
>>>
>> Here I have no idea yet.
> [...]
>> At first I'd like to verify that the registers have the expected values.
>> Can you provide the values of PWM_A/B (depending on which channel is used in your
>> case) and PWM_MISC_AB at the end of meson_pwm_enable()? Thanks!
> I'm testing with PWM_B and I get:
>   REG_MISC_AB = 0x008000c2, channel reg = 0x28910000
> 
> This register value looks correct to me.
> 
To me as well.

> This is now my last line in meson_pwm_enable() in case you want to
> sanity-check what I did:
> dev_err(meson->chip.dev, "REG_MISC_AB = 0x%08x, channel reg = 0x%08x",
> value, readl(meson->base + channel_data->reg_offset));
> 
> 
> Best regards,
> Martin

Heiner
Neil Armstrong April 17, 2023, 7:23 a.m. UTC | #5
On 13/04/2023 07:54, Heiner Kallweit wrote:
> Newer versions of the PWM block use a core clock with external mux,
> divider, and gate. These components either don't exist any longer in
> the PWM block, or they are bypassed.
> To minimize needed changes for supporting the new version, the internal
> divider and gate should be handled by CCF too.
> 
> I didn't see a good way to split the patch, therefore it's somewhat
> bigger. What it does:
> 
> - The internal mux is handled by CCF already. Register also internal
>    divider and gate with CCF, so that we have one representation of the
>    input clock: [mux] parent of [divider] parent of [gate]
>    
> - Now that CCF selects an appropriate mux parent, we don't need the
>    DT-provided default parent any longer. Accordingly we can also omit
>    setting the mux parent directly in the driver.
>    
> - Instead of manually handling the pre-div divider value, let CCF
>    set the input clock. Targeted input clock frequency is
>    0xffff * 1/period for best precision.
>    
> - For the "inverted pwm disabled" scenario target an input clock
>    frequency of 1GHz. This ensures that the remaining low pulses
>    have minimum length.
> 
> I don't have hw with the old PWM block, therefore I couldn't test this
> patch. With the not yet included extension for the new PWM block
> (channel->clk coming directly from get_clk(external_clk)) I didn't
> notice any problem. My system uses PWM for the CPU voltage regulator
> and for the SDIO 32kHz clock.
> 
> Note: The clock gate in the old PWM block is permanently disabled.
> This seems to indicate that it's not used by the new PWM block.
> 
> Tested-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
> ---
> Changes to RFT/RFC version:
> - use parent_hws instead of parent_names for div/gate clock
> - use devm_clk_hw_register where the struct clk * returned by
>    devm_clk_register isn't needed
> 
> v2:
> - add patch 1
> - add patch 3
> - switch to using clk_parent_data in all relevant places
> v3:
> - add flag CLK_IGNORE_UNUSED
> v4:
> - remove variable tmp in meson_pwm_get_state
> - don't use deprecated function devm_clk_register
> ---
>   drivers/pwm/pwm-meson.c | 142 +++++++++++++++++++++++-----------------
>   1 file changed, 83 insertions(+), 59 deletions(-)
> 
> diff --git a/drivers/pwm/pwm-meson.c b/drivers/pwm/pwm-meson.c
> index 40a8709ff..80ac71cbc 100644
> --- a/drivers/pwm/pwm-meson.c
> +++ b/drivers/pwm/pwm-meson.c
> @@ -51,7 +51,7 @@
>   #define REG_MISC_AB		0x8
>   #define MISC_B_CLK_EN		23
>   #define MISC_A_CLK_EN		15
> -#define MISC_CLK_DIV_MASK	0x7f
> +#define MISC_CLK_DIV_WIDTH	7
>   #define MISC_B_CLK_DIV_SHIFT	16
>   #define MISC_A_CLK_DIV_SHIFT	8
>   #define MISC_B_CLK_SEL_SHIFT	6
> @@ -87,12 +87,13 @@ static struct meson_pwm_channel_data {
>   };
>   
>   struct meson_pwm_channel {
> +	unsigned long rate;
>   	unsigned int hi;
>   	unsigned int lo;
> -	u8 pre_div;
>   
> -	struct clk *clk_parent;
>   	struct clk_mux mux;
> +	struct clk_divider div;
> +	struct clk_gate gate;
>   	struct clk *clk;
>   };
>   
> @@ -125,16 +126,6 @@ static int meson_pwm_request(struct pwm_chip *chip, struct pwm_device *pwm)
>   	struct device *dev = chip->dev;
>   	int err;
>   
> -	if (channel->clk_parent) {
> -		err = clk_set_parent(channel->clk, channel->clk_parent);
> -		if (err < 0) {
> -			dev_err(dev, "failed to set parent %s for %s: %d\n",
> -				__clk_get_name(channel->clk_parent),
> -				__clk_get_name(channel->clk), err);
> -			return err;
> -		}
> -	}
> -
>   	err = clk_prepare_enable(channel->clk);
>   	if (err < 0) {
>   		dev_err(dev, "failed to enable clock %s: %d\n",
> @@ -157,8 +148,9 @@ static int meson_pwm_calc(struct meson_pwm *meson, struct pwm_device *pwm,
>   			  const struct pwm_state *state)
>   {
>   	struct meson_pwm_channel *channel = &meson->channels[pwm->hwpwm];
> -	unsigned int duty, period, pre_div, cnt, duty_cnt;
> +	unsigned int duty, period, cnt, duty_cnt;
>   	unsigned long fin_freq;
> +	u64 freq;
>   
>   	duty = state->duty_cycle;
>   	period = state->period;
> @@ -166,7 +158,11 @@ static int meson_pwm_calc(struct meson_pwm *meson, struct pwm_device *pwm,
>   	if (state->polarity == PWM_POLARITY_INVERSED)
>   		duty = period - duty;
>   
> -	fin_freq = clk_get_rate(channel->clk);
> +	freq = div64_u64(NSEC_PER_SEC * (u64)0xffff, period);
> +	if (freq > ULONG_MAX)
> +		freq = ULONG_MAX;
> +
> +	fin_freq = clk_round_rate(channel->clk, freq);
>   	if (fin_freq == 0) {
>   		dev_err(meson->chip.dev, "invalid source clock frequency\n");
>   		return -EINVAL;
> @@ -174,46 +170,35 @@ static int meson_pwm_calc(struct meson_pwm *meson, struct pwm_device *pwm,
>   
>   	dev_dbg(meson->chip.dev, "fin_freq: %lu Hz\n", fin_freq);
>   
> -	pre_div = div64_u64(fin_freq * (u64)period, NSEC_PER_SEC * 0xffffLL);
> -	if (pre_div > MISC_CLK_DIV_MASK) {
> -		dev_err(meson->chip.dev, "unable to get period pre_div\n");
> -		return -EINVAL;
> -	}
> -
> -	cnt = div64_u64(fin_freq * (u64)period, NSEC_PER_SEC * (pre_div + 1));
> +	cnt = div64_u64(fin_freq * (u64)period, NSEC_PER_SEC);
>   	if (cnt > 0xffff) {
>   		dev_err(meson->chip.dev, "unable to get period cnt\n");
>   		return -EINVAL;
>   	}
>   
> -	dev_dbg(meson->chip.dev, "period=%u pre_div=%u cnt=%u\n", period,
> -		pre_div, cnt);
> +	dev_dbg(meson->chip.dev, "period=%u cnt=%u\n", period, cnt);
>   
>   	if (duty == period) {
> -		channel->pre_div = pre_div;
>   		channel->hi = cnt;
>   		channel->lo = 0;
>   	} else if (duty == 0) {
> -		channel->pre_div = pre_div;
>   		channel->hi = 0;
>   		channel->lo = cnt;
>   	} else {
> -		/* Then check is we can have the duty with the same pre_div */
> -		duty_cnt = div64_u64(fin_freq * (u64)duty,
> -				     NSEC_PER_SEC * (pre_div + 1));
> +		duty_cnt = div64_u64(fin_freq * (u64)duty, NSEC_PER_SEC);
>   		if (duty_cnt > 0xffff) {
>   			dev_err(meson->chip.dev, "unable to get duty cycle\n");
>   			return -EINVAL;
>   		}
>   
> -		dev_dbg(meson->chip.dev, "duty=%u pre_div=%u duty_cnt=%u\n",
> -			duty, pre_div, duty_cnt);
> +		dev_dbg(meson->chip.dev, "duty=%u duty_cnt=%u\n", duty, duty_cnt);
>   
> -		channel->pre_div = pre_div;
>   		channel->hi = duty_cnt;
>   		channel->lo = cnt - duty_cnt;
>   	}
>   
> +	channel->rate = fin_freq;
> +
>   	return 0;
>   }
>   
> @@ -223,16 +208,15 @@ static void meson_pwm_enable(struct meson_pwm *meson, struct pwm_device *pwm)
>   	struct meson_pwm_channel_data *channel_data;
>   	unsigned long flags;
>   	u32 value;
> +	int err;
>   
>   	channel_data = &meson_pwm_per_channel_data[pwm->hwpwm];
>   
> -	spin_lock_irqsave(&meson->lock, flags);
> +	err = clk_set_rate(channel->clk, channel->rate);
> +	if (err)
> +		dev_err(meson->chip.dev, "setting clock rate failed\n");
>   
> -	value = readl(meson->base + REG_MISC_AB);
> -	value &= ~(MISC_CLK_DIV_MASK << channel_data->clk_div_shift);
> -	value |= channel->pre_div << channel_data->clk_div_shift;
> -	value |= BIT(channel_data->clk_en_bit);
> -	writel(value, meson->base + REG_MISC_AB);
> +	spin_lock_irqsave(&meson->lock, flags);
>   
>   	value = FIELD_PREP(PWM_HIGH_MASK, channel->hi) |
>   		FIELD_PREP(PWM_LOW_MASK, channel->lo);
> @@ -271,16 +255,16 @@ static int meson_pwm_apply(struct pwm_chip *chip, struct pwm_device *pwm,
>   			/*
>   			 * This IP block revision doesn't have an "always high"
>   			 * setting which we can use for "inverted disabled".
> -			 * Instead we achieve this using the same settings
> -			 * that we use a pre_div of 0 (to get the shortest
> -			 * possible duration for one "count") and
> +			 * Instead we achieve this by setting an arbitrary,
> +			 * very high frequency, resulting in the shortest
> +			 * possible duration for one "count" and
>   			 * "period == duty_cycle". This results in a signal
>   			 * which is LOW for one "count", while being HIGH for
>   			 * the rest of the (so the signal is HIGH for slightly
>   			 * less than 100% of the period, but this is the best
>   			 * we can achieve).
>   			 */
> -			channel->pre_div = 0;
> +			channel->rate = 1000000000;
>   			channel->hi = ~0;
>   			channel->lo = 0;

This looks like a really bad idea... please don't do that and instead introduce
some pinctrl states where we set the PWM pin as GPIO mode high/low state like we
did for SPI:
https://lore.kernel.org/r/20221004-up-aml-fix-spi-v4-2-0342d8e10c49@baylibre.com

>   
> @@ -305,7 +289,6 @@ static unsigned int meson_pwm_cnt_to_ns(struct pwm_chip *chip,
>   	struct meson_pwm *meson = to_meson_pwm(chip);
>   	struct meson_pwm_channel *channel;
>   	unsigned long fin_freq;
> -	u32 fin_ns;
>   
>   	/* to_meson_pwm() can only be used after .get_state() is called */
>   	channel = &meson->channels[pwm->hwpwm];
> @@ -314,9 +297,7 @@ static unsigned int meson_pwm_cnt_to_ns(struct pwm_chip *chip,
>   	if (fin_freq == 0)
>   		return 0;
>   
> -	fin_ns = div_u64(NSEC_PER_SEC, fin_freq);
> -
> -	return cnt * fin_ns * (channel->pre_div + 1);
> +	return div_u64(NSEC_PER_SEC * (u64)cnt, fin_freq);
>   }
>   
>   static int meson_pwm_get_state(struct pwm_chip *chip, struct pwm_device *pwm,
> @@ -325,7 +306,7 @@ static int meson_pwm_get_state(struct pwm_chip *chip, struct pwm_device *pwm,
>   	struct meson_pwm *meson = to_meson_pwm(chip);
>   	struct meson_pwm_channel_data *channel_data;
>   	struct meson_pwm_channel *channel;
> -	u32 value, tmp;
> +	u32 value;
>   
>   	if (!state)
>   		return 0;
> @@ -335,11 +316,8 @@ static int meson_pwm_get_state(struct pwm_chip *chip, struct pwm_device *pwm,
>   
>   	value = readl(meson->base + REG_MISC_AB);
>   
> -	tmp = BIT(channel_data->pwm_en_bit) | BIT(channel_data->clk_en_bit);
> -	state->enabled = (value & tmp) == tmp;
> -
> -	tmp = value >> channel_data->clk_div_shift;
> -	channel->pre_div = FIELD_GET(MISC_CLK_DIV_MASK, tmp);
> +	state->enabled = __clk_is_enabled(channel->clk) &&
> +			 value & BIT(channel_data->pwm_en_bit);
>   
>   	value = readl(meson->base + channel_data->reg_offset);
>   
> @@ -480,6 +458,7 @@ static int meson_pwm_init_channels(struct meson_pwm *meson)
>   
>   	for (i = 0; i < meson->chip.npwm; i++) {
>   		struct meson_pwm_channel *channel = &meson->channels[i];
> +		struct clk_parent_data div_parent = {}, gate_parent = {};
>   		struct clk_init_data init = {};
>   
>   		snprintf(name, sizeof(name), "%s#mux%u", dev_name(dev), i);
> @@ -499,18 +478,63 @@ static int meson_pwm_init_channels(struct meson_pwm *meson)
>   		channel->mux.table = NULL;
>   		channel->mux.hw.init = &init;
>   
> -		channel->clk = devm_clk_register(dev, &channel->mux.hw);
> -		if (IS_ERR(channel->clk)) {
> -			err = PTR_ERR(channel->clk);
> +		err = devm_clk_hw_register(dev, &channel->mux.hw);
> +		if (err) {
>   			dev_err(dev, "failed to register %s: %d\n", name, err);
>   			return err;
>   		}
>   
> -		snprintf(name, sizeof(name), "clkin%u", i);
> +		snprintf(name, sizeof(name), "%s#div%u", dev_name(dev), i);
> +
> +		init.name = name;
> +		init.ops = &clk_divider_ops;
> +		init.flags = CLK_SET_RATE_PARENT;
> +		div_parent.index = -1;
> +		div_parent.hw = &channel->mux.hw;
> +		init.parent_data = &div_parent;
> +		init.num_parents = 1;
> +
> +		channel->div.reg = meson->base + REG_MISC_AB;
> +		channel->div.shift = meson_pwm_per_channel_data[i].clk_div_shift;
> +		channel->div.width = MISC_CLK_DIV_WIDTH;
> +		channel->div.hw.init = &init;
> +		channel->div.flags = 0;
> +		channel->div.lock = &meson->lock;
> +
> +		err = devm_clk_hw_register(dev, &channel->div.hw);
> +		if (err) {
> +			dev_err(dev, "failed to register %s: %d\n", name, err);
> +			return err;
> +		}
> +
> +		snprintf(name, sizeof(name), "%s#gate%u", dev_name(dev), i);
> +
> +		init.name = name;
> +		init.ops = &clk_gate_ops;
> +		init.flags = CLK_SET_RATE_PARENT | CLK_IGNORE_UNUSED;
> +		gate_parent.index = -1;
> +		gate_parent.hw = &channel->div.hw;
> +		init.parent_data = &gate_parent;
> +		init.num_parents = 1;
> +
> +		channel->gate.reg = meson->base + REG_MISC_AB;
> +		channel->gate.bit_idx = meson_pwm_per_channel_data[i].clk_en_bit;
> +		channel->gate.hw.init = &init;
> +		channel->gate.flags = 0;
> +		channel->gate.lock = &meson->lock;
> +
> +		err = devm_clk_hw_register(dev, &channel->gate.hw);
> +		if (err) {
> +			dev_err(dev, "failed to register %s: %d\n", name, err);
> +			return err;
> +		}
>   
> -		channel->clk_parent = devm_clk_get_optional(dev, name);
> -		if (IS_ERR(channel->clk_parent))
> -			return PTR_ERR(channel->clk_parent);
> +		channel->clk = devm_clk_hw_get_clk(dev, &channel->gate.hw, NULL);
> +		if (IS_ERR(channel->clk)) {
> +			err = PTR_ERR(channel->clk);
> +			dev_err(dev, "failed to register %s: %d\n", name, err);
> +			return err;
> +		}
>   	}
>   
>   	return 0;
Thierry Reding April 17, 2023, 9:17 a.m. UTC | #6
On Mon, Apr 17, 2023 at 09:23:35AM +0200, Neil Armstrong wrote:
> On 13/04/2023 07:54, Heiner Kallweit wrote:
[...]
> > diff --git a/drivers/pwm/pwm-meson.c b/drivers/pwm/pwm-meson.c
[...]
> > @@ -271,16 +255,16 @@ static int meson_pwm_apply(struct pwm_chip *chip, struct pwm_device *pwm,
> >   			/*
> >   			 * This IP block revision doesn't have an "always high"
> >   			 * setting which we can use for "inverted disabled".
> > -			 * Instead we achieve this using the same settings
> > -			 * that we use a pre_div of 0 (to get the shortest
> > -			 * possible duration for one "count") and
> > +			 * Instead we achieve this by setting an arbitrary,
> > +			 * very high frequency, resulting in the shortest
> > +			 * possible duration for one "count" and
> >   			 * "period == duty_cycle". This results in a signal
> >   			 * which is LOW for one "count", while being HIGH for
> >   			 * the rest of the (so the signal is HIGH for slightly
> >   			 * less than 100% of the period, but this is the best
> >   			 * we can achieve).
> >   			 */
> > -			channel->pre_div = 0;
> > +			channel->rate = 1000000000;
> >   			channel->hi = ~0;
> >   			channel->lo = 0;
> 
> This looks like a really bad idea... please don't do that and instead introduce
> some pinctrl states where we set the PWM pin as GPIO mode high/low state like we
> did for SPI:
> https://lore.kernel.org/r/20221004-up-aml-fix-spi-v4-2-0342d8e10c49@baylibre.com

Yeah, absolutely. If the pin controller can do that, that's the right
way to implement the desired behavior.

Thierry
Heiner Kallweit April 17, 2023, 9:53 a.m. UTC | #7
On 17.04.2023 09:23, Neil Armstrong wrote:
> On 13/04/2023 07:54, Heiner Kallweit wrote:
>> Newer versions of the PWM block use a core clock with external mux,
>> divider, and gate. These components either don't exist any longer in
>> the PWM block, or they are bypassed.
>> To minimize needed changes for supporting the new version, the internal
>> divider and gate should be handled by CCF too.
>>
>> I didn't see a good way to split the patch, therefore it's somewhat
>> bigger. What it does:
>>
>> - The internal mux is handled by CCF already. Register also internal
>>    divider and gate with CCF, so that we have one representation of the
>>    input clock: [mux] parent of [divider] parent of [gate]
>>    - Now that CCF selects an appropriate mux parent, we don't need the
>>    DT-provided default parent any longer. Accordingly we can also omit
>>    setting the mux parent directly in the driver.
>>    - Instead of manually handling the pre-div divider value, let CCF
>>    set the input clock. Targeted input clock frequency is
>>    0xffff * 1/period for best precision.
>>    - For the "inverted pwm disabled" scenario target an input clock
>>    frequency of 1GHz. This ensures that the remaining low pulses
>>    have minimum length.
>>
>> I don't have hw with the old PWM block, therefore I couldn't test this
>> patch. With the not yet included extension for the new PWM block
>> (channel->clk coming directly from get_clk(external_clk)) I didn't
>> notice any problem. My system uses PWM for the CPU voltage regulator
>> and for the SDIO 32kHz clock.
>>
>> Note: The clock gate in the old PWM block is permanently disabled.
>> This seems to indicate that it's not used by the new PWM block.
>>
>> Tested-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
>> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
>> ---
>> Changes to RFT/RFC version:
>> - use parent_hws instead of parent_names for div/gate clock
>> - use devm_clk_hw_register where the struct clk * returned by
>>    devm_clk_register isn't needed
>>
>> v2:
>> - add patch 1
>> - add patch 3
>> - switch to using clk_parent_data in all relevant places
>> v3:
>> - add flag CLK_IGNORE_UNUSED
>> v4:
>> - remove variable tmp in meson_pwm_get_state
>> - don't use deprecated function devm_clk_register
>> ---
>>   drivers/pwm/pwm-meson.c | 142 +++++++++++++++++++++++-----------------
>>   1 file changed, 83 insertions(+), 59 deletions(-)
>>
>> diff --git a/drivers/pwm/pwm-meson.c b/drivers/pwm/pwm-meson.c
>> index 40a8709ff..80ac71cbc 100644
>> --- a/drivers/pwm/pwm-meson.c
>> +++ b/drivers/pwm/pwm-meson.c
>> @@ -51,7 +51,7 @@
>>   #define REG_MISC_AB        0x8
>>   #define MISC_B_CLK_EN        23
>>   #define MISC_A_CLK_EN        15
>> -#define MISC_CLK_DIV_MASK    0x7f
>> +#define MISC_CLK_DIV_WIDTH    7
>>   #define MISC_B_CLK_DIV_SHIFT    16
>>   #define MISC_A_CLK_DIV_SHIFT    8
>>   #define MISC_B_CLK_SEL_SHIFT    6
>> @@ -87,12 +87,13 @@ static struct meson_pwm_channel_data {
>>   };
>>     struct meson_pwm_channel {
>> +    unsigned long rate;
>>       unsigned int hi;
>>       unsigned int lo;
>> -    u8 pre_div;
>>   -    struct clk *clk_parent;
>>       struct clk_mux mux;
>> +    struct clk_divider div;
>> +    struct clk_gate gate;
>>       struct clk *clk;
>>   };
>>   @@ -125,16 +126,6 @@ static int meson_pwm_request(struct pwm_chip *chip, struct pwm_device *pwm)
>>       struct device *dev = chip->dev;
>>       int err;
>>   -    if (channel->clk_parent) {
>> -        err = clk_set_parent(channel->clk, channel->clk_parent);
>> -        if (err < 0) {
>> -            dev_err(dev, "failed to set parent %s for %s: %d\n",
>> -                __clk_get_name(channel->clk_parent),
>> -                __clk_get_name(channel->clk), err);
>> -            return err;
>> -        }
>> -    }
>> -
>>       err = clk_prepare_enable(channel->clk);
>>       if (err < 0) {
>>           dev_err(dev, "failed to enable clock %s: %d\n",
>> @@ -157,8 +148,9 @@ static int meson_pwm_calc(struct meson_pwm *meson, struct pwm_device *pwm,
>>                 const struct pwm_state *state)
>>   {
>>       struct meson_pwm_channel *channel = &meson->channels[pwm->hwpwm];
>> -    unsigned int duty, period, pre_div, cnt, duty_cnt;
>> +    unsigned int duty, period, cnt, duty_cnt;
>>       unsigned long fin_freq;
>> +    u64 freq;
>>         duty = state->duty_cycle;
>>       period = state->period;
>> @@ -166,7 +158,11 @@ static int meson_pwm_calc(struct meson_pwm *meson, struct pwm_device *pwm,
>>       if (state->polarity == PWM_POLARITY_INVERSED)
>>           duty = period - duty;
>>   -    fin_freq = clk_get_rate(channel->clk);
>> +    freq = div64_u64(NSEC_PER_SEC * (u64)0xffff, period);
>> +    if (freq > ULONG_MAX)
>> +        freq = ULONG_MAX;
>> +
>> +    fin_freq = clk_round_rate(channel->clk, freq);
>>       if (fin_freq == 0) {
>>           dev_err(meson->chip.dev, "invalid source clock frequency\n");
>>           return -EINVAL;
>> @@ -174,46 +170,35 @@ static int meson_pwm_calc(struct meson_pwm *meson, struct pwm_device *pwm,
>>         dev_dbg(meson->chip.dev, "fin_freq: %lu Hz\n", fin_freq);
>>   -    pre_div = div64_u64(fin_freq * (u64)period, NSEC_PER_SEC * 0xffffLL);
>> -    if (pre_div > MISC_CLK_DIV_MASK) {
>> -        dev_err(meson->chip.dev, "unable to get period pre_div\n");
>> -        return -EINVAL;
>> -    }
>> -
>> -    cnt = div64_u64(fin_freq * (u64)period, NSEC_PER_SEC * (pre_div + 1));
>> +    cnt = div64_u64(fin_freq * (u64)period, NSEC_PER_SEC);
>>       if (cnt > 0xffff) {
>>           dev_err(meson->chip.dev, "unable to get period cnt\n");
>>           return -EINVAL;
>>       }
>>   -    dev_dbg(meson->chip.dev, "period=%u pre_div=%u cnt=%u\n", period,
>> -        pre_div, cnt);
>> +    dev_dbg(meson->chip.dev, "period=%u cnt=%u\n", period, cnt);
>>         if (duty == period) {
>> -        channel->pre_div = pre_div;
>>           channel->hi = cnt;
>>           channel->lo = 0;
>>       } else if (duty == 0) {
>> -        channel->pre_div = pre_div;
>>           channel->hi = 0;
>>           channel->lo = cnt;
>>       } else {
>> -        /* Then check is we can have the duty with the same pre_div */
>> -        duty_cnt = div64_u64(fin_freq * (u64)duty,
>> -                     NSEC_PER_SEC * (pre_div + 1));
>> +        duty_cnt = div64_u64(fin_freq * (u64)duty, NSEC_PER_SEC);
>>           if (duty_cnt > 0xffff) {
>>               dev_err(meson->chip.dev, "unable to get duty cycle\n");
>>               return -EINVAL;
>>           }
>>   -        dev_dbg(meson->chip.dev, "duty=%u pre_div=%u duty_cnt=%u\n",
>> -            duty, pre_div, duty_cnt);
>> +        dev_dbg(meson->chip.dev, "duty=%u duty_cnt=%u\n", duty, duty_cnt);
>>   -        channel->pre_div = pre_div;
>>           channel->hi = duty_cnt;
>>           channel->lo = cnt - duty_cnt;
>>       }
>>   +    channel->rate = fin_freq;
>> +
>>       return 0;
>>   }
>>   @@ -223,16 +208,15 @@ static void meson_pwm_enable(struct meson_pwm *meson, struct pwm_device *pwm)
>>       struct meson_pwm_channel_data *channel_data;
>>       unsigned long flags;
>>       u32 value;
>> +    int err;
>>         channel_data = &meson_pwm_per_channel_data[pwm->hwpwm];
>>   -    spin_lock_irqsave(&meson->lock, flags);
>> +    err = clk_set_rate(channel->clk, channel->rate);
>> +    if (err)
>> +        dev_err(meson->chip.dev, "setting clock rate failed\n");
>>   -    value = readl(meson->base + REG_MISC_AB);
>> -    value &= ~(MISC_CLK_DIV_MASK << channel_data->clk_div_shift);
>> -    value |= channel->pre_div << channel_data->clk_div_shift;
>> -    value |= BIT(channel_data->clk_en_bit);
>> -    writel(value, meson->base + REG_MISC_AB);
>> +    spin_lock_irqsave(&meson->lock, flags);
>>         value = FIELD_PREP(PWM_HIGH_MASK, channel->hi) |
>>           FIELD_PREP(PWM_LOW_MASK, channel->lo);
>> @@ -271,16 +255,16 @@ static int meson_pwm_apply(struct pwm_chip *chip, struct pwm_device *pwm,
>>               /*
>>                * This IP block revision doesn't have an "always high"
>>                * setting which we can use for "inverted disabled".
>> -             * Instead we achieve this using the same settings
>> -             * that we use a pre_div of 0 (to get the shortest
>> -             * possible duration for one "count") and
>> +             * Instead we achieve this by setting an arbitrary,
>> +             * very high frequency, resulting in the shortest
>> +             * possible duration for one "count" and
>>                * "period == duty_cycle". This results in a signal
>>                * which is LOW for one "count", while being HIGH for
>>                * the rest of the (so the signal is HIGH for slightly
>>                * less than 100% of the period, but this is the best
>>                * we can achieve).
>>                */
>> -            channel->pre_div = 0;
>> +            channel->rate = 1000000000;
>>               channel->hi = ~0;
>>               channel->lo = 0;
> 
> This looks like a really bad idea... please don't do that and instead introduce
> some pinctrl states where we set the PWM pin as GPIO mode high/low state like we
> did for SPI:
> https://lore.kernel.org/r/20221004-up-aml-fix-spi-v4-2-0342d8e10c49@baylibre.com
> 

There's no behavior change in this patch set. So my understanding is that you'd
like to change the current behavior independent of the CCF-related changes.

For the updated PWM block (at least for S905X3, not sure with which family Amlogic
extended the PWM block) we have an integrated option to achieve constant low/high
output. It's just not implemented yet, it's something I may do as next step.
The extended PWM block added new bits pwm_A/B_constant_en that prevent the default
increment of the hi/lo period. By supporting these bits we can achieve value 0
for hi/lo.
IMO that's easier than adding pinctrl handling. The remaining question would be
whether it's worth it to add pinctrl handling just for the legacy version of the
PWM block.
Neil Armstrong April 17, 2023, 9:59 a.m. UTC | #8
On 17/04/2023 11:53, Heiner Kallweit wrote:
> On 17.04.2023 09:23, Neil Armstrong wrote:
>> On 13/04/2023 07:54, Heiner Kallweit wrote:
>>> Newer versions of the PWM block use a core clock with external mux,
>>> divider, and gate. These components either don't exist any longer in
>>> the PWM block, or they are bypassed.
>>> To minimize needed changes for supporting the new version, the internal
>>> divider and gate should be handled by CCF too.
>>>
>>> I didn't see a good way to split the patch, therefore it's somewhat
>>> bigger. What it does:
>>>
>>> - The internal mux is handled by CCF already. Register also internal
>>>     divider and gate with CCF, so that we have one representation of the
>>>     input clock: [mux] parent of [divider] parent of [gate]
>>>     - Now that CCF selects an appropriate mux parent, we don't need the
>>>     DT-provided default parent any longer. Accordingly we can also omit
>>>     setting the mux parent directly in the driver.
>>>     - Instead of manually handling the pre-div divider value, let CCF
>>>     set the input clock. Targeted input clock frequency is
>>>     0xffff * 1/period for best precision.
>>>     - For the "inverted pwm disabled" scenario target an input clock
>>>     frequency of 1GHz. This ensures that the remaining low pulses
>>>     have minimum length.
>>>
>>> I don't have hw with the old PWM block, therefore I couldn't test this
>>> patch. With the not yet included extension for the new PWM block
>>> (channel->clk coming directly from get_clk(external_clk)) I didn't
>>> notice any problem. My system uses PWM for the CPU voltage regulator
>>> and for the SDIO 32kHz clock.
>>>
>>> Note: The clock gate in the old PWM block is permanently disabled.
>>> This seems to indicate that it's not used by the new PWM block.
>>>
>>> Tested-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
>>> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
>>> ---
>>> Changes to RFT/RFC version:
>>> - use parent_hws instead of parent_names for div/gate clock
>>> - use devm_clk_hw_register where the struct clk * returned by
>>>     devm_clk_register isn't needed
>>>
>>> v2:
>>> - add patch 1
>>> - add patch 3
>>> - switch to using clk_parent_data in all relevant places
>>> v3:
>>> - add flag CLK_IGNORE_UNUSED
>>> v4:
>>> - remove variable tmp in meson_pwm_get_state
>>> - don't use deprecated function devm_clk_register
>>> ---
>>>    drivers/pwm/pwm-meson.c | 142 +++++++++++++++++++++++-----------------
>>>    1 file changed, 83 insertions(+), 59 deletions(-)
>>>
>>> diff --git a/drivers/pwm/pwm-meson.c b/drivers/pwm/pwm-meson.c
>>> index 40a8709ff..80ac71cbc 100644
>>> --- a/drivers/pwm/pwm-meson.c
>>> +++ b/drivers/pwm/pwm-meson.c
>>> @@ -51,7 +51,7 @@
>>>    #define REG_MISC_AB        0x8
>>>    #define MISC_B_CLK_EN        23
>>>    #define MISC_A_CLK_EN        15
>>> -#define MISC_CLK_DIV_MASK    0x7f
>>> +#define MISC_CLK_DIV_WIDTH    7
>>>    #define MISC_B_CLK_DIV_SHIFT    16
>>>    #define MISC_A_CLK_DIV_SHIFT    8
>>>    #define MISC_B_CLK_SEL_SHIFT    6
>>> @@ -87,12 +87,13 @@ static struct meson_pwm_channel_data {
>>>    };
>>>      struct meson_pwm_channel {
>>> +    unsigned long rate;
>>>        unsigned int hi;
>>>        unsigned int lo;
>>> -    u8 pre_div;
>>>    -    struct clk *clk_parent;
>>>        struct clk_mux mux;
>>> +    struct clk_divider div;
>>> +    struct clk_gate gate;
>>>        struct clk *clk;
>>>    };
>>>    @@ -125,16 +126,6 @@ static int meson_pwm_request(struct pwm_chip *chip, struct pwm_device *pwm)
>>>        struct device *dev = chip->dev;
>>>        int err;
>>>    -    if (channel->clk_parent) {
>>> -        err = clk_set_parent(channel->clk, channel->clk_parent);
>>> -        if (err < 0) {
>>> -            dev_err(dev, "failed to set parent %s for %s: %d\n",
>>> -                __clk_get_name(channel->clk_parent),
>>> -                __clk_get_name(channel->clk), err);
>>> -            return err;
>>> -        }
>>> -    }
>>> -
>>>        err = clk_prepare_enable(channel->clk);
>>>        if (err < 0) {
>>>            dev_err(dev, "failed to enable clock %s: %d\n",
>>> @@ -157,8 +148,9 @@ static int meson_pwm_calc(struct meson_pwm *meson, struct pwm_device *pwm,
>>>                  const struct pwm_state *state)
>>>    {
>>>        struct meson_pwm_channel *channel = &meson->channels[pwm->hwpwm];
>>> -    unsigned int duty, period, pre_div, cnt, duty_cnt;
>>> +    unsigned int duty, period, cnt, duty_cnt;
>>>        unsigned long fin_freq;
>>> +    u64 freq;
>>>          duty = state->duty_cycle;
>>>        period = state->period;
>>> @@ -166,7 +158,11 @@ static int meson_pwm_calc(struct meson_pwm *meson, struct pwm_device *pwm,
>>>        if (state->polarity == PWM_POLARITY_INVERSED)
>>>            duty = period - duty;
>>>    -    fin_freq = clk_get_rate(channel->clk);
>>> +    freq = div64_u64(NSEC_PER_SEC * (u64)0xffff, period);
>>> +    if (freq > ULONG_MAX)
>>> +        freq = ULONG_MAX;
>>> +
>>> +    fin_freq = clk_round_rate(channel->clk, freq);
>>>        if (fin_freq == 0) {
>>>            dev_err(meson->chip.dev, "invalid source clock frequency\n");
>>>            return -EINVAL;
>>> @@ -174,46 +170,35 @@ static int meson_pwm_calc(struct meson_pwm *meson, struct pwm_device *pwm,
>>>          dev_dbg(meson->chip.dev, "fin_freq: %lu Hz\n", fin_freq);
>>>    -    pre_div = div64_u64(fin_freq * (u64)period, NSEC_PER_SEC * 0xffffLL);
>>> -    if (pre_div > MISC_CLK_DIV_MASK) {
>>> -        dev_err(meson->chip.dev, "unable to get period pre_div\n");
>>> -        return -EINVAL;
>>> -    }
>>> -
>>> -    cnt = div64_u64(fin_freq * (u64)period, NSEC_PER_SEC * (pre_div + 1));
>>> +    cnt = div64_u64(fin_freq * (u64)period, NSEC_PER_SEC);
>>>        if (cnt > 0xffff) {
>>>            dev_err(meson->chip.dev, "unable to get period cnt\n");
>>>            return -EINVAL;
>>>        }
>>>    -    dev_dbg(meson->chip.dev, "period=%u pre_div=%u cnt=%u\n", period,
>>> -        pre_div, cnt);
>>> +    dev_dbg(meson->chip.dev, "period=%u cnt=%u\n", period, cnt);
>>>          if (duty == period) {
>>> -        channel->pre_div = pre_div;
>>>            channel->hi = cnt;
>>>            channel->lo = 0;
>>>        } else if (duty == 0) {
>>> -        channel->pre_div = pre_div;
>>>            channel->hi = 0;
>>>            channel->lo = cnt;
>>>        } else {
>>> -        /* Then check is we can have the duty with the same pre_div */
>>> -        duty_cnt = div64_u64(fin_freq * (u64)duty,
>>> -                     NSEC_PER_SEC * (pre_div + 1));
>>> +        duty_cnt = div64_u64(fin_freq * (u64)duty, NSEC_PER_SEC);
>>>            if (duty_cnt > 0xffff) {
>>>                dev_err(meson->chip.dev, "unable to get duty cycle\n");
>>>                return -EINVAL;
>>>            }
>>>    -        dev_dbg(meson->chip.dev, "duty=%u pre_div=%u duty_cnt=%u\n",
>>> -            duty, pre_div, duty_cnt);
>>> +        dev_dbg(meson->chip.dev, "duty=%u duty_cnt=%u\n", duty, duty_cnt);
>>>    -        channel->pre_div = pre_div;
>>>            channel->hi = duty_cnt;
>>>            channel->lo = cnt - duty_cnt;
>>>        }
>>>    +    channel->rate = fin_freq;
>>> +
>>>        return 0;
>>>    }
>>>    @@ -223,16 +208,15 @@ static void meson_pwm_enable(struct meson_pwm *meson, struct pwm_device *pwm)
>>>        struct meson_pwm_channel_data *channel_data;
>>>        unsigned long flags;
>>>        u32 value;
>>> +    int err;
>>>          channel_data = &meson_pwm_per_channel_data[pwm->hwpwm];
>>>    -    spin_lock_irqsave(&meson->lock, flags);
>>> +    err = clk_set_rate(channel->clk, channel->rate);
>>> +    if (err)
>>> +        dev_err(meson->chip.dev, "setting clock rate failed\n");
>>>    -    value = readl(meson->base + REG_MISC_AB);
>>> -    value &= ~(MISC_CLK_DIV_MASK << channel_data->clk_div_shift);
>>> -    value |= channel->pre_div << channel_data->clk_div_shift;
>>> -    value |= BIT(channel_data->clk_en_bit);
>>> -    writel(value, meson->base + REG_MISC_AB);
>>> +    spin_lock_irqsave(&meson->lock, flags);
>>>          value = FIELD_PREP(PWM_HIGH_MASK, channel->hi) |
>>>            FIELD_PREP(PWM_LOW_MASK, channel->lo);
>>> @@ -271,16 +255,16 @@ static int meson_pwm_apply(struct pwm_chip *chip, struct pwm_device *pwm,
>>>                /*
>>>                 * This IP block revision doesn't have an "always high"
>>>                 * setting which we can use for "inverted disabled".
>>> -             * Instead we achieve this using the same settings
>>> -             * that we use a pre_div of 0 (to get the shortest
>>> -             * possible duration for one "count") and
>>> +             * Instead we achieve this by setting an arbitrary,
>>> +             * very high frequency, resulting in the shortest
>>> +             * possible duration for one "count" and
>>>                 * "period == duty_cycle". This results in a signal
>>>                 * which is LOW for one "count", while being HIGH for
>>>                 * the rest of the (so the signal is HIGH for slightly
>>>                 * less than 100% of the period, but this is the best
>>>                 * we can achieve).
>>>                 */
>>> -            channel->pre_div = 0;
>>> +            channel->rate = 1000000000;
>>>                channel->hi = ~0;
>>>                channel->lo = 0;
>>
>> This looks like a really bad idea... please don't do that and instead introduce
>> some pinctrl states where we set the PWM pin as GPIO mode high/low state like we
>> did for SPI:
>> https://lore.kernel.org/r/20221004-up-aml-fix-spi-v4-2-0342d8e10c49@baylibre.com
>>
> 
> There's no behavior change in this patch set. So my understanding is that you'd
> like to change the current behavior independent of the CCF-related changes.

There's a behavior change since you change the calculation with a radically different
algorithm.

Neil

> 
> For the updated PWM block (at least for S905X3, not sure with which family Amlogic
> extended the PWM block) we have an integrated option to achieve constant low/high
> output. It's just not implemented yet, it's something I may do as next step.
> The extended PWM block added new bits pwm_A/B_constant_en that prevent the default
> increment of the hi/lo period. By supporting these bits we can achieve value 0
> for hi/lo.
> IMO that's easier than adding pinctrl handling. The remaining question would be
> whether it's worth it to add pinctrl handling just for the legacy version of the
> PWM block.
>
Heiner Kallweit April 17, 2023, 10:36 a.m. UTC | #9
On 17.04.2023 11:59, neil.armstrong@linaro.org wrote:
> On 17/04/2023 11:53, Heiner Kallweit wrote:
>> On 17.04.2023 09:23, Neil Armstrong wrote:
>>> On 13/04/2023 07:54, Heiner Kallweit wrote:
>>>> Newer versions of the PWM block use a core clock with external mux,
>>>> divider, and gate. These components either don't exist any longer in
>>>> the PWM block, or they are bypassed.
>>>> To minimize needed changes for supporting the new version, the internal
>>>> divider and gate should be handled by CCF too.
>>>>
>>>> I didn't see a good way to split the patch, therefore it's somewhat
>>>> bigger. What it does:
>>>>
>>>> - The internal mux is handled by CCF already. Register also internal
>>>>     divider and gate with CCF, so that we have one representation of the
>>>>     input clock: [mux] parent of [divider] parent of [gate]
>>>>     - Now that CCF selects an appropriate mux parent, we don't need the
>>>>     DT-provided default parent any longer. Accordingly we can also omit
>>>>     setting the mux parent directly in the driver.
>>>>     - Instead of manually handling the pre-div divider value, let CCF
>>>>     set the input clock. Targeted input clock frequency is
>>>>     0xffff * 1/period for best precision.
>>>>     - For the "inverted pwm disabled" scenario target an input clock
>>>>     frequency of 1GHz. This ensures that the remaining low pulses
>>>>     have minimum length.
>>>>
>>>> I don't have hw with the old PWM block, therefore I couldn't test this
>>>> patch. With the not yet included extension for the new PWM block
>>>> (channel->clk coming directly from get_clk(external_clk)) I didn't
>>>> notice any problem. My system uses PWM for the CPU voltage regulator
>>>> and for the SDIO 32kHz clock.
>>>>
>>>> Note: The clock gate in the old PWM block is permanently disabled.
>>>> This seems to indicate that it's not used by the new PWM block.
>>>>
>>>> Tested-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
>>>> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
>>>> ---
>>>> Changes to RFT/RFC version:
>>>> - use parent_hws instead of parent_names for div/gate clock
>>>> - use devm_clk_hw_register where the struct clk * returned by
>>>>     devm_clk_register isn't needed
>>>>
>>>> v2:
>>>> - add patch 1
>>>> - add patch 3
>>>> - switch to using clk_parent_data in all relevant places
>>>> v3:
>>>> - add flag CLK_IGNORE_UNUSED
>>>> v4:
>>>> - remove variable tmp in meson_pwm_get_state
>>>> - don't use deprecated function devm_clk_register
>>>> ---
>>>>    drivers/pwm/pwm-meson.c | 142 +++++++++++++++++++++++-----------------
>>>>    1 file changed, 83 insertions(+), 59 deletions(-)
>>>>
>>>> diff --git a/drivers/pwm/pwm-meson.c b/drivers/pwm/pwm-meson.c
>>>> index 40a8709ff..80ac71cbc 100644
>>>> --- a/drivers/pwm/pwm-meson.c
>>>> +++ b/drivers/pwm/pwm-meson.c
>>>> @@ -51,7 +51,7 @@
>>>>    #define REG_MISC_AB        0x8
>>>>    #define MISC_B_CLK_EN        23
>>>>    #define MISC_A_CLK_EN        15
>>>> -#define MISC_CLK_DIV_MASK    0x7f
>>>> +#define MISC_CLK_DIV_WIDTH    7
>>>>    #define MISC_B_CLK_DIV_SHIFT    16
>>>>    #define MISC_A_CLK_DIV_SHIFT    8
>>>>    #define MISC_B_CLK_SEL_SHIFT    6
>>>> @@ -87,12 +87,13 @@ static struct meson_pwm_channel_data {
>>>>    };
>>>>      struct meson_pwm_channel {
>>>> +    unsigned long rate;
>>>>        unsigned int hi;
>>>>        unsigned int lo;
>>>> -    u8 pre_div;
>>>>    -    struct clk *clk_parent;
>>>>        struct clk_mux mux;
>>>> +    struct clk_divider div;
>>>> +    struct clk_gate gate;
>>>>        struct clk *clk;
>>>>    };
>>>>    @@ -125,16 +126,6 @@ static int meson_pwm_request(struct pwm_chip *chip, struct pwm_device *pwm)
>>>>        struct device *dev = chip->dev;
>>>>        int err;
>>>>    -    if (channel->clk_parent) {
>>>> -        err = clk_set_parent(channel->clk, channel->clk_parent);
>>>> -        if (err < 0) {
>>>> -            dev_err(dev, "failed to set parent %s for %s: %d\n",
>>>> -                __clk_get_name(channel->clk_parent),
>>>> -                __clk_get_name(channel->clk), err);
>>>> -            return err;
>>>> -        }
>>>> -    }
>>>> -
>>>>        err = clk_prepare_enable(channel->clk);
>>>>        if (err < 0) {
>>>>            dev_err(dev, "failed to enable clock %s: %d\n",
>>>> @@ -157,8 +148,9 @@ static int meson_pwm_calc(struct meson_pwm *meson, struct pwm_device *pwm,
>>>>                  const struct pwm_state *state)
>>>>    {
>>>>        struct meson_pwm_channel *channel = &meson->channels[pwm->hwpwm];
>>>> -    unsigned int duty, period, pre_div, cnt, duty_cnt;
>>>> +    unsigned int duty, period, cnt, duty_cnt;
>>>>        unsigned long fin_freq;
>>>> +    u64 freq;
>>>>          duty = state->duty_cycle;
>>>>        period = state->period;
>>>> @@ -166,7 +158,11 @@ static int meson_pwm_calc(struct meson_pwm *meson, struct pwm_device *pwm,
>>>>        if (state->polarity == PWM_POLARITY_INVERSED)
>>>>            duty = period - duty;
>>>>    -    fin_freq = clk_get_rate(channel->clk);
>>>> +    freq = div64_u64(NSEC_PER_SEC * (u64)0xffff, period);
>>>> +    if (freq > ULONG_MAX)
>>>> +        freq = ULONG_MAX;
>>>> +
>>>> +    fin_freq = clk_round_rate(channel->clk, freq);
>>>>        if (fin_freq == 0) {
>>>>            dev_err(meson->chip.dev, "invalid source clock frequency\n");
>>>>            return -EINVAL;
>>>> @@ -174,46 +170,35 @@ static int meson_pwm_calc(struct meson_pwm *meson, struct pwm_device *pwm,
>>>>          dev_dbg(meson->chip.dev, "fin_freq: %lu Hz\n", fin_freq);
>>>>    -    pre_div = div64_u64(fin_freq * (u64)period, NSEC_PER_SEC * 0xffffLL);
>>>> -    if (pre_div > MISC_CLK_DIV_MASK) {
>>>> -        dev_err(meson->chip.dev, "unable to get period pre_div\n");
>>>> -        return -EINVAL;
>>>> -    }
>>>> -
>>>> -    cnt = div64_u64(fin_freq * (u64)period, NSEC_PER_SEC * (pre_div + 1));
>>>> +    cnt = div64_u64(fin_freq * (u64)period, NSEC_PER_SEC);
>>>>        if (cnt > 0xffff) {
>>>>            dev_err(meson->chip.dev, "unable to get period cnt\n");
>>>>            return -EINVAL;
>>>>        }
>>>>    -    dev_dbg(meson->chip.dev, "period=%u pre_div=%u cnt=%u\n", period,
>>>> -        pre_div, cnt);
>>>> +    dev_dbg(meson->chip.dev, "period=%u cnt=%u\n", period, cnt);
>>>>          if (duty == period) {
>>>> -        channel->pre_div = pre_div;
>>>>            channel->hi = cnt;
>>>>            channel->lo = 0;
>>>>        } else if (duty == 0) {
>>>> -        channel->pre_div = pre_div;
>>>>            channel->hi = 0;
>>>>            channel->lo = cnt;
>>>>        } else {
>>>> -        /* Then check is we can have the duty with the same pre_div */
>>>> -        duty_cnt = div64_u64(fin_freq * (u64)duty,
>>>> -                     NSEC_PER_SEC * (pre_div + 1));
>>>> +        duty_cnt = div64_u64(fin_freq * (u64)duty, NSEC_PER_SEC);
>>>>            if (duty_cnt > 0xffff) {
>>>>                dev_err(meson->chip.dev, "unable to get duty cycle\n");
>>>>                return -EINVAL;
>>>>            }
>>>>    -        dev_dbg(meson->chip.dev, "duty=%u pre_div=%u duty_cnt=%u\n",
>>>> -            duty, pre_div, duty_cnt);
>>>> +        dev_dbg(meson->chip.dev, "duty=%u duty_cnt=%u\n", duty, duty_cnt);
>>>>    -        channel->pre_div = pre_div;
>>>>            channel->hi = duty_cnt;
>>>>            channel->lo = cnt - duty_cnt;
>>>>        }
>>>>    +    channel->rate = fin_freq;
>>>> +
>>>>        return 0;
>>>>    }
>>>>    @@ -223,16 +208,15 @@ static void meson_pwm_enable(struct meson_pwm *meson, struct pwm_device *pwm)
>>>>        struct meson_pwm_channel_data *channel_data;
>>>>        unsigned long flags;
>>>>        u32 value;
>>>> +    int err;
>>>>          channel_data = &meson_pwm_per_channel_data[pwm->hwpwm];
>>>>    -    spin_lock_irqsave(&meson->lock, flags);
>>>> +    err = clk_set_rate(channel->clk, channel->rate);
>>>> +    if (err)
>>>> +        dev_err(meson->chip.dev, "setting clock rate failed\n");
>>>>    -    value = readl(meson->base + REG_MISC_AB);
>>>> -    value &= ~(MISC_CLK_DIV_MASK << channel_data->clk_div_shift);
>>>> -    value |= channel->pre_div << channel_data->clk_div_shift;
>>>> -    value |= BIT(channel_data->clk_en_bit);
>>>> -    writel(value, meson->base + REG_MISC_AB);
>>>> +    spin_lock_irqsave(&meson->lock, flags);
>>>>          value = FIELD_PREP(PWM_HIGH_MASK, channel->hi) |
>>>>            FIELD_PREP(PWM_LOW_MASK, channel->lo);
>>>> @@ -271,16 +255,16 @@ static int meson_pwm_apply(struct pwm_chip *chip, struct pwm_device *pwm,
>>>>                /*
>>>>                 * This IP block revision doesn't have an "always high"
>>>>                 * setting which we can use for "inverted disabled".
>>>> -             * Instead we achieve this using the same settings
>>>> -             * that we use a pre_div of 0 (to get the shortest
>>>> -             * possible duration for one "count") and
>>>> +             * Instead we achieve this by setting an arbitrary,
>>>> +             * very high frequency, resulting in the shortest
>>>> +             * possible duration for one "count" and
>>>>                 * "period == duty_cycle". This results in a signal
>>>>                 * which is LOW for one "count", while being HIGH for
>>>>                 * the rest of the (so the signal is HIGH for slightly
>>>>                 * less than 100% of the period, but this is the best
>>>>                 * we can achieve).
>>>>                 */
>>>> -            channel->pre_div = 0;
>>>> +            channel->rate = 1000000000;
>>>>                channel->hi = ~0;
>>>>                channel->lo = 0;
>>>
>>> This looks like a really bad idea... please don't do that and instead introduce
>>> some pinctrl states where we set the PWM pin as GPIO mode high/low state like we
>>> did for SPI:
>>> https://lore.kernel.org/r/20221004-up-aml-fix-spi-v4-2-0342d8e10c49@baylibre.com
>>>
>>
>> There's no behavior change in this patch set. So my understanding is that you'd
>> like to change the current behavior independent of the CCF-related changes.
> 
> There's a behavior change since you change the calculation with a radically different
> algorithm.
> 
Setting an input clock rate of 1GHz will result in pre_div = 0 with (where available)
mux parent fdiv3 (850MHz). So it's the same duty cycle as before, just with a different
period. The applied logic is the same as before and as described in the comment
"get the shortest possible duration for one count"

> Neil
> 
>>
>> For the updated PWM block (at least for S905X3, not sure with which family Amlogic
>> extended the PWM block) we have an integrated option to achieve constant low/high
>> output. It's just not implemented yet, it's something I may do as next step.
>> The extended PWM block added new bits pwm_A/B_constant_en that prevent the default
>> increment of the hi/lo period. By supporting these bits we can achieve value 0
>> for hi/lo.
>> IMO that's easier than adding pinctrl handling. The remaining question would be
>> whether it's worth it to add pinctrl handling just for the legacy version of the
>> PWM block.
>>
>
Neil Armstrong April 17, 2023, 12:21 p.m. UTC | #10
On 17/04/2023 12:36, Heiner Kallweit wrote:
> On 17.04.2023 11:59, neil.armstrong@linaro.org wrote:
>> On 17/04/2023 11:53, Heiner Kallweit wrote:
>>> On 17.04.2023 09:23, Neil Armstrong wrote:
>>>> On 13/04/2023 07:54, Heiner Kallweit wrote:
>>>>> Newer versions of the PWM block use a core clock with external mux,
>>>>> divider, and gate. These components either don't exist any longer in
>>>>> the PWM block, or they are bypassed.
>>>>> To minimize needed changes for supporting the new version, the internal
>>>>> divider and gate should be handled by CCF too.
>>>>>
>>>>> I didn't see a good way to split the patch, therefore it's somewhat
>>>>> bigger. What it does:
>>>>>
>>>>> - The internal mux is handled by CCF already. Register also internal
>>>>>      divider and gate with CCF, so that we have one representation of the
>>>>>      input clock: [mux] parent of [divider] parent of [gate]
>>>>>      - Now that CCF selects an appropriate mux parent, we don't need the
>>>>>      DT-provided default parent any longer. Accordingly we can also omit
>>>>>      setting the mux parent directly in the driver.
>>>>>      - Instead of manually handling the pre-div divider value, let CCF
>>>>>      set the input clock. Targeted input clock frequency is
>>>>>      0xffff * 1/period for best precision.
>>>>>      - For the "inverted pwm disabled" scenario target an input clock
>>>>>      frequency of 1GHz. This ensures that the remaining low pulses
>>>>>      have minimum length.
>>>>>
>>>>> I don't have hw with the old PWM block, therefore I couldn't test this
>>>>> patch. With the not yet included extension for the new PWM block
>>>>> (channel->clk coming directly from get_clk(external_clk)) I didn't
>>>>> notice any problem. My system uses PWM for the CPU voltage regulator
>>>>> and for the SDIO 32kHz clock.
>>>>>
>>>>> Note: The clock gate in the old PWM block is permanently disabled.
>>>>> This seems to indicate that it's not used by the new PWM block.
>>>>>
>>>>> Tested-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
>>>>> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
>>>>> ---
>>>>> Changes to RFT/RFC version:
>>>>> - use parent_hws instead of parent_names for div/gate clock
>>>>> - use devm_clk_hw_register where the struct clk * returned by
>>>>>      devm_clk_register isn't needed
>>>>>
>>>>> v2:
>>>>> - add patch 1
>>>>> - add patch 3
>>>>> - switch to using clk_parent_data in all relevant places
>>>>> v3:
>>>>> - add flag CLK_IGNORE_UNUSED
>>>>> v4:
>>>>> - remove variable tmp in meson_pwm_get_state
>>>>> - don't use deprecated function devm_clk_register
>>>>> ---
>>>>>     drivers/pwm/pwm-meson.c | 142 +++++++++++++++++++++++-----------------
>>>>>     1 file changed, 83 insertions(+), 59 deletions(-)
>>>>>
>>>>> diff --git a/drivers/pwm/pwm-meson.c b/drivers/pwm/pwm-meson.c
>>>>> index 40a8709ff..80ac71cbc 100644
>>>>> --- a/drivers/pwm/pwm-meson.c
>>>>> +++ b/drivers/pwm/pwm-meson.c
>>>>> @@ -51,7 +51,7 @@
>>>>>     #define REG_MISC_AB        0x8
>>>>>     #define MISC_B_CLK_EN        23
>>>>>     #define MISC_A_CLK_EN        15
>>>>> -#define MISC_CLK_DIV_MASK    0x7f
>>>>> +#define MISC_CLK_DIV_WIDTH    7
>>>>>     #define MISC_B_CLK_DIV_SHIFT    16
>>>>>     #define MISC_A_CLK_DIV_SHIFT    8
>>>>>     #define MISC_B_CLK_SEL_SHIFT    6
>>>>> @@ -87,12 +87,13 @@ static struct meson_pwm_channel_data {
>>>>>     };
>>>>>       struct meson_pwm_channel {
>>>>> +    unsigned long rate;
>>>>>         unsigned int hi;
>>>>>         unsigned int lo;
>>>>> -    u8 pre_div;
>>>>>     -    struct clk *clk_parent;
>>>>>         struct clk_mux mux;
>>>>> +    struct clk_divider div;
>>>>> +    struct clk_gate gate;
>>>>>         struct clk *clk;
>>>>>     };
>>>>>     @@ -125,16 +126,6 @@ static int meson_pwm_request(struct pwm_chip *chip, struct pwm_device *pwm)
>>>>>         struct device *dev = chip->dev;
>>>>>         int err;
>>>>>     -    if (channel->clk_parent) {
>>>>> -        err = clk_set_parent(channel->clk, channel->clk_parent);
>>>>> -        if (err < 0) {
>>>>> -            dev_err(dev, "failed to set parent %s for %s: %d\n",
>>>>> -                __clk_get_name(channel->clk_parent),
>>>>> -                __clk_get_name(channel->clk), err);
>>>>> -            return err;
>>>>> -        }
>>>>> -    }
>>>>> -
>>>>>         err = clk_prepare_enable(channel->clk);
>>>>>         if (err < 0) {
>>>>>             dev_err(dev, "failed to enable clock %s: %d\n",
>>>>> @@ -157,8 +148,9 @@ static int meson_pwm_calc(struct meson_pwm *meson, struct pwm_device *pwm,
>>>>>                   const struct pwm_state *state)
>>>>>     {
>>>>>         struct meson_pwm_channel *channel = &meson->channels[pwm->hwpwm];
>>>>> -    unsigned int duty, period, pre_div, cnt, duty_cnt;
>>>>> +    unsigned int duty, period, cnt, duty_cnt;
>>>>>         unsigned long fin_freq;
>>>>> +    u64 freq;
>>>>>           duty = state->duty_cycle;
>>>>>         period = state->period;
>>>>> @@ -166,7 +158,11 @@ static int meson_pwm_calc(struct meson_pwm *meson, struct pwm_device *pwm,
>>>>>         if (state->polarity == PWM_POLARITY_INVERSED)
>>>>>             duty = period - duty;
>>>>>     -    fin_freq = clk_get_rate(channel->clk);
>>>>> +    freq = div64_u64(NSEC_PER_SEC * (u64)0xffff, period);
>>>>> +    if (freq > ULONG_MAX)
>>>>> +        freq = ULONG_MAX;
>>>>> +
>>>>> +    fin_freq = clk_round_rate(channel->clk, freq);
>>>>>         if (fin_freq == 0) {
>>>>>             dev_err(meson->chip.dev, "invalid source clock frequency\n");
>>>>>             return -EINVAL;
>>>>> @@ -174,46 +170,35 @@ static int meson_pwm_calc(struct meson_pwm *meson, struct pwm_device *pwm,
>>>>>           dev_dbg(meson->chip.dev, "fin_freq: %lu Hz\n", fin_freq);
>>>>>     -    pre_div = div64_u64(fin_freq * (u64)period, NSEC_PER_SEC * 0xffffLL);
>>>>> -    if (pre_div > MISC_CLK_DIV_MASK) {
>>>>> -        dev_err(meson->chip.dev, "unable to get period pre_div\n");
>>>>> -        return -EINVAL;
>>>>> -    }
>>>>> -
>>>>> -    cnt = div64_u64(fin_freq * (u64)period, NSEC_PER_SEC * (pre_div + 1));
>>>>> +    cnt = div64_u64(fin_freq * (u64)period, NSEC_PER_SEC);
>>>>>         if (cnt > 0xffff) {
>>>>>             dev_err(meson->chip.dev, "unable to get period cnt\n");
>>>>>             return -EINVAL;
>>>>>         }
>>>>>     -    dev_dbg(meson->chip.dev, "period=%u pre_div=%u cnt=%u\n", period,
>>>>> -        pre_div, cnt);
>>>>> +    dev_dbg(meson->chip.dev, "period=%u cnt=%u\n", period, cnt);
>>>>>           if (duty == period) {
>>>>> -        channel->pre_div = pre_div;
>>>>>             channel->hi = cnt;
>>>>>             channel->lo = 0;
>>>>>         } else if (duty == 0) {
>>>>> -        channel->pre_div = pre_div;
>>>>>             channel->hi = 0;
>>>>>             channel->lo = cnt;
>>>>>         } else {
>>>>> -        /* Then check is we can have the duty with the same pre_div */
>>>>> -        duty_cnt = div64_u64(fin_freq * (u64)duty,
>>>>> -                     NSEC_PER_SEC * (pre_div + 1));
>>>>> +        duty_cnt = div64_u64(fin_freq * (u64)duty, NSEC_PER_SEC);
>>>>>             if (duty_cnt > 0xffff) {
>>>>>                 dev_err(meson->chip.dev, "unable to get duty cycle\n");
>>>>>                 return -EINVAL;
>>>>>             }
>>>>>     -        dev_dbg(meson->chip.dev, "duty=%u pre_div=%u duty_cnt=%u\n",
>>>>> -            duty, pre_div, duty_cnt);
>>>>> +        dev_dbg(meson->chip.dev, "duty=%u duty_cnt=%u\n", duty, duty_cnt);
>>>>>     -        channel->pre_div = pre_div;
>>>>>             channel->hi = duty_cnt;
>>>>>             channel->lo = cnt - duty_cnt;
>>>>>         }
>>>>>     +    channel->rate = fin_freq;
>>>>> +
>>>>>         return 0;
>>>>>     }
>>>>>     @@ -223,16 +208,15 @@ static void meson_pwm_enable(struct meson_pwm *meson, struct pwm_device *pwm)
>>>>>         struct meson_pwm_channel_data *channel_data;
>>>>>         unsigned long flags;
>>>>>         u32 value;
>>>>> +    int err;
>>>>>           channel_data = &meson_pwm_per_channel_data[pwm->hwpwm];
>>>>>     -    spin_lock_irqsave(&meson->lock, flags);
>>>>> +    err = clk_set_rate(channel->clk, channel->rate);
>>>>> +    if (err)
>>>>> +        dev_err(meson->chip.dev, "setting clock rate failed\n");
>>>>>     -    value = readl(meson->base + REG_MISC_AB);
>>>>> -    value &= ~(MISC_CLK_DIV_MASK << channel_data->clk_div_shift);
>>>>> -    value |= channel->pre_div << channel_data->clk_div_shift;
>>>>> -    value |= BIT(channel_data->clk_en_bit);
>>>>> -    writel(value, meson->base + REG_MISC_AB);
>>>>> +    spin_lock_irqsave(&meson->lock, flags);
>>>>>           value = FIELD_PREP(PWM_HIGH_MASK, channel->hi) |
>>>>>             FIELD_PREP(PWM_LOW_MASK, channel->lo);
>>>>> @@ -271,16 +255,16 @@ static int meson_pwm_apply(struct pwm_chip *chip, struct pwm_device *pwm,
>>>>>                 /*
>>>>>                  * This IP block revision doesn't have an "always high"
>>>>>                  * setting which we can use for "inverted disabled".
>>>>> -             * Instead we achieve this using the same settings
>>>>> -             * that we use a pre_div of 0 (to get the shortest
>>>>> -             * possible duration for one "count") and
>>>>> +             * Instead we achieve this by setting an arbitrary,
>>>>> +             * very high frequency, resulting in the shortest
>>>>> +             * possible duration for one "count" and
>>>>>                  * "period == duty_cycle". This results in a signal
>>>>>                  * which is LOW for one "count", while being HIGH for
>>>>>                  * the rest of the (so the signal is HIGH for slightly
>>>>>                  * less than 100% of the period, but this is the best
>>>>>                  * we can achieve).
>>>>>                  */
>>>>> -            channel->pre_div = 0;
>>>>> +            channel->rate = 1000000000;
>>>>>                 channel->hi = ~0;
>>>>>                 channel->lo = 0;
>>>>
>>>> This looks like a really bad idea... please don't do that and instead introduce
>>>> some pinctrl states where we set the PWM pin as GPIO mode high/low state like we
>>>> did for SPI:
>>>> https://lore.kernel.org/r/20221004-up-aml-fix-spi-v4-2-0342d8e10c49@baylibre.com
>>>>
>>>
>>> There's no behavior change in this patch set. So my understanding is that you'd
>>> like to change the current behavior independent of the CCF-related changes.
>>
>> There's a behavior change since you change the calculation with a radically different
>> algorithm.
>>
> Setting an input clock rate of 1GHz will result in pre_div = 0 with (where available)
> mux parent fdiv3 (850MHz). So it's the same duty cycle as before, just with a different
> period. The applied logic is the same as before and as described in the comment
> "get the shortest possible duration for one count"

This is a hack based on current clock values, either explicitly support a code path
where pre_div = 0 or if you can't do that with CCF implement the pinctrl way to handle this,
which is the cleanest.

Neil

> 
>> Neil
>>
>>>
>>> For the updated PWM block (at least for S905X3, not sure with which family Amlogic
>>> extended the PWM block) we have an integrated option to achieve constant low/high
>>> output. It's just not implemented yet, it's something I may do as next step.
>>> The extended PWM block added new bits pwm_A/B_constant_en that prevent the default
>>> increment of the hi/lo period. By supporting these bits we can achieve value 0
>>> for hi/lo.
>>> IMO that's easier than adding pinctrl handling. The remaining question would be
>>> whether it's worth it to add pinctrl handling just for the legacy version of the
>>> PWM block.
>>>
>>
>
Heiner Kallweit April 19, 2023, 7:58 p.m. UTC | #11
On 17.04.2023 14:21, neil.armstrong@linaro.org wrote:
> On 17/04/2023 12:36, Heiner Kallweit wrote:
>> On 17.04.2023 11:59, neil.armstrong@linaro.org wrote:
>>> On 17/04/2023 11:53, Heiner Kallweit wrote:
>>>> On 17.04.2023 09:23, Neil Armstrong wrote:
>>>>> On 13/04/2023 07:54, Heiner Kallweit wrote:
>>>>>> Newer versions of the PWM block use a core clock with external mux,
>>>>>> divider, and gate. These components either don't exist any longer in
>>>>>> the PWM block, or they are bypassed.
>>>>>> To minimize needed changes for supporting the new version, the internal
>>>>>> divider and gate should be handled by CCF too.
>>>>>>
>>>>>> I didn't see a good way to split the patch, therefore it's somewhat
>>>>>> bigger. What it does:
>>>>>>
>>>>>> - The internal mux is handled by CCF already. Register also internal
>>>>>>      divider and gate with CCF, so that we have one representation of the
>>>>>>      input clock: [mux] parent of [divider] parent of [gate]
>>>>>>      - Now that CCF selects an appropriate mux parent, we don't need the
>>>>>>      DT-provided default parent any longer. Accordingly we can also omit
>>>>>>      setting the mux parent directly in the driver.
>>>>>>      - Instead of manually handling the pre-div divider value, let CCF
>>>>>>      set the input clock. Targeted input clock frequency is
>>>>>>      0xffff * 1/period for best precision.
>>>>>>      - For the "inverted pwm disabled" scenario target an input clock
>>>>>>      frequency of 1GHz. This ensures that the remaining low pulses
>>>>>>      have minimum length.
>>>>>>
>>>>>> I don't have hw with the old PWM block, therefore I couldn't test this
>>>>>> patch. With the not yet included extension for the new PWM block
>>>>>> (channel->clk coming directly from get_clk(external_clk)) I didn't
>>>>>> notice any problem. My system uses PWM for the CPU voltage regulator
>>>>>> and for the SDIO 32kHz clock.
>>>>>>
>>>>>> Note: The clock gate in the old PWM block is permanently disabled.
>>>>>> This seems to indicate that it's not used by the new PWM block.
>>>>>>
>>>>>> Tested-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
>>>>>> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
>>>>>> ---
>>>>>> Changes to RFT/RFC version:
>>>>>> - use parent_hws instead of parent_names for div/gate clock
>>>>>> - use devm_clk_hw_register where the struct clk * returned by
>>>>>>      devm_clk_register isn't needed
>>>>>>
>>>>>> v2:
>>>>>> - add patch 1
>>>>>> - add patch 3
>>>>>> - switch to using clk_parent_data in all relevant places
>>>>>> v3:
>>>>>> - add flag CLK_IGNORE_UNUSED
>>>>>> v4:
>>>>>> - remove variable tmp in meson_pwm_get_state
>>>>>> - don't use deprecated function devm_clk_register
>>>>>> ---
>>>>>>     drivers/pwm/pwm-meson.c | 142 +++++++++++++++++++++++-----------------
>>>>>>     1 file changed, 83 insertions(+), 59 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/pwm/pwm-meson.c b/drivers/pwm/pwm-meson.c
>>>>>> index 40a8709ff..80ac71cbc 100644
>>>>>> --- a/drivers/pwm/pwm-meson.c
>>>>>> +++ b/drivers/pwm/pwm-meson.c
>>>>>> @@ -51,7 +51,7 @@
>>>>>>     #define REG_MISC_AB        0x8
>>>>>>     #define MISC_B_CLK_EN        23
>>>>>>     #define MISC_A_CLK_EN        15
>>>>>> -#define MISC_CLK_DIV_MASK    0x7f
>>>>>> +#define MISC_CLK_DIV_WIDTH    7
>>>>>>     #define MISC_B_CLK_DIV_SHIFT    16
>>>>>>     #define MISC_A_CLK_DIV_SHIFT    8
>>>>>>     #define MISC_B_CLK_SEL_SHIFT    6
>>>>>> @@ -87,12 +87,13 @@ static struct meson_pwm_channel_data {
>>>>>>     };
>>>>>>       struct meson_pwm_channel {
>>>>>> +    unsigned long rate;
>>>>>>         unsigned int hi;
>>>>>>         unsigned int lo;
>>>>>> -    u8 pre_div;
>>>>>>     -    struct clk *clk_parent;
>>>>>>         struct clk_mux mux;
>>>>>> +    struct clk_divider div;
>>>>>> +    struct clk_gate gate;
>>>>>>         struct clk *clk;
>>>>>>     };
>>>>>>     @@ -125,16 +126,6 @@ static int meson_pwm_request(struct pwm_chip *chip, struct pwm_device *pwm)
>>>>>>         struct device *dev = chip->dev;
>>>>>>         int err;
>>>>>>     -    if (channel->clk_parent) {
>>>>>> -        err = clk_set_parent(channel->clk, channel->clk_parent);
>>>>>> -        if (err < 0) {
>>>>>> -            dev_err(dev, "failed to set parent %s for %s: %d\n",
>>>>>> -                __clk_get_name(channel->clk_parent),
>>>>>> -                __clk_get_name(channel->clk), err);
>>>>>> -            return err;
>>>>>> -        }
>>>>>> -    }
>>>>>> -
>>>>>>         err = clk_prepare_enable(channel->clk);
>>>>>>         if (err < 0) {
>>>>>>             dev_err(dev, "failed to enable clock %s: %d\n",
>>>>>> @@ -157,8 +148,9 @@ static int meson_pwm_calc(struct meson_pwm *meson, struct pwm_device *pwm,
>>>>>>                   const struct pwm_state *state)
>>>>>>     {
>>>>>>         struct meson_pwm_channel *channel = &meson->channels[pwm->hwpwm];
>>>>>> -    unsigned int duty, period, pre_div, cnt, duty_cnt;
>>>>>> +    unsigned int duty, period, cnt, duty_cnt;
>>>>>>         unsigned long fin_freq;
>>>>>> +    u64 freq;
>>>>>>           duty = state->duty_cycle;
>>>>>>         period = state->period;
>>>>>> @@ -166,7 +158,11 @@ static int meson_pwm_calc(struct meson_pwm *meson, struct pwm_device *pwm,
>>>>>>         if (state->polarity == PWM_POLARITY_INVERSED)
>>>>>>             duty = period - duty;
>>>>>>     -    fin_freq = clk_get_rate(channel->clk);
>>>>>> +    freq = div64_u64(NSEC_PER_SEC * (u64)0xffff, period);
>>>>>> +    if (freq > ULONG_MAX)
>>>>>> +        freq = ULONG_MAX;
>>>>>> +
>>>>>> +    fin_freq = clk_round_rate(channel->clk, freq);
>>>>>>         if (fin_freq == 0) {
>>>>>>             dev_err(meson->chip.dev, "invalid source clock frequency\n");
>>>>>>             return -EINVAL;
>>>>>> @@ -174,46 +170,35 @@ static int meson_pwm_calc(struct meson_pwm *meson, struct pwm_device *pwm,
>>>>>>           dev_dbg(meson->chip.dev, "fin_freq: %lu Hz\n", fin_freq);
>>>>>>     -    pre_div = div64_u64(fin_freq * (u64)period, NSEC_PER_SEC * 0xffffLL);
>>>>>> -    if (pre_div > MISC_CLK_DIV_MASK) {
>>>>>> -        dev_err(meson->chip.dev, "unable to get period pre_div\n");
>>>>>> -        return -EINVAL;
>>>>>> -    }
>>>>>> -
>>>>>> -    cnt = div64_u64(fin_freq * (u64)period, NSEC_PER_SEC * (pre_div + 1));
>>>>>> +    cnt = div64_u64(fin_freq * (u64)period, NSEC_PER_SEC);
>>>>>>         if (cnt > 0xffff) {
>>>>>>             dev_err(meson->chip.dev, "unable to get period cnt\n");
>>>>>>             return -EINVAL;
>>>>>>         }
>>>>>>     -    dev_dbg(meson->chip.dev, "period=%u pre_div=%u cnt=%u\n", period,
>>>>>> -        pre_div, cnt);
>>>>>> +    dev_dbg(meson->chip.dev, "period=%u cnt=%u\n", period, cnt);
>>>>>>           if (duty == period) {
>>>>>> -        channel->pre_div = pre_div;
>>>>>>             channel->hi = cnt;
>>>>>>             channel->lo = 0;
>>>>>>         } else if (duty == 0) {
>>>>>> -        channel->pre_div = pre_div;
>>>>>>             channel->hi = 0;
>>>>>>             channel->lo = cnt;
>>>>>>         } else {
>>>>>> -        /* Then check is we can have the duty with the same pre_div */
>>>>>> -        duty_cnt = div64_u64(fin_freq * (u64)duty,
>>>>>> -                     NSEC_PER_SEC * (pre_div + 1));
>>>>>> +        duty_cnt = div64_u64(fin_freq * (u64)duty, NSEC_PER_SEC);
>>>>>>             if (duty_cnt > 0xffff) {
>>>>>>                 dev_err(meson->chip.dev, "unable to get duty cycle\n");
>>>>>>                 return -EINVAL;
>>>>>>             }
>>>>>>     -        dev_dbg(meson->chip.dev, "duty=%u pre_div=%u duty_cnt=%u\n",
>>>>>> -            duty, pre_div, duty_cnt);
>>>>>> +        dev_dbg(meson->chip.dev, "duty=%u duty_cnt=%u\n", duty, duty_cnt);
>>>>>>     -        channel->pre_div = pre_div;
>>>>>>             channel->hi = duty_cnt;
>>>>>>             channel->lo = cnt - duty_cnt;
>>>>>>         }
>>>>>>     +    channel->rate = fin_freq;
>>>>>> +
>>>>>>         return 0;
>>>>>>     }
>>>>>>     @@ -223,16 +208,15 @@ static void meson_pwm_enable(struct meson_pwm *meson, struct pwm_device *pwm)
>>>>>>         struct meson_pwm_channel_data *channel_data;
>>>>>>         unsigned long flags;
>>>>>>         u32 value;
>>>>>> +    int err;
>>>>>>           channel_data = &meson_pwm_per_channel_data[pwm->hwpwm];
>>>>>>     -    spin_lock_irqsave(&meson->lock, flags);
>>>>>> +    err = clk_set_rate(channel->clk, channel->rate);
>>>>>> +    if (err)
>>>>>> +        dev_err(meson->chip.dev, "setting clock rate failed\n");
>>>>>>     -    value = readl(meson->base + REG_MISC_AB);
>>>>>> -    value &= ~(MISC_CLK_DIV_MASK << channel_data->clk_div_shift);
>>>>>> -    value |= channel->pre_div << channel_data->clk_div_shift;
>>>>>> -    value |= BIT(channel_data->clk_en_bit);
>>>>>> -    writel(value, meson->base + REG_MISC_AB);
>>>>>> +    spin_lock_irqsave(&meson->lock, flags);
>>>>>>           value = FIELD_PREP(PWM_HIGH_MASK, channel->hi) |
>>>>>>             FIELD_PREP(PWM_LOW_MASK, channel->lo);
>>>>>> @@ -271,16 +255,16 @@ static int meson_pwm_apply(struct pwm_chip *chip, struct pwm_device *pwm,
>>>>>>                 /*
>>>>>>                  * This IP block revision doesn't have an "always high"
>>>>>>                  * setting which we can use for "inverted disabled".
>>>>>> -             * Instead we achieve this using the same settings
>>>>>> -             * that we use a pre_div of 0 (to get the shortest
>>>>>> -             * possible duration for one "count") and
>>>>>> +             * Instead we achieve this by setting an arbitrary,
>>>>>> +             * very high frequency, resulting in the shortest
>>>>>> +             * possible duration for one "count" and
>>>>>>                  * "period == duty_cycle". This results in a signal
>>>>>>                  * which is LOW for one "count", while being HIGH for
>>>>>>                  * the rest of the (so the signal is HIGH for slightly
>>>>>>                  * less than 100% of the period, but this is the best
>>>>>>                  * we can achieve).
>>>>>>                  */
>>>>>> -            channel->pre_div = 0;
>>>>>> +            channel->rate = 1000000000;
>>>>>>                 channel->hi = ~0;
>>>>>>                 channel->lo = 0;
>>>>>
>>>>> This looks like a really bad idea... please don't do that and instead introduce
>>>>> some pinctrl states where we set the PWM pin as GPIO mode high/low state like we
>>>>> did for SPI:
>>>>> https://lore.kernel.org/r/20221004-up-aml-fix-spi-v4-2-0342d8e10c49@baylibre.com
>>>>>
>>>>
>>>> There's no behavior change in this patch set. So my understanding is that you'd
>>>> like to change the current behavior independent of the CCF-related changes.
>>>
>>> There's a behavior change since you change the calculation with a radically different
>>> algorithm.
>>>
>> Setting an input clock rate of 1GHz will result in pre_div = 0 with (where available)
>> mux parent fdiv3 (850MHz). So it's the same duty cycle as before, just with a different
>> period. The applied logic is the same as before and as described in the comment
>> "get the shortest possible duration for one count"
> 
> This is a hack based on current clock values, either explicitly support a code path
> where pre_div = 0 or if you can't do that with CCF implement the pinctrl way to handle this,
> which is the cleanest.
> 
To make it explicit we could request ULONG_MAX as rate instead of 1GHz, this would imply
choosing mux parent with highest rate and pre_div = 0. Up to you whether this would be
acceptable.
AFAICS pinctrl would need quite some DTS changes, and it's not my area of expertise.
So it would be open who can implement this.

> Neil
> 
>>
>>> Neil
>>>
>>>>
>>>> For the updated PWM block (at least for S905X3, not sure with which family Amlogic
>>>> extended the PWM block) we have an integrated option to achieve constant low/high
>>>> output. It's just not implemented yet, it's something I may do as next step.
>>>> The extended PWM block added new bits pwm_A/B_constant_en that prevent the default
>>>> increment of the hi/lo period. By supporting these bits we can achieve value 0
>>>> for hi/lo.
>>>> IMO that's easier than adding pinctrl handling. The remaining question would be
>>>> whether it's worth it to add pinctrl handling just for the legacy version of the
>>>> PWM block.
>>>>
>>>
>>
>
Neil Armstrong April 21, 2023, 7:39 a.m. UTC | #12
On 19/04/2023 21:58, Heiner Kallweit wrote:
> On 17.04.2023 14:21, neil.armstrong@linaro.org wrote:
>> On 17/04/2023 12:36, Heiner Kallweit wrote:
>>> On 17.04.2023 11:59, neil.armstrong@linaro.org wrote:
>>>> On 17/04/2023 11:53, Heiner Kallweit wrote:
>>>>> On 17.04.2023 09:23, Neil Armstrong wrote:
>>>>>> On 13/04/2023 07:54, Heiner Kallweit wrote:
>>>>>>> Newer versions of the PWM block use a core clock with external mux,
>>>>>>> divider, and gate. These components either don't exist any longer in
>>>>>>> the PWM block, or they are bypassed.
>>>>>>> To minimize needed changes for supporting the new version, the internal
>>>>>>> divider and gate should be handled by CCF too.
>>>>>>>
>>>>>>> I didn't see a good way to split the patch, therefore it's somewhat
>>>>>>> bigger. What it does:
>>>>>>>
>>>>>>> - The internal mux is handled by CCF already. Register also internal
>>>>>>>       divider and gate with CCF, so that we have one representation of the
>>>>>>>       input clock: [mux] parent of [divider] parent of [gate]
>>>>>>>       - Now that CCF selects an appropriate mux parent, we don't need the
>>>>>>>       DT-provided default parent any longer. Accordingly we can also omit
>>>>>>>       setting the mux parent directly in the driver.
>>>>>>>       - Instead of manually handling the pre-div divider value, let CCF
>>>>>>>       set the input clock. Targeted input clock frequency is
>>>>>>>       0xffff * 1/period for best precision.
>>>>>>>       - For the "inverted pwm disabled" scenario target an input clock
>>>>>>>       frequency of 1GHz. This ensures that the remaining low pulses
>>>>>>>       have minimum length.
>>>>>>>
>>>>>>> I don't have hw with the old PWM block, therefore I couldn't test this
>>>>>>> patch. With the not yet included extension for the new PWM block
>>>>>>> (channel->clk coming directly from get_clk(external_clk)) I didn't
>>>>>>> notice any problem. My system uses PWM for the CPU voltage regulator
>>>>>>> and for the SDIO 32kHz clock.
>>>>>>>
>>>>>>> Note: The clock gate in the old PWM block is permanently disabled.
>>>>>>> This seems to indicate that it's not used by the new PWM block.
>>>>>>>
>>>>>>> Tested-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
>>>>>>> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
>>>>>>> ---
>>>>>>> Changes to RFT/RFC version:
>>>>>>> - use parent_hws instead of parent_names for div/gate clock
>>>>>>> - use devm_clk_hw_register where the struct clk * returned by
>>>>>>>       devm_clk_register isn't needed
>>>>>>>
>>>>>>> v2:
>>>>>>> - add patch 1
>>>>>>> - add patch 3
>>>>>>> - switch to using clk_parent_data in all relevant places
>>>>>>> v3:
>>>>>>> - add flag CLK_IGNORE_UNUSED
>>>>>>> v4:
>>>>>>> - remove variable tmp in meson_pwm_get_state
>>>>>>> - don't use deprecated function devm_clk_register
>>>>>>> ---
>>>>>>>      drivers/pwm/pwm-meson.c | 142 +++++++++++++++++++++++-----------------
>>>>>>>      1 file changed, 83 insertions(+), 59 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/pwm/pwm-meson.c b/drivers/pwm/pwm-meson.c
>>>>>>> index 40a8709ff..80ac71cbc 100644
>>>>>>> --- a/drivers/pwm/pwm-meson.c
>>>>>>> +++ b/drivers/pwm/pwm-meson.c
>>>>>>> @@ -51,7 +51,7 @@
>>>>>>>      #define REG_MISC_AB        0x8
>>>>>>>      #define MISC_B_CLK_EN        23
>>>>>>>      #define MISC_A_CLK_EN        15
>>>>>>> -#define MISC_CLK_DIV_MASK    0x7f
>>>>>>> +#define MISC_CLK_DIV_WIDTH    7
>>>>>>>      #define MISC_B_CLK_DIV_SHIFT    16
>>>>>>>      #define MISC_A_CLK_DIV_SHIFT    8
>>>>>>>      #define MISC_B_CLK_SEL_SHIFT    6
>>>>>>> @@ -87,12 +87,13 @@ static struct meson_pwm_channel_data {
>>>>>>>      };
>>>>>>>        struct meson_pwm_channel {
>>>>>>> +    unsigned long rate;
>>>>>>>          unsigned int hi;
>>>>>>>          unsigned int lo;
>>>>>>> -    u8 pre_div;
>>>>>>>      -    struct clk *clk_parent;
>>>>>>>          struct clk_mux mux;
>>>>>>> +    struct clk_divider div;
>>>>>>> +    struct clk_gate gate;
>>>>>>>          struct clk *clk;
>>>>>>>      };
>>>>>>>      @@ -125,16 +126,6 @@ static int meson_pwm_request(struct pwm_chip *chip, struct pwm_device *pwm)
>>>>>>>          struct device *dev = chip->dev;
>>>>>>>          int err;
>>>>>>>      -    if (channel->clk_parent) {
>>>>>>> -        err = clk_set_parent(channel->clk, channel->clk_parent);
>>>>>>> -        if (err < 0) {
>>>>>>> -            dev_err(dev, "failed to set parent %s for %s: %d\n",
>>>>>>> -                __clk_get_name(channel->clk_parent),
>>>>>>> -                __clk_get_name(channel->clk), err);
>>>>>>> -            return err;
>>>>>>> -        }
>>>>>>> -    }
>>>>>>> -
>>>>>>>          err = clk_prepare_enable(channel->clk);
>>>>>>>          if (err < 0) {
>>>>>>>              dev_err(dev, "failed to enable clock %s: %d\n",
>>>>>>> @@ -157,8 +148,9 @@ static int meson_pwm_calc(struct meson_pwm *meson, struct pwm_device *pwm,
>>>>>>>                    const struct pwm_state *state)
>>>>>>>      {
>>>>>>>          struct meson_pwm_channel *channel = &meson->channels[pwm->hwpwm];
>>>>>>> -    unsigned int duty, period, pre_div, cnt, duty_cnt;
>>>>>>> +    unsigned int duty, period, cnt, duty_cnt;
>>>>>>>          unsigned long fin_freq;
>>>>>>> +    u64 freq;
>>>>>>>            duty = state->duty_cycle;
>>>>>>>          period = state->period;
>>>>>>> @@ -166,7 +158,11 @@ static int meson_pwm_calc(struct meson_pwm *meson, struct pwm_device *pwm,
>>>>>>>          if (state->polarity == PWM_POLARITY_INVERSED)
>>>>>>>              duty = period - duty;
>>>>>>>      -    fin_freq = clk_get_rate(channel->clk);
>>>>>>> +    freq = div64_u64(NSEC_PER_SEC * (u64)0xffff, period);
>>>>>>> +    if (freq > ULONG_MAX)
>>>>>>> +        freq = ULONG_MAX;
>>>>>>> +
>>>>>>> +    fin_freq = clk_round_rate(channel->clk, freq);
>>>>>>>          if (fin_freq == 0) {
>>>>>>>              dev_err(meson->chip.dev, "invalid source clock frequency\n");
>>>>>>>              return -EINVAL;
>>>>>>> @@ -174,46 +170,35 @@ static int meson_pwm_calc(struct meson_pwm *meson, struct pwm_device *pwm,
>>>>>>>            dev_dbg(meson->chip.dev, "fin_freq: %lu Hz\n", fin_freq);
>>>>>>>      -    pre_div = div64_u64(fin_freq * (u64)period, NSEC_PER_SEC * 0xffffLL);
>>>>>>> -    if (pre_div > MISC_CLK_DIV_MASK) {
>>>>>>> -        dev_err(meson->chip.dev, "unable to get period pre_div\n");
>>>>>>> -        return -EINVAL;
>>>>>>> -    }
>>>>>>> -
>>>>>>> -    cnt = div64_u64(fin_freq * (u64)period, NSEC_PER_SEC * (pre_div + 1));
>>>>>>> +    cnt = div64_u64(fin_freq * (u64)period, NSEC_PER_SEC);
>>>>>>>          if (cnt > 0xffff) {
>>>>>>>              dev_err(meson->chip.dev, "unable to get period cnt\n");
>>>>>>>              return -EINVAL;
>>>>>>>          }
>>>>>>>      -    dev_dbg(meson->chip.dev, "period=%u pre_div=%u cnt=%u\n", period,
>>>>>>> -        pre_div, cnt);
>>>>>>> +    dev_dbg(meson->chip.dev, "period=%u cnt=%u\n", period, cnt);
>>>>>>>            if (duty == period) {
>>>>>>> -        channel->pre_div = pre_div;
>>>>>>>              channel->hi = cnt;
>>>>>>>              channel->lo = 0;
>>>>>>>          } else if (duty == 0) {
>>>>>>> -        channel->pre_div = pre_div;
>>>>>>>              channel->hi = 0;
>>>>>>>              channel->lo = cnt;
>>>>>>>          } else {
>>>>>>> -        /* Then check is we can have the duty with the same pre_div */
>>>>>>> -        duty_cnt = div64_u64(fin_freq * (u64)duty,
>>>>>>> -                     NSEC_PER_SEC * (pre_div + 1));
>>>>>>> +        duty_cnt = div64_u64(fin_freq * (u64)duty, NSEC_PER_SEC);
>>>>>>>              if (duty_cnt > 0xffff) {
>>>>>>>                  dev_err(meson->chip.dev, "unable to get duty cycle\n");
>>>>>>>                  return -EINVAL;
>>>>>>>              }
>>>>>>>      -        dev_dbg(meson->chip.dev, "duty=%u pre_div=%u duty_cnt=%u\n",
>>>>>>> -            duty, pre_div, duty_cnt);
>>>>>>> +        dev_dbg(meson->chip.dev, "duty=%u duty_cnt=%u\n", duty, duty_cnt);
>>>>>>>      -        channel->pre_div = pre_div;
>>>>>>>              channel->hi = duty_cnt;
>>>>>>>              channel->lo = cnt - duty_cnt;
>>>>>>>          }
>>>>>>>      +    channel->rate = fin_freq;
>>>>>>> +
>>>>>>>          return 0;
>>>>>>>      }
>>>>>>>      @@ -223,16 +208,15 @@ static void meson_pwm_enable(struct meson_pwm *meson, struct pwm_device *pwm)
>>>>>>>          struct meson_pwm_channel_data *channel_data;
>>>>>>>          unsigned long flags;
>>>>>>>          u32 value;
>>>>>>> +    int err;
>>>>>>>            channel_data = &meson_pwm_per_channel_data[pwm->hwpwm];
>>>>>>>      -    spin_lock_irqsave(&meson->lock, flags);
>>>>>>> +    err = clk_set_rate(channel->clk, channel->rate);
>>>>>>> +    if (err)
>>>>>>> +        dev_err(meson->chip.dev, "setting clock rate failed\n");
>>>>>>>      -    value = readl(meson->base + REG_MISC_AB);
>>>>>>> -    value &= ~(MISC_CLK_DIV_MASK << channel_data->clk_div_shift);
>>>>>>> -    value |= channel->pre_div << channel_data->clk_div_shift;
>>>>>>> -    value |= BIT(channel_data->clk_en_bit);
>>>>>>> -    writel(value, meson->base + REG_MISC_AB);
>>>>>>> +    spin_lock_irqsave(&meson->lock, flags);
>>>>>>>            value = FIELD_PREP(PWM_HIGH_MASK, channel->hi) |
>>>>>>>              FIELD_PREP(PWM_LOW_MASK, channel->lo);
>>>>>>> @@ -271,16 +255,16 @@ static int meson_pwm_apply(struct pwm_chip *chip, struct pwm_device *pwm,
>>>>>>>                  /*
>>>>>>>                   * This IP block revision doesn't have an "always high"
>>>>>>>                   * setting which we can use for "inverted disabled".
>>>>>>> -             * Instead we achieve this using the same settings
>>>>>>> -             * that we use a pre_div of 0 (to get the shortest
>>>>>>> -             * possible duration for one "count") and
>>>>>>> +             * Instead we achieve this by setting an arbitrary,
>>>>>>> +             * very high frequency, resulting in the shortest
>>>>>>> +             * possible duration for one "count" and
>>>>>>>                   * "period == duty_cycle". This results in a signal
>>>>>>>                   * which is LOW for one "count", while being HIGH for
>>>>>>>                   * the rest of the (so the signal is HIGH for slightly
>>>>>>>                   * less than 100% of the period, but this is the best
>>>>>>>                   * we can achieve).
>>>>>>>                   */
>>>>>>> -            channel->pre_div = 0;
>>>>>>> +            channel->rate = 1000000000;
>>>>>>>                  channel->hi = ~0;
>>>>>>>                  channel->lo = 0;
>>>>>>
>>>>>> This looks like a really bad idea... please don't do that and instead introduce
>>>>>> some pinctrl states where we set the PWM pin as GPIO mode high/low state like we
>>>>>> did for SPI:
>>>>>> https://lore.kernel.org/r/20221004-up-aml-fix-spi-v4-2-0342d8e10c49@baylibre.com
>>>>>>
>>>>>
>>>>> There's no behavior change in this patch set. So my understanding is that you'd
>>>>> like to change the current behavior independent of the CCF-related changes.
>>>>
>>>> There's a behavior change since you change the calculation with a radically different
>>>> algorithm.
>>>>
>>> Setting an input clock rate of 1GHz will result in pre_div = 0 with (where available)
>>> mux parent fdiv3 (850MHz). So it's the same duty cycle as before, just with a different
>>> period. The applied logic is the same as before and as described in the comment
>>> "get the shortest possible duration for one count"
>>
>> This is a hack based on current clock values, either explicitly support a code path
>> where pre_div = 0 or if you can't do that with CCF implement the pinctrl way to handle this,
>> which is the cleanest.
>>
> To make it explicit we could request ULONG_MAX as rate instead of 1GHz, this would imply
> choosing mux parent with highest rate and pre_div = 0. Up to you whether this would be
> acceptable.

It would be better

> AFAICS pinctrl would need quite some DTS changes, and it's not my area of expertise.
> So it would be open who can implement this.

Not necessarily, it's optional for SPICC, it should be optional here aswell
and only used by code if present, otherwise the driver will do it's best to
satisfy. A warn_once could be added to signify pinctrl should be used to
have a real always high/low signal.

Neil


> 
>> Neil
>>
>>>
>>>> Neil
>>>>
>>>>>
>>>>> For the updated PWM block (at least for S905X3, not sure with which family Amlogic
>>>>> extended the PWM block) we have an integrated option to achieve constant low/high
>>>>> output. It's just not implemented yet, it's something I may do as next step.
>>>>> The extended PWM block added new bits pwm_A/B_constant_en that prevent the default
>>>>> increment of the hi/lo period. By supporting these bits we can achieve value 0
>>>>> for hi/lo.
>>>>> IMO that's easier than adding pinctrl handling. The remaining question would be
>>>>> whether it's worth it to add pinctrl handling just for the legacy version of the
>>>>> PWM block.
>>>>>
>>>>
>>>
>>
>
Martin Blumenstingl April 23, 2023, 8:55 p.m. UTC | #13
Hello Heiner,

apologies for the late reply - I've been busy with offline things.

On Sun, Apr 16, 2023 at 11:34 PM Heiner Kallweit <hkallweit1@gmail.com> wrote:
[...]
> >> With a 850MHz input clock we should see a 0.01% duty cycle with 1.2ns
> >> clock pulses. Can we rule out an issue with the measuring equipment?
> >> Is your logic analyzer able to display such short clock pulses?
> > Oh, you're right: my logic analyzer maxes out at 24MHz (~42ns).
> > So we can ignore this case.
> >
> >>> Finally I tried period = 12218, duty cycle = 12218 (typically used for
> >>> the lowest CPU voltage):
> >>> before your patches / after applying your patches:
> >>> PWM: duty cycle: 99.661017% / n/a (constant low output)
> > I have to correct myself: for this case my logic analyzer sees a:
> > constant high signal
> >
> So conclusion is that the PWM output is as expected? If yes, then the
> memory corruption you saw supposedly had another root cause?
You are right:
- For this case my logic analyzer is also too slow
- In the meantime I've been able to reproduce the memory corruption
issue without your patch

> Eventually your Tested-by could be re-instantiated?
Indeed, and in addition to SM1 I have also tested it on Khadas VIM2 (GXM SoC):
[    4.135944] brcmfmac: brcmf_c_preinit_dcmds: Firmware: BCM4356/2
wl0: Apr  9 2021 00:40:07 version 7.35.349.104 (775a9ab CY) FWID
01-64b609e0

Tested-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>


Best regards,
Martin
Martin Blumenstingl April 23, 2023, 8:58 p.m. UTC | #14
On Wed, Apr 19, 2023 at 9:58 PM Heiner Kallweit <hkallweit1@gmail.com> wrote:
[...]
> > This is a hack based on current clock values, either explicitly support a code path
> > where pre_div = 0 or if you can't do that with CCF implement the pinctrl way to handle this,
> > which is the cleanest.
> >
> To make it explicit we could request ULONG_MAX as rate instead of 1GHz, this would imply
> choosing mux parent with highest rate and pre_div = 0. Up to you whether this would be
> acceptable.
I like the idea of using ULONG_MAX as I first had to think about why
you chose 1GHz in the driver.

> AFAICS pinctrl would need quite some DTS changes, and it's not my area of expertise.
> So it would be open who can implement this.
My opinion is that this can be done in a separate patch. We need to
work on this whole thing anyways as you mentioned that newer SoCs
(from what I understand: G12A onwards) have a dedicated "constant
output" bit which will make the pinctrl solution unnecessary (at least
based on how I understand it).


Best regards,
Martin
Heiner Kallweit May 1, 2023, 1:39 p.m. UTC | #15
On 23.04.2023 22:58, Martin Blumenstingl wrote:
> On Wed, Apr 19, 2023 at 9:58 PM Heiner Kallweit <hkallweit1@gmail.com> wrote:
> [...]
>>> This is a hack based on current clock values, either explicitly support a code path
>>> where pre_div = 0 or if you can't do that with CCF implement the pinctrl way to handle this,
>>> which is the cleanest.
>>>
>> To make it explicit we could request ULONG_MAX as rate instead of 1GHz, this would imply
>> choosing mux parent with highest rate and pre_div = 0. Up to you whether this would be
>> acceptable.
> I like the idea of using ULONG_MAX as I first had to think about why
> you chose 1GHz in the driver.
> 
>> AFAICS pinctrl would need quite some DTS changes, and it's not my area of expertise.
>> So it would be open who can implement this.
> My opinion is that this can be done in a separate patch. We need to
> work on this whole thing anyways as you mentioned that newer SoCs
> (from what I understand: G12A onwards) have a dedicated "constant
> output" bit which will make the pinctrl solution unnecessary (at least
> based on how I understand it).
> 
Agree.
My understanding of the "constant output" bit is, based on the vendor driver:
W/o this bit the chip internally increments the lo and hi value. Not sure by the way
how the chip handles value 0xffff, whether it omits the increment in this case.
W/ this bit set the chip doesn't increment the values, therefore lo / hi can be
effectively zero.

> 
> Best regards,
> Martin

Heiner
Dmitry Rokosov May 19, 2023, 3:30 p.m. UTC | #16
Hello Heiner,

Thank you for the patch series!

I am currently working on the Amlogic A1 clock driver and other
peripheral devices, including PWM. During a discussion about the clock
driver with Martin Blumenstingl, we found an intersection between the
clock driver and your PWM CCF support patch series. Please see my
comments below.

On Thu, Apr 13, 2023 at 07:54:46AM +0200, Heiner Kallweit wrote:
> Newer versions of the PWM block use a core clock with external mux,
> divider, and gate. These components either don't exist any longer in
> the PWM block, or they are bypassed.
> To minimize needed changes for supporting the new version, the internal
> divider and gate should be handled by CCF too.
> 
> I didn't see a good way to split the patch, therefore it's somewhat
> bigger. What it does:
> 
> - The internal mux is handled by CCF already. Register also internal
>   divider and gate with CCF, so that we have one representation of the
>   input clock: [mux] parent of [divider] parent of [gate]
>   
> - Now that CCF selects an appropriate mux parent, we don't need the
>   DT-provided default parent any longer. Accordingly we can also omit
>   setting the mux parent directly in the driver.
>   
> - Instead of manually handling the pre-div divider value, let CCF
>   set the input clock. Targeted input clock frequency is
>   0xffff * 1/period for best precision.
>   
> - For the "inverted pwm disabled" scenario target an input clock
>   frequency of 1GHz. This ensures that the remaining low pulses
>   have minimum length.
> 
> I don't have hw with the old PWM block, therefore I couldn't test this
> patch. With the not yet included extension for the new PWM block
> (channel->clk coming directly from get_clk(external_clk)) I didn't
> notice any problem. My system uses PWM for the CPU voltage regulator
> and for the SDIO 32kHz clock.
> 
> Note: The clock gate in the old PWM block is permanently disabled.
> This seems to indicate that it's not used by the new PWM block.
> 
> Tested-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
> ---
> Changes to RFT/RFC version:
> - use parent_hws instead of parent_names for div/gate clock
> - use devm_clk_hw_register where the struct clk * returned by
>   devm_clk_register isn't needed
> 
> v2:
> - add patch 1
> - add patch 3
> - switch to using clk_parent_data in all relevant places
> v3:
> - add flag CLK_IGNORE_UNUSED
> v4:
> - remove variable tmp in meson_pwm_get_state
> - don't use deprecated function devm_clk_register

[...]

> @@ -166,7 +158,11 @@ static int meson_pwm_calc(struct meson_pwm *meson, struct pwm_device *pwm,
>  	if (state->polarity == PWM_POLARITY_INVERSED)
>  		duty = period - duty;
>  
> -	fin_freq = clk_get_rate(channel->clk);
> +	freq = div64_u64(NSEC_PER_SEC * (u64)0xffff, period);
> +	if (freq > ULONG_MAX)
> +		freq = ULONG_MAX;
> +
> +	fin_freq = clk_round_rate(channel->clk, freq);
>  	if (fin_freq == 0) {
>  		dev_err(meson->chip.dev, "invalid source clock frequency\n");
>  		return -EINVAL;

As mentioned previously, we have discussed one optimization for PWM
parent clock calculation. Many modern Amlogic SoCs include an RTC clock
within the clock tree. This clock provides a stable and efficient 32kHz
input for several clock objects that can be inherited through the muxes
from the RTC clock.

In short, we aim to use the RTC clock parent directly for PWM to
generate a 32kHz clock on the PWM lines. Martin has suggested one way to
do so, which is described in [0]. You can also refer to our IRC
discussion in [1].

I would appreciate your thoughts on this. Please let me know what you
think.

[...]

Links:
    [0] https://lore.kernel.org/all/CAFBinCCPf+asVakAxeBqV-jhsZp=i2zbShByTCXfYYAQ6cCnHg@mail.gmail.com/
    [1] https://libera.irclog.whitequark.org/linux-amlogic/2023-05-18
Heiner Kallweit May 19, 2023, 4:53 p.m. UTC | #17
On 19.05.2023 17:30, Dmitry Rokosov wrote:
> Hello Heiner,
> 
> Thank you for the patch series!
> 
> I am currently working on the Amlogic A1 clock driver and other
> peripheral devices, including PWM. During a discussion about the clock
> driver with Martin Blumenstingl, we found an intersection between the
> clock driver and your PWM CCF support patch series. Please see my
> comments below.
> 
> On Thu, Apr 13, 2023 at 07:54:46AM +0200, Heiner Kallweit wrote:
>> Newer versions of the PWM block use a core clock with external mux,
>> divider, and gate. These components either don't exist any longer in
>> the PWM block, or they are bypassed.
>> To minimize needed changes for supporting the new version, the internal
>> divider and gate should be handled by CCF too.
>>
>> I didn't see a good way to split the patch, therefore it's somewhat
>> bigger. What it does:
>>
>> - The internal mux is handled by CCF already. Register also internal
>>   divider and gate with CCF, so that we have one representation of the
>>   input clock: [mux] parent of [divider] parent of [gate]
>>   
>> - Now that CCF selects an appropriate mux parent, we don't need the
>>   DT-provided default parent any longer. Accordingly we can also omit
>>   setting the mux parent directly in the driver.
>>   
>> - Instead of manually handling the pre-div divider value, let CCF
>>   set the input clock. Targeted input clock frequency is
>>   0xffff * 1/period for best precision.
>>   
>> - For the "inverted pwm disabled" scenario target an input clock
>>   frequency of 1GHz. This ensures that the remaining low pulses
>>   have minimum length.
>>
>> I don't have hw with the old PWM block, therefore I couldn't test this
>> patch. With the not yet included extension for the new PWM block
>> (channel->clk coming directly from get_clk(external_clk)) I didn't
>> notice any problem. My system uses PWM for the CPU voltage regulator
>> and for the SDIO 32kHz clock.
>>
>> Note: The clock gate in the old PWM block is permanently disabled.
>> This seems to indicate that it's not used by the new PWM block.
>>
>> Tested-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
>> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
>> ---
>> Changes to RFT/RFC version:
>> - use parent_hws instead of parent_names for div/gate clock
>> - use devm_clk_hw_register where the struct clk * returned by
>>   devm_clk_register isn't needed
>>
>> v2:
>> - add patch 1
>> - add patch 3
>> - switch to using clk_parent_data in all relevant places
>> v3:
>> - add flag CLK_IGNORE_UNUSED
>> v4:
>> - remove variable tmp in meson_pwm_get_state
>> - don't use deprecated function devm_clk_register
> 
> [...]
> 
>> @@ -166,7 +158,11 @@ static int meson_pwm_calc(struct meson_pwm *meson, struct pwm_device *pwm,
>>  	if (state->polarity == PWM_POLARITY_INVERSED)
>>  		duty = period - duty;
>>  
>> -	fin_freq = clk_get_rate(channel->clk);
>> +	freq = div64_u64(NSEC_PER_SEC * (u64)0xffff, period);
>> +	if (freq > ULONG_MAX)
>> +		freq = ULONG_MAX;
>> +
>> +	fin_freq = clk_round_rate(channel->clk, freq);
>>  	if (fin_freq == 0) {
>>  		dev_err(meson->chip.dev, "invalid source clock frequency\n");
>>  		return -EINVAL;
> 
> As mentioned previously, we have discussed one optimization for PWM
> parent clock calculation. Many modern Amlogic SoCs include an RTC clock
> within the clock tree. This clock provides a stable and efficient 32kHz
> input for several clock objects that can be inherited through the muxes
> from the RTC clock.
> 
> In short, we aim to use the RTC clock parent directly for PWM to
> generate a 32kHz clock on the PWM lines. Martin has suggested one way to
> do so, which is described in [0]. You can also refer to our IRC
> discussion in [1].
> 
> I would appreciate your thoughts on this. Please let me know what you
> think.
> 

Requesting a frequency of (NSEC_PER_SEC * 0xffffULL / period) is based
on the assumption that the highest possible input frequency always
allows to generate a period that is close enough to the requested period.

To find the best parent you'd need a somewhat more complex logic outside CCF.
What you want is the parent where (f_parent * period / NSEC_PER_SEC) is
closest to an integer in the range 1 .. 0xffff.
IOW: max(abs((f_parent * period) % 10^9 - 5 * 10^8))

This can be done, question is whether it's needed and worth the effort.

This would be the generic solution. If you just want to handle the case
that period 1/32.768Hz is requested, an adjusted version of Martins's
pseudo-code formula should do.
Best I think as follow-up to my series.

> [...]
> 
> Links:
>     [0] https://lore.kernel.org/all/CAFBinCCPf+asVakAxeBqV-jhsZp=i2zbShByTCXfYYAQ6cCnHg@mail.gmail.com/
>     [1] https://libera.irclog.whitequark.org/linux-amlogic/2023-05-18
>
Dmitry Rokosov May 22, 2023, 1:37 p.m. UTC | #18
Heiner,

On Fri, May 19, 2023 at 06:53:30PM +0200, Heiner Kallweit wrote:
> On 19.05.2023 17:30, Dmitry Rokosov wrote:
> > Hello Heiner,
> > 
> > Thank you for the patch series!
> > 
> > I am currently working on the Amlogic A1 clock driver and other
> > peripheral devices, including PWM. During a discussion about the clock
> > driver with Martin Blumenstingl, we found an intersection between the
> > clock driver and your PWM CCF support patch series. Please see my
> > comments below.
> > 
> > On Thu, Apr 13, 2023 at 07:54:46AM +0200, Heiner Kallweit wrote:
> >> Newer versions of the PWM block use a core clock with external mux,
> >> divider, and gate. These components either don't exist any longer in
> >> the PWM block, or they are bypassed.
> >> To minimize needed changes for supporting the new version, the internal
> >> divider and gate should be handled by CCF too.
> >>
> >> I didn't see a good way to split the patch, therefore it's somewhat
> >> bigger. What it does:
> >>
> >> - The internal mux is handled by CCF already. Register also internal
> >>   divider and gate with CCF, so that we have one representation of the
> >>   input clock: [mux] parent of [divider] parent of [gate]
> >>   
> >> - Now that CCF selects an appropriate mux parent, we don't need the
> >>   DT-provided default parent any longer. Accordingly we can also omit
> >>   setting the mux parent directly in the driver.
> >>   
> >> - Instead of manually handling the pre-div divider value, let CCF
> >>   set the input clock. Targeted input clock frequency is
> >>   0xffff * 1/period for best precision.
> >>   
> >> - For the "inverted pwm disabled" scenario target an input clock
> >>   frequency of 1GHz. This ensures that the remaining low pulses
> >>   have minimum length.
> >>
> >> I don't have hw with the old PWM block, therefore I couldn't test this
> >> patch. With the not yet included extension for the new PWM block
> >> (channel->clk coming directly from get_clk(external_clk)) I didn't
> >> notice any problem. My system uses PWM for the CPU voltage regulator
> >> and for the SDIO 32kHz clock.
> >>
> >> Note: The clock gate in the old PWM block is permanently disabled.
> >> This seems to indicate that it's not used by the new PWM block.
> >>
> >> Tested-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
> >> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
> >> ---
> >> Changes to RFT/RFC version:
> >> - use parent_hws instead of parent_names for div/gate clock
> >> - use devm_clk_hw_register where the struct clk * returned by
> >>   devm_clk_register isn't needed
> >>
> >> v2:
> >> - add patch 1
> >> - add patch 3
> >> - switch to using clk_parent_data in all relevant places
> >> v3:
> >> - add flag CLK_IGNORE_UNUSED
> >> v4:
> >> - remove variable tmp in meson_pwm_get_state
> >> - don't use deprecated function devm_clk_register
> > 
> > [...]
> > 
> >> @@ -166,7 +158,11 @@ static int meson_pwm_calc(struct meson_pwm *meson, struct pwm_device *pwm,
> >>  	if (state->polarity == PWM_POLARITY_INVERSED)
> >>  		duty = period - duty;
> >>  
> >> -	fin_freq = clk_get_rate(channel->clk);
> >> +	freq = div64_u64(NSEC_PER_SEC * (u64)0xffff, period);
> >> +	if (freq > ULONG_MAX)
> >> +		freq = ULONG_MAX;
> >> +
> >> +	fin_freq = clk_round_rate(channel->clk, freq);
> >>  	if (fin_freq == 0) {
> >>  		dev_err(meson->chip.dev, "invalid source clock frequency\n");
> >>  		return -EINVAL;
> > 
> > As mentioned previously, we have discussed one optimization for PWM
> > parent clock calculation. Many modern Amlogic SoCs include an RTC clock
> > within the clock tree. This clock provides a stable and efficient 32kHz
> > input for several clock objects that can be inherited through the muxes
> > from the RTC clock.
> > 
> > In short, we aim to use the RTC clock parent directly for PWM to
> > generate a 32kHz clock on the PWM lines. Martin has suggested one way to
> > do so, which is described in [0]. You can also refer to our IRC
> > discussion in [1].
> > 
> > I would appreciate your thoughts on this. Please let me know what you
> > think.
> > 
> 
> Requesting a frequency of (NSEC_PER_SEC * 0xffffULL / period) is based
> on the assumption that the highest possible input frequency always
> allows to generate a period that is close enough to the requested period.
> 
> To find the best parent you'd need a somewhat more complex logic outside CCF.
> What you want is the parent where (f_parent * period / NSEC_PER_SEC) is
> closest to an integer in the range 1 .. 0xffff.
> IOW: max(abs((f_parent * period) % 10^9 - 5 * 10^8))
> 
> This can be done, question is whether it's needed and worth the effort.
> 
> This would be the generic solution. If you just want to handle the case
> that period 1/32.768Hz is requested, an adjusted version of Martins's
> pseudo-code formula should do.
> Best I think as follow-up to my series.
> 

Certainly, if possible, please include this special case in the next
version of your series. Appreciate it!

[...]
Heiner Kallweit May 22, 2023, 8:10 p.m. UTC | #19
On 22.05.2023 15:37, Dmitry Rokosov wrote:
> Heiner,
> 
> On Fri, May 19, 2023 at 06:53:30PM +0200, Heiner Kallweit wrote:
>> On 19.05.2023 17:30, Dmitry Rokosov wrote:
>>> Hello Heiner,
>>>
>>> Thank you for the patch series!
>>>
>>> I am currently working on the Amlogic A1 clock driver and other
>>> peripheral devices, including PWM. During a discussion about the clock
>>> driver with Martin Blumenstingl, we found an intersection between the
>>> clock driver and your PWM CCF support patch series. Please see my
>>> comments below.
>>>
>>> On Thu, Apr 13, 2023 at 07:54:46AM +0200, Heiner Kallweit wrote:
>>>> Newer versions of the PWM block use a core clock with external mux,
>>>> divider, and gate. These components either don't exist any longer in
>>>> the PWM block, or they are bypassed.
>>>> To minimize needed changes for supporting the new version, the internal
>>>> divider and gate should be handled by CCF too.
>>>>
>>>> I didn't see a good way to split the patch, therefore it's somewhat
>>>> bigger. What it does:
>>>>
>>>> - The internal mux is handled by CCF already. Register also internal
>>>>   divider and gate with CCF, so that we have one representation of the
>>>>   input clock: [mux] parent of [divider] parent of [gate]
>>>>   
>>>> - Now that CCF selects an appropriate mux parent, we don't need the
>>>>   DT-provided default parent any longer. Accordingly we can also omit
>>>>   setting the mux parent directly in the driver.
>>>>   
>>>> - Instead of manually handling the pre-div divider value, let CCF
>>>>   set the input clock. Targeted input clock frequency is
>>>>   0xffff * 1/period for best precision.
>>>>   
>>>> - For the "inverted pwm disabled" scenario target an input clock
>>>>   frequency of 1GHz. This ensures that the remaining low pulses
>>>>   have minimum length.
>>>>
>>>> I don't have hw with the old PWM block, therefore I couldn't test this
>>>> patch. With the not yet included extension for the new PWM block
>>>> (channel->clk coming directly from get_clk(external_clk)) I didn't
>>>> notice any problem. My system uses PWM for the CPU voltage regulator
>>>> and for the SDIO 32kHz clock.
>>>>
>>>> Note: The clock gate in the old PWM block is permanently disabled.
>>>> This seems to indicate that it's not used by the new PWM block.
>>>>
>>>> Tested-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
>>>> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
>>>> ---
>>>> Changes to RFT/RFC version:
>>>> - use parent_hws instead of parent_names for div/gate clock
>>>> - use devm_clk_hw_register where the struct clk * returned by
>>>>   devm_clk_register isn't needed
>>>>
>>>> v2:
>>>> - add patch 1
>>>> - add patch 3
>>>> - switch to using clk_parent_data in all relevant places
>>>> v3:
>>>> - add flag CLK_IGNORE_UNUSED
>>>> v4:
>>>> - remove variable tmp in meson_pwm_get_state
>>>> - don't use deprecated function devm_clk_register
>>>
>>> [...]
>>>
>>>> @@ -166,7 +158,11 @@ static int meson_pwm_calc(struct meson_pwm *meson, struct pwm_device *pwm,
>>>>  	if (state->polarity == PWM_POLARITY_INVERSED)
>>>>  		duty = period - duty;
>>>>  
>>>> -	fin_freq = clk_get_rate(channel->clk);
>>>> +	freq = div64_u64(NSEC_PER_SEC * (u64)0xffff, period);
>>>> +	if (freq > ULONG_MAX)
>>>> +		freq = ULONG_MAX;
>>>> +
>>>> +	fin_freq = clk_round_rate(channel->clk, freq);
>>>>  	if (fin_freq == 0) {
>>>>  		dev_err(meson->chip.dev, "invalid source clock frequency\n");
>>>>  		return -EINVAL;
>>>
>>> As mentioned previously, we have discussed one optimization for PWM
>>> parent clock calculation. Many modern Amlogic SoCs include an RTC clock
>>> within the clock tree. This clock provides a stable and efficient 32kHz
>>> input for several clock objects that can be inherited through the muxes
>>> from the RTC clock.
>>>
>>> In short, we aim to use the RTC clock parent directly for PWM to
>>> generate a 32kHz clock on the PWM lines. Martin has suggested one way to
>>> do so, which is described in [0]. You can also refer to our IRC
>>> discussion in [1].
>>>
>>> I would appreciate your thoughts on this. Please let me know what you
>>> think.
>>>
>>
>> Requesting a frequency of (NSEC_PER_SEC * 0xffffULL / period) is based
>> on the assumption that the highest possible input frequency always
>> allows to generate a period that is close enough to the requested period.
>>
>> To find the best parent you'd need a somewhat more complex logic outside CCF.
>> What you want is the parent where (f_parent * period / NSEC_PER_SEC) is
>> closest to an integer in the range 1 .. 0xffff.
>> IOW: max(abs((f_parent * period) % 10^9 - 5 * 10^8))
>>
>> This can be done, question is whether it's needed and worth the effort.
>>
>> This would be the generic solution. If you just want to handle the case
>> that period 1/32.768Hz is requested, an adjusted version of Martins's
>> pseudo-code formula should do.
>> Best I think as follow-up to my series.
>>
> 
> Certainly, if possible, please include this special case in the next
> version of your series. Appreciate it!
> 

The currently supported SoC generations don't support the RTC PWM mux
input. Changes for not yet supported SoC generations I'd like to keep
outside the scope of the CCF conversion patch series.
Such a change could be part of a series adding A1 support.
However it's good that that the type of needed change is being discussed
now, better than potentially finding out later that the CCF conversion
is incompatible with what's needed to support newer SoC generations.

For my understanding:
A1 like S4 and SC2 has the PWM clock handling removed from the PWM block?

> [...]
>
Dmitry Rokosov May 23, 2023, 10:28 a.m. UTC | #20
Heiner,

On Mon, May 22, 2023 at 10:10:41PM +0200, Heiner Kallweit wrote:
> On 22.05.2023 15:37, Dmitry Rokosov wrote:
> > Heiner,
> > 
> > On Fri, May 19, 2023 at 06:53:30PM +0200, Heiner Kallweit wrote:
> >> On 19.05.2023 17:30, Dmitry Rokosov wrote:
> >>> Hello Heiner,
> >>>
> >>> Thank you for the patch series!
> >>>
> >>> I am currently working on the Amlogic A1 clock driver and other
> >>> peripheral devices, including PWM. During a discussion about the clock
> >>> driver with Martin Blumenstingl, we found an intersection between the
> >>> clock driver and your PWM CCF support patch series. Please see my
> >>> comments below.
> >>>
> >>> On Thu, Apr 13, 2023 at 07:54:46AM +0200, Heiner Kallweit wrote:
> >>>> Newer versions of the PWM block use a core clock with external mux,
> >>>> divider, and gate. These components either don't exist any longer in
> >>>> the PWM block, or they are bypassed.
> >>>> To minimize needed changes for supporting the new version, the internal
> >>>> divider and gate should be handled by CCF too.
> >>>>
> >>>> I didn't see a good way to split the patch, therefore it's somewhat
> >>>> bigger. What it does:
> >>>>
> >>>> - The internal mux is handled by CCF already. Register also internal
> >>>>   divider and gate with CCF, so that we have one representation of the
> >>>>   input clock: [mux] parent of [divider] parent of [gate]
> >>>>   
> >>>> - Now that CCF selects an appropriate mux parent, we don't need the
> >>>>   DT-provided default parent any longer. Accordingly we can also omit
> >>>>   setting the mux parent directly in the driver.
> >>>>   
> >>>> - Instead of manually handling the pre-div divider value, let CCF
> >>>>   set the input clock. Targeted input clock frequency is
> >>>>   0xffff * 1/period for best precision.
> >>>>   
> >>>> - For the "inverted pwm disabled" scenario target an input clock
> >>>>   frequency of 1GHz. This ensures that the remaining low pulses
> >>>>   have minimum length.
> >>>>
> >>>> I don't have hw with the old PWM block, therefore I couldn't test this
> >>>> patch. With the not yet included extension for the new PWM block
> >>>> (channel->clk coming directly from get_clk(external_clk)) I didn't
> >>>> notice any problem. My system uses PWM for the CPU voltage regulator
> >>>> and for the SDIO 32kHz clock.
> >>>>
> >>>> Note: The clock gate in the old PWM block is permanently disabled.
> >>>> This seems to indicate that it's not used by the new PWM block.
> >>>>
> >>>> Tested-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
> >>>> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
> >>>> ---
> >>>> Changes to RFT/RFC version:
> >>>> - use parent_hws instead of parent_names for div/gate clock
> >>>> - use devm_clk_hw_register where the struct clk * returned by
> >>>>   devm_clk_register isn't needed
> >>>>
> >>>> v2:
> >>>> - add patch 1
> >>>> - add patch 3
> >>>> - switch to using clk_parent_data in all relevant places
> >>>> v3:
> >>>> - add flag CLK_IGNORE_UNUSED
> >>>> v4:
> >>>> - remove variable tmp in meson_pwm_get_state
> >>>> - don't use deprecated function devm_clk_register
> >>>
> >>> [...]
> >>>
> >>>> @@ -166,7 +158,11 @@ static int meson_pwm_calc(struct meson_pwm *meson, struct pwm_device *pwm,
> >>>>  	if (state->polarity == PWM_POLARITY_INVERSED)
> >>>>  		duty = period - duty;
> >>>>  
> >>>> -	fin_freq = clk_get_rate(channel->clk);
> >>>> +	freq = div64_u64(NSEC_PER_SEC * (u64)0xffff, period);
> >>>> +	if (freq > ULONG_MAX)
> >>>> +		freq = ULONG_MAX;
> >>>> +
> >>>> +	fin_freq = clk_round_rate(channel->clk, freq);
> >>>>  	if (fin_freq == 0) {
> >>>>  		dev_err(meson->chip.dev, "invalid source clock frequency\n");
> >>>>  		return -EINVAL;
> >>>
> >>> As mentioned previously, we have discussed one optimization for PWM
> >>> parent clock calculation. Many modern Amlogic SoCs include an RTC clock
> >>> within the clock tree. This clock provides a stable and efficient 32kHz
> >>> input for several clock objects that can be inherited through the muxes
> >>> from the RTC clock.
> >>>
> >>> In short, we aim to use the RTC clock parent directly for PWM to
> >>> generate a 32kHz clock on the PWM lines. Martin has suggested one way to
> >>> do so, which is described in [0]. You can also refer to our IRC
> >>> discussion in [1].
> >>>
> >>> I would appreciate your thoughts on this. Please let me know what you
> >>> think.
> >>>
> >>
> >> Requesting a frequency of (NSEC_PER_SEC * 0xffffULL / period) is based
> >> on the assumption that the highest possible input frequency always
> >> allows to generate a period that is close enough to the requested period.
> >>
> >> To find the best parent you'd need a somewhat more complex logic outside CCF.
> >> What you want is the parent where (f_parent * period / NSEC_PER_SEC) is
> >> closest to an integer in the range 1 .. 0xffff.
> >> IOW: max(abs((f_parent * period) % 10^9 - 5 * 10^8))
> >>
> >> This can be done, question is whether it's needed and worth the effort.
> >>
> >> This would be the generic solution. If you just want to handle the case
> >> that period 1/32.768Hz is requested, an adjusted version of Martins's
> >> pseudo-code formula should do.
> >> Best I think as follow-up to my series.
> >>
> > 
> > Certainly, if possible, please include this special case in the next
> > version of your series. Appreciate it!
> > 
> 
> The currently supported SoC generations don't support the RTC PWM mux
> input. Changes for not yet supported SoC generations I'd like to keep
> outside the scope of the CCF conversion patch series.
> Such a change could be part of a series adding A1 support.
> However it's good that that the type of needed change is being discussed
> now, better than potentially finding out later that the CCF conversion
> is incompatible with what's needed to support newer SoC generations.
> 
> For my understanding:
> A1 like S4 and SC2 has the PWM clock handling removed from the PWM block?
> 

Yes, that's correct. PWM clocks are external to A1 and S4 pwm block, and
they are currently located in the Peripherals clock controller. We made
some changes to the PWM driver to support this behavior in the current
version, which is very similar to your 'ext_clk' patchset. However, we
did not send it because of your patch series with fully CCF support.
Heiner Kallweit May 23, 2023, 7:22 p.m. UTC | #21
On 23.05.2023 12:28, Dmitry Rokosov wrote:
> Heiner,
> 
> On Mon, May 22, 2023 at 10:10:41PM +0200, Heiner Kallweit wrote:
>> On 22.05.2023 15:37, Dmitry Rokosov wrote:
>>> Heiner,
>>>
>>> On Fri, May 19, 2023 at 06:53:30PM +0200, Heiner Kallweit wrote:
>>>> On 19.05.2023 17:30, Dmitry Rokosov wrote:
>>>>> Hello Heiner,
>>>>>
>>>>> Thank you for the patch series!
>>>>>
>>>>> I am currently working on the Amlogic A1 clock driver and other
>>>>> peripheral devices, including PWM. During a discussion about the clock
>>>>> driver with Martin Blumenstingl, we found an intersection between the
>>>>> clock driver and your PWM CCF support patch series. Please see my
>>>>> comments below.
>>>>>
>>>>> On Thu, Apr 13, 2023 at 07:54:46AM +0200, Heiner Kallweit wrote:
>>>>>> Newer versions of the PWM block use a core clock with external mux,
>>>>>> divider, and gate. These components either don't exist any longer in
>>>>>> the PWM block, or they are bypassed.
>>>>>> To minimize needed changes for supporting the new version, the internal
>>>>>> divider and gate should be handled by CCF too.
>>>>>>
>>>>>> I didn't see a good way to split the patch, therefore it's somewhat
>>>>>> bigger. What it does:
>>>>>>
>>>>>> - The internal mux is handled by CCF already. Register also internal
>>>>>>   divider and gate with CCF, so that we have one representation of the
>>>>>>   input clock: [mux] parent of [divider] parent of [gate]
>>>>>>   
>>>>>> - Now that CCF selects an appropriate mux parent, we don't need the
>>>>>>   DT-provided default parent any longer. Accordingly we can also omit
>>>>>>   setting the mux parent directly in the driver.
>>>>>>   
>>>>>> - Instead of manually handling the pre-div divider value, let CCF
>>>>>>   set the input clock. Targeted input clock frequency is
>>>>>>   0xffff * 1/period for best precision.
>>>>>>   
>>>>>> - For the "inverted pwm disabled" scenario target an input clock
>>>>>>   frequency of 1GHz. This ensures that the remaining low pulses
>>>>>>   have minimum length.
>>>>>>
>>>>>> I don't have hw with the old PWM block, therefore I couldn't test this
>>>>>> patch. With the not yet included extension for the new PWM block
>>>>>> (channel->clk coming directly from get_clk(external_clk)) I didn't
>>>>>> notice any problem. My system uses PWM for the CPU voltage regulator
>>>>>> and for the SDIO 32kHz clock.
>>>>>>
>>>>>> Note: The clock gate in the old PWM block is permanently disabled.
>>>>>> This seems to indicate that it's not used by the new PWM block.
>>>>>>
>>>>>> Tested-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
>>>>>> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
>>>>>> ---
>>>>>> Changes to RFT/RFC version:
>>>>>> - use parent_hws instead of parent_names for div/gate clock
>>>>>> - use devm_clk_hw_register where the struct clk * returned by
>>>>>>   devm_clk_register isn't needed
>>>>>>
>>>>>> v2:
>>>>>> - add patch 1
>>>>>> - add patch 3
>>>>>> - switch to using clk_parent_data in all relevant places
>>>>>> v3:
>>>>>> - add flag CLK_IGNORE_UNUSED
>>>>>> v4:
>>>>>> - remove variable tmp in meson_pwm_get_state
>>>>>> - don't use deprecated function devm_clk_register
>>>>>
>>>>> [...]
>>>>>
>>>>>> @@ -166,7 +158,11 @@ static int meson_pwm_calc(struct meson_pwm *meson, struct pwm_device *pwm,
>>>>>>  	if (state->polarity == PWM_POLARITY_INVERSED)
>>>>>>  		duty = period - duty;
>>>>>>  
>>>>>> -	fin_freq = clk_get_rate(channel->clk);
>>>>>> +	freq = div64_u64(NSEC_PER_SEC * (u64)0xffff, period);
>>>>>> +	if (freq > ULONG_MAX)
>>>>>> +		freq = ULONG_MAX;
>>>>>> +
>>>>>> +	fin_freq = clk_round_rate(channel->clk, freq);
>>>>>>  	if (fin_freq == 0) {
>>>>>>  		dev_err(meson->chip.dev, "invalid source clock frequency\n");
>>>>>>  		return -EINVAL;
>>>>>
>>>>> As mentioned previously, we have discussed one optimization for PWM
>>>>> parent clock calculation. Many modern Amlogic SoCs include an RTC clock
>>>>> within the clock tree. This clock provides a stable and efficient 32kHz
>>>>> input for several clock objects that can be inherited through the muxes
>>>>> from the RTC clock.
>>>>>
>>>>> In short, we aim to use the RTC clock parent directly for PWM to
>>>>> generate a 32kHz clock on the PWM lines. Martin has suggested one way to
>>>>> do so, which is described in [0]. You can also refer to our IRC
>>>>> discussion in [1].
>>>>>
>>>>> I would appreciate your thoughts on this. Please let me know what you
>>>>> think.
>>>>>
>>>>
>>>> Requesting a frequency of (NSEC_PER_SEC * 0xffffULL / period) is based
>>>> on the assumption that the highest possible input frequency always
>>>> allows to generate a period that is close enough to the requested period.
>>>>
>>>> To find the best parent you'd need a somewhat more complex logic outside CCF.
>>>> What you want is the parent where (f_parent * period / NSEC_PER_SEC) is
>>>> closest to an integer in the range 1 .. 0xffff.
>>>> IOW: max(abs((f_parent * period) % 10^9 - 5 * 10^8))
>>>>
>>>> This can be done, question is whether it's needed and worth the effort.
>>>>
>>>> This would be the generic solution. If you just want to handle the case
>>>> that period 1/32.768Hz is requested, an adjusted version of Martins's
>>>> pseudo-code formula should do.
>>>> Best I think as follow-up to my series.
>>>>
>>>
>>> Certainly, if possible, please include this special case in the next
>>> version of your series. Appreciate it!
>>>
>>
>> The currently supported SoC generations don't support the RTC PWM mux
>> input. Changes for not yet supported SoC generations I'd like to keep
>> outside the scope of the CCF conversion patch series.
>> Such a change could be part of a series adding A1 support.
>> However it's good that that the type of needed change is being discussed
>> now, better than potentially finding out later that the CCF conversion
>> is incompatible with what's needed to support newer SoC generations.
>>
>> For my understanding:
>> A1 like S4 and SC2 has the PWM clock handling removed from the PWM block?
>>
> 
> Yes, that's correct. PWM clocks are external to A1 and S4 pwm block, and
> they are currently located in the Peripherals clock controller. We made
> some changes to the PWM driver to support this behavior in the current
> version, which is very similar to your 'ext_clk' patchset. However, we
> did not send it because of your patch series with fully CCF support.
> 

Once CCF is fully supported, the extension needed to support an external
PWM input clock is very small.
Before submitting the (hopefully) final version of the CCF conversion
patch set, few pending patches should have been applied, however this
seems to take some time.
diff mbox series

Patch

diff --git a/drivers/pwm/pwm-meson.c b/drivers/pwm/pwm-meson.c
index 40a8709ff..80ac71cbc 100644
--- a/drivers/pwm/pwm-meson.c
+++ b/drivers/pwm/pwm-meson.c
@@ -51,7 +51,7 @@ 
 #define REG_MISC_AB		0x8
 #define MISC_B_CLK_EN		23
 #define MISC_A_CLK_EN		15
-#define MISC_CLK_DIV_MASK	0x7f
+#define MISC_CLK_DIV_WIDTH	7
 #define MISC_B_CLK_DIV_SHIFT	16
 #define MISC_A_CLK_DIV_SHIFT	8
 #define MISC_B_CLK_SEL_SHIFT	6
@@ -87,12 +87,13 @@  static struct meson_pwm_channel_data {
 };
 
 struct meson_pwm_channel {
+	unsigned long rate;
 	unsigned int hi;
 	unsigned int lo;
-	u8 pre_div;
 
-	struct clk *clk_parent;
 	struct clk_mux mux;
+	struct clk_divider div;
+	struct clk_gate gate;
 	struct clk *clk;
 };
 
@@ -125,16 +126,6 @@  static int meson_pwm_request(struct pwm_chip *chip, struct pwm_device *pwm)
 	struct device *dev = chip->dev;
 	int err;
 
-	if (channel->clk_parent) {
-		err = clk_set_parent(channel->clk, channel->clk_parent);
-		if (err < 0) {
-			dev_err(dev, "failed to set parent %s for %s: %d\n",
-				__clk_get_name(channel->clk_parent),
-				__clk_get_name(channel->clk), err);
-			return err;
-		}
-	}
-
 	err = clk_prepare_enable(channel->clk);
 	if (err < 0) {
 		dev_err(dev, "failed to enable clock %s: %d\n",
@@ -157,8 +148,9 @@  static int meson_pwm_calc(struct meson_pwm *meson, struct pwm_device *pwm,
 			  const struct pwm_state *state)
 {
 	struct meson_pwm_channel *channel = &meson->channels[pwm->hwpwm];
-	unsigned int duty, period, pre_div, cnt, duty_cnt;
+	unsigned int duty, period, cnt, duty_cnt;
 	unsigned long fin_freq;
+	u64 freq;
 
 	duty = state->duty_cycle;
 	period = state->period;
@@ -166,7 +158,11 @@  static int meson_pwm_calc(struct meson_pwm *meson, struct pwm_device *pwm,
 	if (state->polarity == PWM_POLARITY_INVERSED)
 		duty = period - duty;
 
-	fin_freq = clk_get_rate(channel->clk);
+	freq = div64_u64(NSEC_PER_SEC * (u64)0xffff, period);
+	if (freq > ULONG_MAX)
+		freq = ULONG_MAX;
+
+	fin_freq = clk_round_rate(channel->clk, freq);
 	if (fin_freq == 0) {
 		dev_err(meson->chip.dev, "invalid source clock frequency\n");
 		return -EINVAL;
@@ -174,46 +170,35 @@  static int meson_pwm_calc(struct meson_pwm *meson, struct pwm_device *pwm,
 
 	dev_dbg(meson->chip.dev, "fin_freq: %lu Hz\n", fin_freq);
 
-	pre_div = div64_u64(fin_freq * (u64)period, NSEC_PER_SEC * 0xffffLL);
-	if (pre_div > MISC_CLK_DIV_MASK) {
-		dev_err(meson->chip.dev, "unable to get period pre_div\n");
-		return -EINVAL;
-	}
-
-	cnt = div64_u64(fin_freq * (u64)period, NSEC_PER_SEC * (pre_div + 1));
+	cnt = div64_u64(fin_freq * (u64)period, NSEC_PER_SEC);
 	if (cnt > 0xffff) {
 		dev_err(meson->chip.dev, "unable to get period cnt\n");
 		return -EINVAL;
 	}
 
-	dev_dbg(meson->chip.dev, "period=%u pre_div=%u cnt=%u\n", period,
-		pre_div, cnt);
+	dev_dbg(meson->chip.dev, "period=%u cnt=%u\n", period, cnt);
 
 	if (duty == period) {
-		channel->pre_div = pre_div;
 		channel->hi = cnt;
 		channel->lo = 0;
 	} else if (duty == 0) {
-		channel->pre_div = pre_div;
 		channel->hi = 0;
 		channel->lo = cnt;
 	} else {
-		/* Then check is we can have the duty with the same pre_div */
-		duty_cnt = div64_u64(fin_freq * (u64)duty,
-				     NSEC_PER_SEC * (pre_div + 1));
+		duty_cnt = div64_u64(fin_freq * (u64)duty, NSEC_PER_SEC);
 		if (duty_cnt > 0xffff) {
 			dev_err(meson->chip.dev, "unable to get duty cycle\n");
 			return -EINVAL;
 		}
 
-		dev_dbg(meson->chip.dev, "duty=%u pre_div=%u duty_cnt=%u\n",
-			duty, pre_div, duty_cnt);
+		dev_dbg(meson->chip.dev, "duty=%u duty_cnt=%u\n", duty, duty_cnt);
 
-		channel->pre_div = pre_div;
 		channel->hi = duty_cnt;
 		channel->lo = cnt - duty_cnt;
 	}
 
+	channel->rate = fin_freq;
+
 	return 0;
 }
 
@@ -223,16 +208,15 @@  static void meson_pwm_enable(struct meson_pwm *meson, struct pwm_device *pwm)
 	struct meson_pwm_channel_data *channel_data;
 	unsigned long flags;
 	u32 value;
+	int err;
 
 	channel_data = &meson_pwm_per_channel_data[pwm->hwpwm];
 
-	spin_lock_irqsave(&meson->lock, flags);
+	err = clk_set_rate(channel->clk, channel->rate);
+	if (err)
+		dev_err(meson->chip.dev, "setting clock rate failed\n");
 
-	value = readl(meson->base + REG_MISC_AB);
-	value &= ~(MISC_CLK_DIV_MASK << channel_data->clk_div_shift);
-	value |= channel->pre_div << channel_data->clk_div_shift;
-	value |= BIT(channel_data->clk_en_bit);
-	writel(value, meson->base + REG_MISC_AB);
+	spin_lock_irqsave(&meson->lock, flags);
 
 	value = FIELD_PREP(PWM_HIGH_MASK, channel->hi) |
 		FIELD_PREP(PWM_LOW_MASK, channel->lo);
@@ -271,16 +255,16 @@  static int meson_pwm_apply(struct pwm_chip *chip, struct pwm_device *pwm,
 			/*
 			 * This IP block revision doesn't have an "always high"
 			 * setting which we can use for "inverted disabled".
-			 * Instead we achieve this using the same settings
-			 * that we use a pre_div of 0 (to get the shortest
-			 * possible duration for one "count") and
+			 * Instead we achieve this by setting an arbitrary,
+			 * very high frequency, resulting in the shortest
+			 * possible duration for one "count" and
 			 * "period == duty_cycle". This results in a signal
 			 * which is LOW for one "count", while being HIGH for
 			 * the rest of the (so the signal is HIGH for slightly
 			 * less than 100% of the period, but this is the best
 			 * we can achieve).
 			 */
-			channel->pre_div = 0;
+			channel->rate = 1000000000;
 			channel->hi = ~0;
 			channel->lo = 0;
 
@@ -305,7 +289,6 @@  static unsigned int meson_pwm_cnt_to_ns(struct pwm_chip *chip,
 	struct meson_pwm *meson = to_meson_pwm(chip);
 	struct meson_pwm_channel *channel;
 	unsigned long fin_freq;
-	u32 fin_ns;
 
 	/* to_meson_pwm() can only be used after .get_state() is called */
 	channel = &meson->channels[pwm->hwpwm];
@@ -314,9 +297,7 @@  static unsigned int meson_pwm_cnt_to_ns(struct pwm_chip *chip,
 	if (fin_freq == 0)
 		return 0;
 
-	fin_ns = div_u64(NSEC_PER_SEC, fin_freq);
-
-	return cnt * fin_ns * (channel->pre_div + 1);
+	return div_u64(NSEC_PER_SEC * (u64)cnt, fin_freq);
 }
 
 static int meson_pwm_get_state(struct pwm_chip *chip, struct pwm_device *pwm,
@@ -325,7 +306,7 @@  static int meson_pwm_get_state(struct pwm_chip *chip, struct pwm_device *pwm,
 	struct meson_pwm *meson = to_meson_pwm(chip);
 	struct meson_pwm_channel_data *channel_data;
 	struct meson_pwm_channel *channel;
-	u32 value, tmp;
+	u32 value;
 
 	if (!state)
 		return 0;
@@ -335,11 +316,8 @@  static int meson_pwm_get_state(struct pwm_chip *chip, struct pwm_device *pwm,
 
 	value = readl(meson->base + REG_MISC_AB);
 
-	tmp = BIT(channel_data->pwm_en_bit) | BIT(channel_data->clk_en_bit);
-	state->enabled = (value & tmp) == tmp;
-
-	tmp = value >> channel_data->clk_div_shift;
-	channel->pre_div = FIELD_GET(MISC_CLK_DIV_MASK, tmp);
+	state->enabled = __clk_is_enabled(channel->clk) &&
+			 value & BIT(channel_data->pwm_en_bit);
 
 	value = readl(meson->base + channel_data->reg_offset);
 
@@ -480,6 +458,7 @@  static int meson_pwm_init_channels(struct meson_pwm *meson)
 
 	for (i = 0; i < meson->chip.npwm; i++) {
 		struct meson_pwm_channel *channel = &meson->channels[i];
+		struct clk_parent_data div_parent = {}, gate_parent = {};
 		struct clk_init_data init = {};
 
 		snprintf(name, sizeof(name), "%s#mux%u", dev_name(dev), i);
@@ -499,18 +478,63 @@  static int meson_pwm_init_channels(struct meson_pwm *meson)
 		channel->mux.table = NULL;
 		channel->mux.hw.init = &init;
 
-		channel->clk = devm_clk_register(dev, &channel->mux.hw);
-		if (IS_ERR(channel->clk)) {
-			err = PTR_ERR(channel->clk);
+		err = devm_clk_hw_register(dev, &channel->mux.hw);
+		if (err) {
 			dev_err(dev, "failed to register %s: %d\n", name, err);
 			return err;
 		}
 
-		snprintf(name, sizeof(name), "clkin%u", i);
+		snprintf(name, sizeof(name), "%s#div%u", dev_name(dev), i);
+
+		init.name = name;
+		init.ops = &clk_divider_ops;
+		init.flags = CLK_SET_RATE_PARENT;
+		div_parent.index = -1;
+		div_parent.hw = &channel->mux.hw;
+		init.parent_data = &div_parent;
+		init.num_parents = 1;
+
+		channel->div.reg = meson->base + REG_MISC_AB;
+		channel->div.shift = meson_pwm_per_channel_data[i].clk_div_shift;
+		channel->div.width = MISC_CLK_DIV_WIDTH;
+		channel->div.hw.init = &init;
+		channel->div.flags = 0;
+		channel->div.lock = &meson->lock;
+
+		err = devm_clk_hw_register(dev, &channel->div.hw);
+		if (err) {
+			dev_err(dev, "failed to register %s: %d\n", name, err);
+			return err;
+		}
+
+		snprintf(name, sizeof(name), "%s#gate%u", dev_name(dev), i);
+
+		init.name = name;
+		init.ops = &clk_gate_ops;
+		init.flags = CLK_SET_RATE_PARENT | CLK_IGNORE_UNUSED;
+		gate_parent.index = -1;
+		gate_parent.hw = &channel->div.hw;
+		init.parent_data = &gate_parent;
+		init.num_parents = 1;
+
+		channel->gate.reg = meson->base + REG_MISC_AB;
+		channel->gate.bit_idx = meson_pwm_per_channel_data[i].clk_en_bit;
+		channel->gate.hw.init = &init;
+		channel->gate.flags = 0;
+		channel->gate.lock = &meson->lock;
+
+		err = devm_clk_hw_register(dev, &channel->gate.hw);
+		if (err) {
+			dev_err(dev, "failed to register %s: %d\n", name, err);
+			return err;
+		}
 
-		channel->clk_parent = devm_clk_get_optional(dev, name);
-		if (IS_ERR(channel->clk_parent))
-			return PTR_ERR(channel->clk_parent);
+		channel->clk = devm_clk_hw_get_clk(dev, &channel->gate.hw, NULL);
+		if (IS_ERR(channel->clk)) {
+			err = PTR_ERR(channel->clk);
+			dev_err(dev, "failed to register %s: %d\n", name, err);
+			return err;
+		}
 	}
 
 	return 0;