diff mbox series

[v7,2/3] PCI: brcmstb: Configure HW CLKREQ# mode appropriate for downstream device

Message ID 20231109191355.27738-3-james.quinlan@broadcom.com (mailing list archive)
State New, archived
Headers show
Series PCI: brcmstb: Configure appropriate HW CLKREQ# mode | expand

Commit Message

Jim Quinlan Nov. 9, 2023, 7:13 p.m. UTC
The Broadcom STB/CM PCIe HW core, which is also used in RPi SOCs, must be
deliberately set by the PCIe RC HW into one of three mutually exclusive
modes:

"safe" -- No CLKREQ# expected or required, refclk is always provided.  This
    mode should work for all devices but is not be capable of any refclk
    power savings.

"no-l1ss" -- CLKREQ# is expected to be driven by the downstream device for
    CPM and ASPM L0s and L1.  Provides Clock Power Management, L0s, and L1,
    but cannot provide L1 substate (L1SS) power savings. If the downstream
    device connected to the RC is L1SS capable AND the OS enables L1SS, all
    PCIe traffic may abruptly halt, potentially hanging the system.

"default" -- Bidirectional CLKREQ# between the RC and downstream device.
    Provides ASPM L0s, L1, and L1SS, but not compliant to provide Clock
    Power Management; specifically, may not be able to meet the Tclron max
    timing of 400ns as specified in "Dynamic Clock Control", section
    3.2.5.2.2 of the PCIe spec.  This situation is atypical and should
    happen only with older devices.

Previously, this driver always set the mode to "no-l1ss", as almost all
STB/CM boards operate in this mode.  But now there is interest in
activating L1SS power savings from STB/CM customers, which requires "aspm"
mode.  In addition, a bug was filed for RPi4 CM platform because most
devices did not work in "no-l1ss" mode.

Note that the mode is specified by the DT property "brcm,clkreq-mode".  If
this property is omitted, then "default" mode is chosen.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=217276

Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com>
---
 drivers/pci/controller/pcie-brcmstb.c | 65 ++++++++++++++++++++++-----
 1 file changed, 55 insertions(+), 10 deletions(-)

Comments

Bjorn Helgaas Nov. 9, 2023, 9:27 p.m. UTC | #1
On Thu, Nov 09, 2023 at 02:13:53PM -0500, Jim Quinlan wrote:
> The Broadcom STB/CM PCIe HW core, which is also used in RPi SOCs, must be
> deliberately set by the PCIe RC HW into one of three mutually exclusive
> modes:
> 
> "safe" -- No CLKREQ# expected or required, refclk is always provided.  This
>     mode should work for all devices but is not be capable of any refclk
>     power savings.
> 
> "no-l1ss" -- CLKREQ# is expected to be driven by the downstream device for
>     CPM and ASPM L0s and L1.  Provides Clock Power Management, L0s, and L1,
>     but cannot provide L1 substate (L1SS) power savings. If the downstream
>     device connected to the RC is L1SS capable AND the OS enables L1SS, all
>     PCIe traffic may abruptly halt, potentially hanging the system.
> 
> "default" -- Bidirectional CLKREQ# between the RC and downstream device.
>     Provides ASPM L0s, L1, and L1SS, but not compliant to provide Clock
>     Power Management; specifically, may not be able to meet the Tclron max
>     timing of 400ns as specified in "Dynamic Clock Control", section
>     3.2.5.2.2 of the PCIe spec.  This situation is atypical and should
>     happen only with older devices.

The PCIe base spec r6.0 has no section 3.2.5.2.2.  Looks like this
could be:

  PCIe Mini CEM r2.1, sec 3.2.5.2.2 (December, 2016), or
  PCIe CEM r5.1, sec 2.8.2 (August, 2023)

I don't know the relationship between the "Mini CEM" and the "CEM"
specs, but CEM r5.1 seems to have the same text as the Mini CEM r2.1
about Dynamic Clock Control.

We're hampered by the lack of clear subscripts here, but the text in
both capitalizes the "CRL" part, e.g., "T_CLRon".

> Previously, this driver always set the mode to "no-l1ss", as almost all
> STB/CM boards operate in this mode.  But now there is interest in
> activating L1SS power savings from STB/CM customers, which requires "aspm"
> mode.  In addition, a bug was filed for RPi4 CM platform because most
> devices did not work in "no-l1ss" mode.
> 
> Note that the mode is specified by the DT property "brcm,clkreq-mode".  If
> this property is omitted, then "default" mode is chosen.
> 
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=217276
> 
> Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com>
> ---
>  drivers/pci/controller/pcie-brcmstb.c | 65 ++++++++++++++++++++++-----
>  1 file changed, 55 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/pci/controller/pcie-brcmstb.c b/drivers/pci/controller/pcie-brcmstb.c
> index f9dd6622fe10..f45c5d0168d3 100644
> --- a/drivers/pci/controller/pcie-brcmstb.c
> +++ b/drivers/pci/controller/pcie-brcmstb.c
> @@ -121,9 +121,12 @@
>  
>  #define PCIE_MISC_HARD_PCIE_HARD_DEBUG					0x4204
>  #define  PCIE_MISC_HARD_PCIE_HARD_DEBUG_CLKREQ_DEBUG_ENABLE_MASK	0x2
> +#define  PCIE_MISC_HARD_PCIE_HARD_DEBUG_L1SS_ENABLE_MASK		0x200000
>  #define  PCIE_MISC_HARD_PCIE_HARD_DEBUG_SERDES_IDDQ_MASK		0x08000000
>  #define  PCIE_BMIPS_MISC_HARD_PCIE_HARD_DEBUG_SERDES_IDDQ_MASK		0x00800000
> -
> +#define  PCIE_CLKREQ_MASK \
> +	  (PCIE_MISC_HARD_PCIE_HARD_DEBUG_CLKREQ_DEBUG_ENABLE_MASK | \
> +	   PCIE_MISC_HARD_PCIE_HARD_DEBUG_L1SS_ENABLE_MASK)
>  
>  #define PCIE_INTR2_CPU_BASE		0x4300
>  #define PCIE_MSI_INTR2_BASE		0x4500
> @@ -1028,13 +1031,61 @@ static int brcm_pcie_setup(struct brcm_pcie *pcie)
>  	return 0;
>  }
>  
> +static void brcm_config_clkreq(struct brcm_pcie *pcie)
> +{
> +	static const char err_msg[] = "invalid 'brcm,clkreq-mode' DT string\n";
> +	const char *mode = "default";
> +	u32 clkreq_cntl;
> +	int ret;
> +
> +	ret = of_property_read_string(pcie->np, "brcm,clkreq-mode", &mode);
> +	if (ret && ret != -EINVAL) {
> +		dev_err(pcie->dev, err_msg);
> +		mode = "safe";
> +	}
> +
> +	/* Start out assuming safe mode (both mode bits cleared) */
> +	clkreq_cntl = readl(pcie->base + PCIE_MISC_HARD_PCIE_HARD_DEBUG);
> +	clkreq_cntl &= ~PCIE_CLKREQ_MASK;
> +
> +	if (strcmp(mode, "no-l1ss") == 0) {
> +		/*
> +		 * "no-l1ss" -- Provides Clock Power Management, L0s, and
> +		 * L1, but cannot provide L1 substate (L1SS) power
> +		 * savings. If the downstream device connected to the RC is
> +		 * L1SS capable AND the OS enables L1SS, all PCIe traffic
> +		 * may abruptly halt, potentially hanging the system.
> +		 */
> +		clkreq_cntl |= PCIE_MISC_HARD_PCIE_HARD_DEBUG_CLKREQ_DEBUG_ENABLE_MASK;

Does this somehow change the features advertised by the Root Port,
e.g., does it hide the L1 PM Substates Capability completely, or at
least clear the L1 PM Substates Supported bit?

It it doesn't, the PCI core may enable L1SS and cause this hang.
Every feature advertised in config space is expected to work.

> +	} else if (strcmp(mode, "default") == 0) {
> +		/*
> +		 * "default" -- Provides L0s, L1, and L1SS, but not
> +		 * compliant to provide Clock Power Management;
> +		 * specifically, may not be able to meet the Tclron max
> +		 * timing of 400ns as specified in "Dynamic Clock Control",
> +		 * section 3.2.5.2.2 of the PCIe spec.  This situation is
> +		 * atypical and should happen only with older devices.
> +		 */
> +		clkreq_cntl |= PCIE_MISC_HARD_PCIE_HARD_DEBUG_L1SS_ENABLE_MASK;
> +	} else {
> +		/*
> +		 * "safe" -- No power savings; refclk is driven by RC
> +		 * unconditionally.
> +		 */
> +		if (strcmp(mode, "safe") != 0)
> +			dev_err(pcie->dev, err_msg);
> +		mode = "safe";
> +	}
> +	writel(clkreq_cntl, pcie->base + PCIE_MISC_HARD_PCIE_HARD_DEBUG);
> +	dev_info(pcie->dev, "clkreq-mode set to %s\n", mode);
> +}
> +
>  static int brcm_pcie_start_link(struct brcm_pcie *pcie)
>  {
>  	struct device *dev = pcie->dev;
>  	void __iomem *base = pcie->base;
>  	u16 nlw, cls, lnksta;
>  	bool ssc_good = false;
> -	u32 tmp;
>  	int ret, i;
>  
>  	/* Unassert the fundamental reset */
> @@ -1059,6 +1110,8 @@ static int brcm_pcie_start_link(struct brcm_pcie *pcie)
>  		return -ENODEV;
>  	}
>  
> +	brcm_config_clkreq(pcie);
> +
>  	if (pcie->gen)
>  		brcm_pcie_set_gen(pcie, pcie->gen);
>  
> @@ -1077,14 +1130,6 @@ static int brcm_pcie_start_link(struct brcm_pcie *pcie)
>  		 pci_speed_string(pcie_link_speed[cls]), nlw,
>  		 ssc_good ? "(SSC)" : "(!SSC)");
>  
> -	/*
> -	 * Refclk from RC should be gated with CLKREQ# input when ASPM L0s,L1
> -	 * is enabled => setting the CLKREQ_DEBUG_ENABLE field to 1.
> -	 */
> -	tmp = readl(base + PCIE_MISC_HARD_PCIE_HARD_DEBUG);
> -	tmp |= PCIE_MISC_HARD_PCIE_HARD_DEBUG_CLKREQ_DEBUG_ENABLE_MASK;
> -	writel(tmp, base + PCIE_MISC_HARD_PCIE_HARD_DEBUG);
> -
>  	return 0;
>  }
Jim Quinlan Nov. 9, 2023, 10:06 p.m. UTC | #2
On Thu, Nov 9, 2023 at 4:27 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Thu, Nov 09, 2023 at 02:13:53PM -0500, Jim Quinlan wrote:
> > The Broadcom STB/CM PCIe HW core, which is also used in RPi SOCs, must be
> > deliberately set by the PCIe RC HW into one of three mutually exclusive
> > modes:
> >
> > "safe" -- No CLKREQ# expected or required, refclk is always provided.  This
> >     mode should work for all devices but is not be capable of any refclk
> >     power savings.
> >
> > "no-l1ss" -- CLKREQ# is expected to be driven by the downstream device for
> >     CPM and ASPM L0s and L1.  Provides Clock Power Management, L0s, and L1,
> >     but cannot provide L1 substate (L1SS) power savings. If the downstream
> >     device connected to the RC is L1SS capable AND the OS enables L1SS, all
> >     PCIe traffic may abruptly halt, potentially hanging the system.
> >
> > "default" -- Bidirectional CLKREQ# between the RC and downstream device.
> >     Provides ASPM L0s, L1, and L1SS, but not compliant to provide Clock
> >     Power Management; specifically, may not be able to meet the Tclron max
> >     timing of 400ns as specified in "Dynamic Clock Control", section
> >     3.2.5.2.2 of the PCIe spec.  This situation is atypical and should
> >     happen only with older devices.
>
> The PCIe base spec r6.0 has no section 3.2.5.2.2.  Looks like this
> could be:
>
>   PCIe Mini CEM r2.1, sec 3.2.5.2.2 (December, 2016), or
>   PCIe CEM r5.1, sec 2.8.2 (August, 2023)
>
> I don't know the relationship between the "Mini CEM" and the "CEM"
> specs, but CEM r5.1 seems to have the same text as the Mini CEM r2.1
> about Dynamic Clock Control.
>
> We're hampered by the lack of clear subscripts here, but the text in
> both capitalizes the "CRL" part, e.g., "T_CLRon".
>
> > Previously, this driver always set the mode to "no-l1ss", as almost all
> > STB/CM boards operate in this mode.  But now there is interest in
> > activating L1SS power savings from STB/CM customers, which requires "aspm"
> > mode.  In addition, a bug was filed for RPi4 CM platform because most
> > devices did not work in "no-l1ss" mode.
> >
> > Note that the mode is specified by the DT property "brcm,clkreq-mode".  If
> > this property is omitted, then "default" mode is chosen.
> >
> > Link: https://bugzilla.kernel.org/show_bug.cgi?id=217276
> >
> > Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com>
> > ---
> >  drivers/pci/controller/pcie-brcmstb.c | 65 ++++++++++++++++++++++-----
> >  1 file changed, 55 insertions(+), 10 deletions(-)
> >
> > diff --git a/drivers/pci/controller/pcie-brcmstb.c b/drivers/pci/controller/pcie-brcmstb.c
> > index f9dd6622fe10..f45c5d0168d3 100644
> > --- a/drivers/pci/controller/pcie-brcmstb.c
> > +++ b/drivers/pci/controller/pcie-brcmstb.c
> > @@ -121,9 +121,12 @@
> >
> >  #define PCIE_MISC_HARD_PCIE_HARD_DEBUG                                       0x4204
> >  #define  PCIE_MISC_HARD_PCIE_HARD_DEBUG_CLKREQ_DEBUG_ENABLE_MASK     0x2
> > +#define  PCIE_MISC_HARD_PCIE_HARD_DEBUG_L1SS_ENABLE_MASK             0x200000
> >  #define  PCIE_MISC_HARD_PCIE_HARD_DEBUG_SERDES_IDDQ_MASK             0x08000000
> >  #define  PCIE_BMIPS_MISC_HARD_PCIE_HARD_DEBUG_SERDES_IDDQ_MASK               0x00800000
> > -
> > +#define  PCIE_CLKREQ_MASK \
> > +       (PCIE_MISC_HARD_PCIE_HARD_DEBUG_CLKREQ_DEBUG_ENABLE_MASK | \
> > +        PCIE_MISC_HARD_PCIE_HARD_DEBUG_L1SS_ENABLE_MASK)
> >
> >  #define PCIE_INTR2_CPU_BASE          0x4300
> >  #define PCIE_MSI_INTR2_BASE          0x4500
> > @@ -1028,13 +1031,61 @@ static int brcm_pcie_setup(struct brcm_pcie *pcie)
> >       return 0;
> >  }
> >
> > +static void brcm_config_clkreq(struct brcm_pcie *pcie)
> > +{
> > +     static const char err_msg[] = "invalid 'brcm,clkreq-mode' DT string\n";
> > +     const char *mode = "default";
> > +     u32 clkreq_cntl;
> > +     int ret;
> > +
> > +     ret = of_property_read_string(pcie->np, "brcm,clkreq-mode", &mode);
> > +     if (ret && ret != -EINVAL) {
> > +             dev_err(pcie->dev, err_msg);
> > +             mode = "safe";
> > +     }
> > +
> > +     /* Start out assuming safe mode (both mode bits cleared) */
> > +     clkreq_cntl = readl(pcie->base + PCIE_MISC_HARD_PCIE_HARD_DEBUG);
> > +     clkreq_cntl &= ~PCIE_CLKREQ_MASK;
> > +
> > +     if (strcmp(mode, "no-l1ss") == 0) {
> > +             /*
> > +              * "no-l1ss" -- Provides Clock Power Management, L0s, and
> > +              * L1, but cannot provide L1 substate (L1SS) power
> > +              * savings. If the downstream device connected to the RC is
> > +              * L1SS capable AND the OS enables L1SS, all PCIe traffic
> > +              * may abruptly halt, potentially hanging the system.
> > +              */
> > +             clkreq_cntl |= PCIE_MISC_HARD_PCIE_HARD_DEBUG_CLKREQ_DEBUG_ENABLE_MASK;
>
> Does this somehow change the features advertised by the Root Port,
> e.g., does it hide the L1 PM Substates Capability completely, or at
> least clear the L1 PM Substates Supported bit?

>
> It it doesn't, the PCI core may enable L1SS and cause this hang.
> Every feature advertised in config space is expected to work.
Agree, I'll put this back in and also reconcile your other comments.

BTW, besides the RPi4, I haven't been able to find a Linux platform
where I can do

        echo $POLICY > /sys/module/pcie_aspm/parameters/policy

It seems that the FW/ACPI typically locks this down.  I did see a
comment somewhere that
said that the reason it was locked down is because too many devices
cannot handle it.
FWIW.

Regards,
Jim Quinlan
Broadcom STB/CM

>
> > +     } else if (strcmp(mode, "default") == 0) {
> > +             /*
> > +              * "default" -- Provides L0s, L1, and L1SS, but not
> > +              * compliant to provide Clock Power Management;
> > +              * specifically, may not be able to meet the Tclron max
> > +              * timing of 400ns as specified in "Dynamic Clock Control",
> > +              * section 3.2.5.2.2 of the PCIe spec.  This situation is
> > +              * atypical and should happen only with older devices.
> > +              */
> > +             clkreq_cntl |= PCIE_MISC_HARD_PCIE_HARD_DEBUG_L1SS_ENABLE_MASK;
> > +     } else {
> > +             /*
> > +              * "safe" -- No power savings; refclk is driven by RC
> > +              * unconditionally.
> > +              */
> > +             if (strcmp(mode, "safe") != 0)
> > +                     dev_err(pcie->dev, err_msg);
> > +             mode = "safe";
> > +     }
> > +     writel(clkreq_cntl, pcie->base + PCIE_MISC_HARD_PCIE_HARD_DEBUG);
> > +     dev_info(pcie->dev, "clkreq-mode set to %s\n", mode);
> > +}
> > +
> >  static int brcm_pcie_start_link(struct brcm_pcie *pcie)
> >  {
> >       struct device *dev = pcie->dev;
> >       void __iomem *base = pcie->base;
> >       u16 nlw, cls, lnksta;
> >       bool ssc_good = false;
> > -     u32 tmp;
> >       int ret, i;
> >
> >       /* Unassert the fundamental reset */
> > @@ -1059,6 +1110,8 @@ static int brcm_pcie_start_link(struct brcm_pcie *pcie)
> >               return -ENODEV;
> >       }
> >
> > +     brcm_config_clkreq(pcie);
> > +
> >       if (pcie->gen)
> >               brcm_pcie_set_gen(pcie, pcie->gen);
> >
> > @@ -1077,14 +1130,6 @@ static int brcm_pcie_start_link(struct brcm_pcie *pcie)
> >                pci_speed_string(pcie_link_speed[cls]), nlw,
> >                ssc_good ? "(SSC)" : "(!SSC)");
> >
> > -     /*
> > -      * Refclk from RC should be gated with CLKREQ# input when ASPM L0s,L1
> > -      * is enabled => setting the CLKREQ_DEBUG_ENABLE field to 1.
> > -      */
> > -     tmp = readl(base + PCIE_MISC_HARD_PCIE_HARD_DEBUG);
> > -     tmp |= PCIE_MISC_HARD_PCIE_HARD_DEBUG_CLKREQ_DEBUG_ENABLE_MASK;
> > -     writel(tmp, base + PCIE_MISC_HARD_PCIE_HARD_DEBUG);
> > -
> >       return 0;
> >  }
Bjorn Helgaas Nov. 9, 2023, 10:31 p.m. UTC | #3
On Thu, Nov 09, 2023 at 05:06:15PM -0500, Jim Quinlan wrote:
> ...

> BTW, besides the RPi4, I haven't been able to find a Linux platform
> where I can do
> 
>         echo $POLICY > /sys/module/pcie_aspm/parameters/policy

This sounds like something we should fix.  What exactly happens?  I
think this should be handled at pcie_aspm_set_policy(), so:

  /sys/module/pcie_aspm/parameters/policy doesn't exist (seems
  unlikely)?

  Returns -EPERM (would indicate aspm_disabled)?

  Returns -EINVAL (would indicate $POLICY doesn't match anything in
  policy_str[])?

  Returns 0 with no action (would indicate $POLICY is the same as the
  current aspm_policy)?

> It seems that the FW/ACPI typically locks this down.  I did see a
> comment somewhere that
> said that the reason it was locked down is because too many devices
> cannot handle it.

Do you have any details about FW/ACPI locking this down?
aspm_disabled is set by the kernel "pcie_aspm=off" parameter (I assume
you're not referring to this), if the FADT has ACPI_FADT_NO_ASPM set,
or if a host bridge's _OSC is missing or failed (maybe [1] is the
comment you saw?)

These all *should* be unusual cases, so I'd be surprised if you're
tripping over one of these.  I would NOT be surprised if we had some
issue in pcie_config_aspm_link() or pcie_set_clkpm() that meant the
policy change didn't work as intended, though.

Bjorn

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/acpi/pci_root.c?id=v6.6#n617
Jim Quinlan Nov. 10, 2023, 1:01 p.m. UTC | #4
On Thu, Nov 9, 2023 at 5:31 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Thu, Nov 09, 2023 at 05:06:15PM -0500, Jim Quinlan wrote:
> > ...
>
> > BTW, besides the RPi4, I haven't been able to find a Linux platform
> > where I can do
> >
> >         echo $POLICY > /sys/module/pcie_aspm/parameters/policy
>
> This sounds like something we should fix.  What exactly happens?  I
> think this should be handled at pcie_aspm_set_policy(), so:

Well, I've tried changing the ASPM policy on my x86 Ubuntu system and
IIRC a Fedora system.
In both cases it says "illegal write operation" but I am root and the
"policy" file does have
rw perms for root, so I have no idea how it comes back with that
error.  Some machines
allow one to change the setting in the BIOS , FWIW.

Right now on my CM4, "echo $POLICY  > policy" actually works.  Perhaps
when I was testing this I did not yet apply my commits, or perhaps it was with
a specific endpoint device.  Regardless, I'll let you know with a
backtrace if I see
this again.

Regards,
Jim


>
>   /sys/module/pcie_aspm/parameters/policy doesn't exist (seems
>   unlikely)?
>
>   Returns -EPERM (would indicate aspm_disabled)?
>
>   Returns -EINVAL (would indicate $POLICY doesn't match anything in
>   policy_str[])?
>
>   Returns 0 with no action (would indicate $POLICY is the same as the
>   current aspm_policy)?
>
> > It seems that the FW/ACPI typically locks this down.  I did see a
> > comment somewhere that
> > said that the reason it was locked down is because too many devices
> > cannot handle it.
>
> Do you have any details about FW/ACPI locking this down?
> aspm_disabled is set by the kernel "pcie_aspm=off" parameter (I assume
> you're not referring to this), if the FADT has ACPI_FADT_NO_ASPM set,
> or if a host bridge's _OSC is missing or failed (maybe [1] is the
> comment you saw?)
>
> These all *should* be unusual cases, so I'd be surprised if you're
> tripping over one of these.  I would NOT be surprised if we had some
> issue in pcie_config_aspm_link() or pcie_set_clkpm() that meant the
> policy change didn't work as intended, though.
>
> Bjorn
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/acpi/pci_root.c?id=v6.6#n617
>
Bjorn Helgaas Nov. 10, 2023, 2:39 p.m. UTC | #5
On Fri, Nov 10, 2023 at 08:01:23AM -0500, Jim Quinlan wrote:
> On Thu, Nov 9, 2023 at 5:31 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
> >
> > On Thu, Nov 09, 2023 at 05:06:15PM -0500, Jim Quinlan wrote:
> > > ...
> >
> > > BTW, besides the RPi4, I haven't been able to find a Linux platform
> > > where I can do
> > >
> > >         echo $POLICY > /sys/module/pcie_aspm/parameters/policy
> >
> > This sounds like something we should fix.  What exactly happens?  I
> > think this should be handled at pcie_aspm_set_policy(), so:
> 
> Well, I've tried changing the ASPM policy on my x86 Ubuntu system
> and IIRC a Fedora system.  In both cases it says "illegal write
> operation" but I am root and the "policy" file does have rw perms
> for root, so I have no idea how it comes back with that error.  Some
> machines allow one to change the setting in the BIOS, FWIW.

BIOS settings like that are potentially misleading unless the BIOS
*also* retains ownership of ASPM or changes the ASPM features
advertised by devices.  If BIOS grants ASPM ownership to the OS, BIOS
should not assume anything about how the OS will configure ASPM.

> Right now on my CM4, "echo $POLICY  > policy" actually works.
> Perhaps when I was testing this I did not yet apply my commits, or
> perhaps it was with a specific endpoint device.  Regardless, I'll
> let you know with a backtrace if I see this again.

Great, thanks!

Bjorn
diff mbox series

Patch

diff --git a/drivers/pci/controller/pcie-brcmstb.c b/drivers/pci/controller/pcie-brcmstb.c
index f9dd6622fe10..f45c5d0168d3 100644
--- a/drivers/pci/controller/pcie-brcmstb.c
+++ b/drivers/pci/controller/pcie-brcmstb.c
@@ -121,9 +121,12 @@ 
 
 #define PCIE_MISC_HARD_PCIE_HARD_DEBUG					0x4204
 #define  PCIE_MISC_HARD_PCIE_HARD_DEBUG_CLKREQ_DEBUG_ENABLE_MASK	0x2
+#define  PCIE_MISC_HARD_PCIE_HARD_DEBUG_L1SS_ENABLE_MASK		0x200000
 #define  PCIE_MISC_HARD_PCIE_HARD_DEBUG_SERDES_IDDQ_MASK		0x08000000
 #define  PCIE_BMIPS_MISC_HARD_PCIE_HARD_DEBUG_SERDES_IDDQ_MASK		0x00800000
-
+#define  PCIE_CLKREQ_MASK \
+	  (PCIE_MISC_HARD_PCIE_HARD_DEBUG_CLKREQ_DEBUG_ENABLE_MASK | \
+	   PCIE_MISC_HARD_PCIE_HARD_DEBUG_L1SS_ENABLE_MASK)
 
 #define PCIE_INTR2_CPU_BASE		0x4300
 #define PCIE_MSI_INTR2_BASE		0x4500
@@ -1028,13 +1031,61 @@  static int brcm_pcie_setup(struct brcm_pcie *pcie)
 	return 0;
 }
 
+static void brcm_config_clkreq(struct brcm_pcie *pcie)
+{
+	static const char err_msg[] = "invalid 'brcm,clkreq-mode' DT string\n";
+	const char *mode = "default";
+	u32 clkreq_cntl;
+	int ret;
+
+	ret = of_property_read_string(pcie->np, "brcm,clkreq-mode", &mode);
+	if (ret && ret != -EINVAL) {
+		dev_err(pcie->dev, err_msg);
+		mode = "safe";
+	}
+
+	/* Start out assuming safe mode (both mode bits cleared) */
+	clkreq_cntl = readl(pcie->base + PCIE_MISC_HARD_PCIE_HARD_DEBUG);
+	clkreq_cntl &= ~PCIE_CLKREQ_MASK;
+
+	if (strcmp(mode, "no-l1ss") == 0) {
+		/*
+		 * "no-l1ss" -- Provides Clock Power Management, L0s, and
+		 * L1, but cannot provide L1 substate (L1SS) power
+		 * savings. If the downstream device connected to the RC is
+		 * L1SS capable AND the OS enables L1SS, all PCIe traffic
+		 * may abruptly halt, potentially hanging the system.
+		 */
+		clkreq_cntl |= PCIE_MISC_HARD_PCIE_HARD_DEBUG_CLKREQ_DEBUG_ENABLE_MASK;
+	} else if (strcmp(mode, "default") == 0) {
+		/*
+		 * "default" -- Provides L0s, L1, and L1SS, but not
+		 * compliant to provide Clock Power Management;
+		 * specifically, may not be able to meet the Tclron max
+		 * timing of 400ns as specified in "Dynamic Clock Control",
+		 * section 3.2.5.2.2 of the PCIe spec.  This situation is
+		 * atypical and should happen only with older devices.
+		 */
+		clkreq_cntl |= PCIE_MISC_HARD_PCIE_HARD_DEBUG_L1SS_ENABLE_MASK;
+	} else {
+		/*
+		 * "safe" -- No power savings; refclk is driven by RC
+		 * unconditionally.
+		 */
+		if (strcmp(mode, "safe") != 0)
+			dev_err(pcie->dev, err_msg);
+		mode = "safe";
+	}
+	writel(clkreq_cntl, pcie->base + PCIE_MISC_HARD_PCIE_HARD_DEBUG);
+	dev_info(pcie->dev, "clkreq-mode set to %s\n", mode);
+}
+
 static int brcm_pcie_start_link(struct brcm_pcie *pcie)
 {
 	struct device *dev = pcie->dev;
 	void __iomem *base = pcie->base;
 	u16 nlw, cls, lnksta;
 	bool ssc_good = false;
-	u32 tmp;
 	int ret, i;
 
 	/* Unassert the fundamental reset */
@@ -1059,6 +1110,8 @@  static int brcm_pcie_start_link(struct brcm_pcie *pcie)
 		return -ENODEV;
 	}
 
+	brcm_config_clkreq(pcie);
+
 	if (pcie->gen)
 		brcm_pcie_set_gen(pcie, pcie->gen);
 
@@ -1077,14 +1130,6 @@  static int brcm_pcie_start_link(struct brcm_pcie *pcie)
 		 pci_speed_string(pcie_link_speed[cls]), nlw,
 		 ssc_good ? "(SSC)" : "(!SSC)");
 
-	/*
-	 * Refclk from RC should be gated with CLKREQ# input when ASPM L0s,L1
-	 * is enabled => setting the CLKREQ_DEBUG_ENABLE field to 1.
-	 */
-	tmp = readl(base + PCIE_MISC_HARD_PCIE_HARD_DEBUG);
-	tmp |= PCIE_MISC_HARD_PCIE_HARD_DEBUG_CLKREQ_DEBUG_ENABLE_MASK;
-	writel(tmp, base + PCIE_MISC_HARD_PCIE_HARD_DEBUG);
-
 	return 0;
 }