Message ID | 20220222163158.1666-6-pali@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | PCI: mvebu: Slot support | expand |
On Tue, Feb 22, 2022 at 05:31:57PM +0100, Pali Rohár wrote: > This PCIe message is sent automatically by mvebu HW when link changes > status from down to up. I *think* the intent of the above is: If DT supplies the 'slot-power-limit-milliwatt' property, program the value in the Slot Power Limit in the Slot Capabilities register and program the Root Port to send a Set_Slot_Power_Limit Message when the Link transitions to DL_Up. PCIe r6.0, sec 2.2.8.5 and 7.5.3.9, also say Set_Slot_Power_Limit must be sent on a config write to Slot Capabilities. I don't really understand that, since AFAICS, everything in that register is read-only. But there must be some use case for forcing a message. > Slot power limit is read from DT property 'slot-power-limit-milliwatt' and > set to mvebu HW. When this DT property is not specified then driver treat > it as "Slot Capabilities register has not yet been initialized". Does this last sentence mean "the Slot Power Limit is set to 0W (Value = 00h and Scale = 00b = 1.0x) and Auto Slot Power Limit Disable is set, so Set_Slot_Power_Limit Messages will never be sent"? > Signed-off-by: Pali Rohár <pali@kernel.org> > --- > drivers/pci/controller/pci-mvebu.c | 85 ++++++++++++++++++++++++++++-- > 1 file changed, 80 insertions(+), 5 deletions(-) > > diff --git a/drivers/pci/controller/pci-mvebu.c b/drivers/pci/controller/pci-mvebu.c > index a75d2b9196f9..c786feec4d17 100644 > --- a/drivers/pci/controller/pci-mvebu.c > +++ b/drivers/pci/controller/pci-mvebu.c > @@ -66,6 +66,12 @@ > #define PCIE_STAT_BUS 0xff00 > #define PCIE_STAT_DEV 0x1f0000 > #define PCIE_STAT_LINK_DOWN BIT(0) > +#define PCIE_SSPL_OFF 0x1a0c > +#define PCIE_SSPL_VALUE_SHIFT 0 > +#define PCIE_SSPL_VALUE_MASK GENMASK(7, 0) > +#define PCIE_SSPL_SCALE_SHIFT 8 > +#define PCIE_SSPL_SCALE_MASK GENMASK(9, 8) > +#define PCIE_SSPL_ENABLE BIT(16) > #define PCIE_RC_RTSTA 0x1a14 > #define PCIE_DEBUG_CTRL 0x1a60 > #define PCIE_DEBUG_SOFT_RESET BIT(20) > @@ -111,6 +117,8 @@ struct mvebu_pcie_port { > struct mvebu_pcie_window iowin; > u32 saved_pcie_stat; > struct resource regs; > + u8 slot_power_limit_value; > + u8 slot_power_limit_scale; > struct irq_domain *intx_irq_domain; > raw_spinlock_t irq_lock; > int intx_irq; > @@ -239,7 +247,7 @@ static void mvebu_pcie_setup_wins(struct mvebu_pcie_port *port) > > static void mvebu_pcie_setup_hw(struct mvebu_pcie_port *port) > { > - u32 ctrl, lnkcap, cmd, dev_rev, unmask; > + u32 ctrl, lnkcap, cmd, dev_rev, unmask, sspl; > > /* Setup PCIe controller to Root Complex mode. */ > ctrl = mvebu_readl(port, PCIE_CTRL_OFF); > @@ -292,6 +300,20 @@ static void mvebu_pcie_setup_hw(struct mvebu_pcie_port *port) > /* Point PCIe unit MBUS decode windows to DRAM space. */ > mvebu_pcie_setup_wins(port); > > + /* > + * Program Root Complex to automatically sends Set Slot Power Limit > + * PCIe Message when changing status from Dl-Down to Dl-Up and valid > + * slot power limit was specified. s/Root Complex/Root Port/, right? AFAIK the message would be sent by a Downstream Port, i.e., a Root Port in this case. s/sends/send/ s/Set Slot Power Limit/Set_Slot_Power_Limit/ to match spec usage (also below) s/Dl-Down/DL_Down/ to match spec usage s/Dl-Up/DL_Up/ ditto > + */ > + sspl = mvebu_readl(port, PCIE_SSPL_OFF); > + sspl &= ~(PCIE_SSPL_VALUE_MASK | PCIE_SSPL_SCALE_MASK | PCIE_SSPL_ENABLE); > + if (port->slot_power_limit_value && port->slot_power_limit_scale) { > + sspl |= port->slot_power_limit_value << PCIE_SSPL_VALUE_SHIFT; > + sspl |= port->slot_power_limit_scale << PCIE_SSPL_SCALE_SHIFT; > + sspl |= PCIE_SSPL_ENABLE; > + } > + mvebu_writel(port, sspl, PCIE_SSPL_OFF); > + > /* Mask all interrupt sources. */ > mvebu_writel(port, ~PCIE_INT_ALL_MASK, PCIE_INT_UNMASK_OFF); > > @@ -628,9 +650,16 @@ mvebu_pci_bridge_emul_pcie_conf_read(struct pci_bridge_emul *bridge, > (PCI_EXP_LNKSTA_DLLLA << 16) : 0); > break; > > - case PCI_EXP_SLTCTL: > - *value = PCI_EXP_SLTSTA_PDS << 16; > + case PCI_EXP_SLTCTL: { > + u16 slotsta = le16_to_cpu(bridge->pcie_conf.slotsta); > + u32 val = 0; > + if (!(mvebu_readl(port, PCIE_SSPL_OFF) & PCIE_SSPL_ENABLE)) > + val |= PCI_EXP_SLTCTL_ASPL_DISABLE; > + /* PCI_EXP_SLTCTL is 32-bit and contains also slot status bits */ This comment is a little bit confusing because PCI_EXP_SLTCTL is not actually 32 bits; it's 16 bits. It's just that we "read" 32 bits, which includes both PCI_EXP_SLTCTL and PCI_EXP_SLTSTA. If you use "PCI_EXP_SLTCTL", I think it would be helpful to also say "PCI_EXP_SLTSTA" instead of "slot status bits". > + val |= slotsta << 16; > + *value = val; > break; > + } > > case PCI_EXP_RTSTA: > *value = mvebu_readl(port, PCIE_RC_RTSTA); > @@ -774,6 +803,19 @@ mvebu_pci_bridge_emul_pcie_conf_write(struct pci_bridge_emul *bridge, > mvebu_writel(port, new, PCIE_CAP_PCIEXP + PCI_EXP_LNKCTL); > break; > > + case PCI_EXP_SLTCTL: > + if ((mask & PCI_EXP_SLTCTL_ASPL_DISABLE) && > + port->slot_power_limit_value && > + port->slot_power_limit_scale) { > + u32 sspl = mvebu_readl(port, PCIE_SSPL_OFF); > + if (new & PCI_EXP_SLTCTL_ASPL_DISABLE) > + sspl &= ~PCIE_SSPL_ENABLE; > + else > + sspl |= PCIE_SSPL_ENABLE; > + mvebu_writel(port, sspl, PCIE_SSPL_OFF); IIUC, the behavior of PCI_EXP_SLTCTL_ASPL_DISABLE as observed by software that sets it and reads it back will depend on whether the DT contains "slot-power-limit-milliwatt". If there is no DT property, port->slot_power_limit_value will be zero and PCIE_SSPL_ENABLE will never be set. So if I clear PCI_EXP_SLTCTL_ASPL_DISABLE, then read it back, it looks like it will read as being set. That's not what I would expect from the spec (PCIe r6.0, sec 7.5.3.10). > + } > + break; > + > case PCI_EXP_RTSTA: > /* > * PME Status bit in Root Status Register (PCIE_RC_RTSTA) > @@ -868,8 +910,26 @@ static int mvebu_pci_bridge_emul_init(struct mvebu_pcie_port *port) > /* > * Older mvebu hardware provides PCIe Capability structure only in > * version 1. New hardware provides it in version 2. > + * Enable slot support which is emulated. > */ > - bridge->pcie_conf.cap = cpu_to_le16(pcie_cap_ver); > + bridge->pcie_conf.cap = cpu_to_le16(pcie_cap_ver | PCI_EXP_FLAGS_SLOT); > + > + /* > + * Set Presence Detect State bit permanently as there is no support for > + * unplugging PCIe card from the slot. Assume that PCIe card is always > + * connected in slot. > + * > + * Set physical slot number to port+1 as mvebu ports are indexed from > + * zero and zero value is reserved for ports within the same silicon > + * as Root Port which is not mvebu case. > + * > + * Also set correct slot power limit. > + */ > + bridge->pcie_conf.slotcap = cpu_to_le32( > + (port->slot_power_limit_value << PCI_EXP_SLTCAP_SPLV_SHIFT) | > + (port->slot_power_limit_scale << PCI_EXP_SLTCAP_SPLS_SHIFT) | > + ((port->port+1) << PCI_EXP_SLTCAP_PSN_SHIFT)); > + bridge->pcie_conf.slotsta = cpu_to_le16(PCI_EXP_SLTSTA_PDS); > > bridge->subsystem_vendor_id = ssdev_id & 0xffff; > bridge->subsystem_id = ssdev_id >> 16; > @@ -1191,6 +1251,7 @@ static int mvebu_pcie_parse_port(struct mvebu_pcie *pcie, > { > struct device *dev = &pcie->pdev->dev; > enum of_gpio_flags flags; > + u32 slot_power_limit; > int reset_gpio, ret; > u32 num_lanes; > > @@ -1291,6 +1352,15 @@ static int mvebu_pcie_parse_port(struct mvebu_pcie *pcie, > port->reset_gpio = gpio_to_desc(reset_gpio); > } > > + slot_power_limit = of_pci_get_slot_power_limit(child, > + &port->slot_power_limit_value, > + &port->slot_power_limit_scale); > + if (slot_power_limit) > + dev_info(dev, "%s: Slot power limit %u.%uW\n", > + port->name, > + slot_power_limit / 1000, > + (slot_power_limit / 100) % 10); > + > port->clk = of_clk_get_by_name(child, NULL); > if (IS_ERR(port->clk)) { > dev_err(dev, "%s: cannot get clock\n", port->name); > @@ -1587,7 +1657,7 @@ static int mvebu_pcie_remove(struct platform_device *pdev) > { > struct mvebu_pcie *pcie = platform_get_drvdata(pdev); > struct pci_host_bridge *bridge = pci_host_bridge_from_priv(pcie); > - u32 cmd; > + u32 cmd, sspl; > int i; > > /* Remove PCI bus with all devices. */ > @@ -1624,6 +1694,11 @@ static int mvebu_pcie_remove(struct platform_device *pdev) > /* Free config space for emulated root bridge. */ > pci_bridge_emul_cleanup(&port->bridge); > > + /* Disable sending Set Slot Power Limit PCIe Message. */ > + sspl = mvebu_readl(port, PCIE_SSPL_OFF); > + sspl &= ~(PCIE_SSPL_VALUE_MASK | PCIE_SSPL_SCALE_MASK | PCIE_SSPL_ENABLE); > + mvebu_writel(port, sspl, PCIE_SSPL_OFF); > + > /* Disable and clear BARs and windows. */ > mvebu_pcie_disable_wins(port); > > -- > 2.20.1 >
On Thursday 24 February 2022 15:28:11 Bjorn Helgaas wrote: > On Tue, Feb 22, 2022 at 05:31:57PM +0100, Pali Rohár wrote: > > This PCIe message is sent automatically by mvebu HW when link changes > > status from down to up. > > I *think* the intent of the above is: > > If DT supplies the 'slot-power-limit-milliwatt' property, program > the value in the Slot Power Limit in the Slot Capabilities register > and program the Root Port to send a Set_Slot_Power_Limit Message > when the Link transitions to DL_Up. Exactly! > PCIe r6.0, sec 2.2.8.5 and 7.5.3.9, also say Set_Slot_Power_Limit must > be sent on a config write to Slot Capabilities. I don't really > understand that, since AFAICS, everything in that register is > read-only. But there must be some use case for forcing a message. I understood it in this way: Capabilities register is read-only hw-init and so firmware / driver can write initialization values into this register. And when firmware / driver is doing this write then Root port should send that Set_Slot_Power_Limit message. But this is just my interpretation which I thought that makes sense. > > Slot power limit is read from DT property 'slot-power-limit-milliwatt' and > > set to mvebu HW. When this DT property is not specified then driver treat > > it as "Slot Capabilities register has not yet been initialized". > > Does this last sentence mean "the Slot Power Limit is set to 0W > (Value = 00h and Scale = 00b = 1.0x) and Auto Slot Power Limit Disable > is set, so Set_Slot_Power_Limit Messages will never be sent"? Yes! > > Signed-off-by: Pali Rohár <pali@kernel.org> > > --- > > drivers/pci/controller/pci-mvebu.c | 85 ++++++++++++++++++++++++++++-- > > 1 file changed, 80 insertions(+), 5 deletions(-) > > > > diff --git a/drivers/pci/controller/pci-mvebu.c b/drivers/pci/controller/pci-mvebu.c > > index a75d2b9196f9..c786feec4d17 100644 > > --- a/drivers/pci/controller/pci-mvebu.c > > +++ b/drivers/pci/controller/pci-mvebu.c > > @@ -66,6 +66,12 @@ > > #define PCIE_STAT_BUS 0xff00 > > #define PCIE_STAT_DEV 0x1f0000 > > #define PCIE_STAT_LINK_DOWN BIT(0) > > +#define PCIE_SSPL_OFF 0x1a0c > > +#define PCIE_SSPL_VALUE_SHIFT 0 > > +#define PCIE_SSPL_VALUE_MASK GENMASK(7, 0) > > +#define PCIE_SSPL_SCALE_SHIFT 8 > > +#define PCIE_SSPL_SCALE_MASK GENMASK(9, 8) > > +#define PCIE_SSPL_ENABLE BIT(16) > > #define PCIE_RC_RTSTA 0x1a14 > > #define PCIE_DEBUG_CTRL 0x1a60 > > #define PCIE_DEBUG_SOFT_RESET BIT(20) > > @@ -111,6 +117,8 @@ struct mvebu_pcie_port { > > struct mvebu_pcie_window iowin; > > u32 saved_pcie_stat; > > struct resource regs; > > + u8 slot_power_limit_value; > > + u8 slot_power_limit_scale; > > struct irq_domain *intx_irq_domain; > > raw_spinlock_t irq_lock; > > int intx_irq; > > @@ -239,7 +247,7 @@ static void mvebu_pcie_setup_wins(struct mvebu_pcie_port *port) > > > > static void mvebu_pcie_setup_hw(struct mvebu_pcie_port *port) > > { > > - u32 ctrl, lnkcap, cmd, dev_rev, unmask; > > + u32 ctrl, lnkcap, cmd, dev_rev, unmask, sspl; > > > > /* Setup PCIe controller to Root Complex mode. */ > > ctrl = mvebu_readl(port, PCIE_CTRL_OFF); > > @@ -292,6 +300,20 @@ static void mvebu_pcie_setup_hw(struct mvebu_pcie_port *port) > > /* Point PCIe unit MBUS decode windows to DRAM space. */ > > mvebu_pcie_setup_wins(port); > > > > + /* > > + * Program Root Complex to automatically sends Set Slot Power Limit > > + * PCIe Message when changing status from Dl-Down to Dl-Up and valid > > + * slot power limit was specified. > > s/Root Complex/Root Port/, right? AFAIK the message would be sent by > a Downstream Port, i.e., a Root Port in this case. Yes! I see that on more places that names "Root Port", "Root Bridge" and "Root Complex" used as the one thing. It is probably because HW has only one Root Port and is integrated into same silicon as Root Complex and shares HW registers. And Root Port has PCI class code "PCI Bridge", hence Root Bridge. But I agree that correct name is "Root Port". Moreover in Armada 38x Functional Specification is this register named "Root Complex Set Slot Power Limit" and not Root "Port". > s/sends/send/ > s/Set Slot Power Limit/Set_Slot_Power_Limit/ to match spec usage (also > below) > s/Dl-Down/DL_Down/ to match spec usage > s/Dl-Up/DL_Up/ ditto In Armada 38x Functional Specification spec it is called like I wrote and some people told me to use "naming" as written in SoC/HW specification to not confuse other people who are writing / developing drivers according to official SoC/HW specification. I see that both has pro and cons. Usage of terminology from PCIe spec is what PCIe people expect and terminology from vendor SoC HW spec is what people who develop that SoC expect. I can update and change comments without issue to any variant which you prefer. No problem with it. Just I wanted to write why I chose those names. > > + */ > > + sspl = mvebu_readl(port, PCIE_SSPL_OFF); > > + sspl &= ~(PCIE_SSPL_VALUE_MASK | PCIE_SSPL_SCALE_MASK | PCIE_SSPL_ENABLE); > > + if (port->slot_power_limit_value && port->slot_power_limit_scale) { > > + sspl |= port->slot_power_limit_value << PCIE_SSPL_VALUE_SHIFT; > > + sspl |= port->slot_power_limit_scale << PCIE_SSPL_SCALE_SHIFT; > > + sspl |= PCIE_SSPL_ENABLE; > > + } > > + mvebu_writel(port, sspl, PCIE_SSPL_OFF); > > + > > /* Mask all interrupt sources. */ > > mvebu_writel(port, ~PCIE_INT_ALL_MASK, PCIE_INT_UNMASK_OFF); > > > > @@ -628,9 +650,16 @@ mvebu_pci_bridge_emul_pcie_conf_read(struct pci_bridge_emul *bridge, > > (PCI_EXP_LNKSTA_DLLLA << 16) : 0); > > break; > > > > - case PCI_EXP_SLTCTL: > > - *value = PCI_EXP_SLTSTA_PDS << 16; > > + case PCI_EXP_SLTCTL: { > > + u16 slotsta = le16_to_cpu(bridge->pcie_conf.slotsta); > > + u32 val = 0; > > + if (!(mvebu_readl(port, PCIE_SSPL_OFF) & PCIE_SSPL_ENABLE)) > > + val |= PCI_EXP_SLTCTL_ASPL_DISABLE; > > + /* PCI_EXP_SLTCTL is 32-bit and contains also slot status bits */ > > This comment is a little bit confusing because PCI_EXP_SLTCTL is not > actually 32 bits; it's 16 bits. My comment refers to pci-bridge-emul.c PCI_EXP_SLTCTL register, which is 32-bit. That pci-bridge-emul.c driver has 32-bit registers and in places where PCIe register is only 16-bit, it is concatenated to previous 16-bit. > It's just that we "read" 32 bits, > which includes both PCI_EXP_SLTCTL and PCI_EXP_SLTSTA. If you use > "PCI_EXP_SLTCTL", I think it would be helpful to also say > "PCI_EXP_SLTSTA" instead of "slot status bits". I agree, this is misleading and confusing. And because I'm working with pci-bridge-emul.c pci-aardvark.c and pci-mvebu.c for more than year I probably totally forgot about confusion here as I already catch these things... I will try to change comments to be less confusing. > > + val |= slotsta << 16; > > + *value = val; > > break; > > + } > > > > case PCI_EXP_RTSTA: > > *value = mvebu_readl(port, PCIE_RC_RTSTA); > > @@ -774,6 +803,19 @@ mvebu_pci_bridge_emul_pcie_conf_write(struct pci_bridge_emul *bridge, > > mvebu_writel(port, new, PCIE_CAP_PCIEXP + PCI_EXP_LNKCTL); > > break; > > > > + case PCI_EXP_SLTCTL: > > + if ((mask & PCI_EXP_SLTCTL_ASPL_DISABLE) && > > + port->slot_power_limit_value && > > + port->slot_power_limit_scale) { > > + u32 sspl = mvebu_readl(port, PCIE_SSPL_OFF); > > + if (new & PCI_EXP_SLTCTL_ASPL_DISABLE) > > + sspl &= ~PCIE_SSPL_ENABLE; > > + else > > + sspl |= PCIE_SSPL_ENABLE; > > + mvebu_writel(port, sspl, PCIE_SSPL_OFF); > > IIUC, the behavior of PCI_EXP_SLTCTL_ASPL_DISABLE as observed by > software that sets it and reads it back will depend on whether the DT > contains "slot-power-limit-milliwatt". > > If there is no DT property, port->slot_power_limit_value will be zero > and PCIE_SSPL_ENABLE will never be set. So if I clear > PCI_EXP_SLTCTL_ASPL_DISABLE, then read it back, it looks like it will > read as being set. Yes. > That's not what I would expect from the spec (PCIe r6.0, sec 7.5.3.10). Ok. What you would expect here? That PCI_EXP_SLTCTL_ASPL_DISABLE is not set even when Set_Slot_Power_Limit was never sent and would be never sent (as it was not programmed by firmware = in DT)? > > + } > > + break; > > + > > case PCI_EXP_RTSTA: > > /* > > * PME Status bit in Root Status Register (PCIE_RC_RTSTA) > > @@ -868,8 +910,26 @@ static int mvebu_pci_bridge_emul_init(struct mvebu_pcie_port *port) > > /* > > * Older mvebu hardware provides PCIe Capability structure only in > > * version 1. New hardware provides it in version 2. > > + * Enable slot support which is emulated. > > */ > > - bridge->pcie_conf.cap = cpu_to_le16(pcie_cap_ver); > > + bridge->pcie_conf.cap = cpu_to_le16(pcie_cap_ver | PCI_EXP_FLAGS_SLOT); > > + > > + /* > > + * Set Presence Detect State bit permanently as there is no support for > > + * unplugging PCIe card from the slot. Assume that PCIe card is always > > + * connected in slot. > > + * > > + * Set physical slot number to port+1 as mvebu ports are indexed from > > + * zero and zero value is reserved for ports within the same silicon > > + * as Root Port which is not mvebu case. > > + * > > + * Also set correct slot power limit. > > + */ > > + bridge->pcie_conf.slotcap = cpu_to_le32( > > + (port->slot_power_limit_value << PCI_EXP_SLTCAP_SPLV_SHIFT) | > > + (port->slot_power_limit_scale << PCI_EXP_SLTCAP_SPLS_SHIFT) | > > + ((port->port+1) << PCI_EXP_SLTCAP_PSN_SHIFT)); > > + bridge->pcie_conf.slotsta = cpu_to_le16(PCI_EXP_SLTSTA_PDS); > > > > bridge->subsystem_vendor_id = ssdev_id & 0xffff; > > bridge->subsystem_id = ssdev_id >> 16; > > @@ -1191,6 +1251,7 @@ static int mvebu_pcie_parse_port(struct mvebu_pcie *pcie, > > { > > struct device *dev = &pcie->pdev->dev; > > enum of_gpio_flags flags; > > + u32 slot_power_limit; > > int reset_gpio, ret; > > u32 num_lanes; > > > > @@ -1291,6 +1352,15 @@ static int mvebu_pcie_parse_port(struct mvebu_pcie *pcie, > > port->reset_gpio = gpio_to_desc(reset_gpio); > > } > > > > + slot_power_limit = of_pci_get_slot_power_limit(child, > > + &port->slot_power_limit_value, > > + &port->slot_power_limit_scale); > > + if (slot_power_limit) > > + dev_info(dev, "%s: Slot power limit %u.%uW\n", > > + port->name, > > + slot_power_limit / 1000, > > + (slot_power_limit / 100) % 10); > > + > > port->clk = of_clk_get_by_name(child, NULL); > > if (IS_ERR(port->clk)) { > > dev_err(dev, "%s: cannot get clock\n", port->name); > > @@ -1587,7 +1657,7 @@ static int mvebu_pcie_remove(struct platform_device *pdev) > > { > > struct mvebu_pcie *pcie = platform_get_drvdata(pdev); > > struct pci_host_bridge *bridge = pci_host_bridge_from_priv(pcie); > > - u32 cmd; > > + u32 cmd, sspl; > > int i; > > > > /* Remove PCI bus with all devices. */ > > @@ -1624,6 +1694,11 @@ static int mvebu_pcie_remove(struct platform_device *pdev) > > /* Free config space for emulated root bridge. */ > > pci_bridge_emul_cleanup(&port->bridge); > > > > + /* Disable sending Set Slot Power Limit PCIe Message. */ > > + sspl = mvebu_readl(port, PCIE_SSPL_OFF); > > + sspl &= ~(PCIE_SSPL_VALUE_MASK | PCIE_SSPL_SCALE_MASK | PCIE_SSPL_ENABLE); > > + mvebu_writel(port, sspl, PCIE_SSPL_OFF); > > + > > /* Disable and clear BARs and windows. */ > > mvebu_pcie_disable_wins(port); > > > > -- > > 2.20.1 > >
On Fri, Feb 25, 2022 at 01:54:07PM +0100, Pali Rohár wrote: > On Thursday 24 February 2022 15:28:11 Bjorn Helgaas wrote: > > On Tue, Feb 22, 2022 at 05:31:57PM +0100, Pali Rohár wrote: > > > This PCIe message is sent automatically by mvebu HW when link changes > > > status from down to up. > > > + * Program Root Complex to automatically sends Set Slot Power Limit > > > + * PCIe Message when changing status from Dl-Down to Dl-Up and valid > > > + * slot power limit was specified. > > > > s/Root Complex/Root Port/, right? AFAIK the message would be sent by > > a Downstream Port, i.e., a Root Port in this case. > > Yes! > > I see that on more places that names "Root Port", "Root Bridge" and > "Root Complex" used as the one thing. > > It is probably because HW has only one Root Port and is integrated into > same silicon as Root Complex and shares HW registers. And Root Port has > PCI class code "PCI Bridge", hence Root Bridge. > > But I agree that correct name is "Root Port". > > Moreover in Armada 38x Functional Specification is this register named > "Root Complex Set Slot Power Limit" and not Root "Port". Haha, yes, sounds like this stems from the knowledge that "of course this Root Complex only has one Root Port, so there's no real difference between them." So I think it makes sense for #defines for device-specific registers like PCIE_SSPL_OFF to match the Armada spec, but I think it would be better if the comments and code structure did not have the assumption that there's only one Root Port baked into them. For one thing, this can help make the driver structure more uniform across all the drivers. > > s/sends/send/ > > s/Set Slot Power Limit/Set_Slot_Power_Limit/ to match spec usage (also > > below) > > s/Dl-Down/DL_Down/ to match spec usage > > s/Dl-Up/DL_Up/ ditto > > In Armada 38x Functional Specification spec it is called like I wrote > and some people told me to use "naming" as written in SoC/HW > specification to not confuse other people who are writing / developing > drivers according to official SoC/HW specification. > > I see that both has pro and cons. Usage of terminology from PCIe spec is > what PCIe people expect and terminology from vendor SoC HW spec is what > people who develop that SoC expect. > > I can update and change comments without issue to any variant which you > prefer. No problem with it. Just I wanted to write why I chose those > names. All these concepts are purely PCIe. There is no Armada-specific meaning because they have to behave as specified by the PCIe spec so they work across the Link with non-Armada devices on the other end. If the Armada spec spells them differently, I claim that's an editing mistake in that spec. My preference is to use the PCIe spec naming except for Armada-specific things. The Armada spellings don't appear in the PCIe spec, so it's just an unnecessary irritant when trying to look them up. > > > + case PCI_EXP_SLTCTL: > > > + if ((mask & PCI_EXP_SLTCTL_ASPL_DISABLE) && > > > + port->slot_power_limit_value && > > > + port->slot_power_limit_scale) { > > > + u32 sspl = mvebu_readl(port, PCIE_SSPL_OFF); > > > + if (new & PCI_EXP_SLTCTL_ASPL_DISABLE) > > > + sspl &= ~PCIE_SSPL_ENABLE; > > > + else > > > + sspl |= PCIE_SSPL_ENABLE; > > > + mvebu_writel(port, sspl, PCIE_SSPL_OFF); > > > > IIUC, the behavior of PCI_EXP_SLTCTL_ASPL_DISABLE as observed by > > software that sets it and reads it back will depend on whether the DT > > contains "slot-power-limit-milliwatt". > > > > If there is no DT property, port->slot_power_limit_value will be zero > > and PCIE_SSPL_ENABLE will never be set. So if I clear > > PCI_EXP_SLTCTL_ASPL_DISABLE, then read it back, it looks like it will > > read as being set. > > Yes. > > > That's not what I would expect from the spec (PCIe r6.0, sec 7.5.3.10). > > Ok. What you would expect here? That PCI_EXP_SLTCTL_ASPL_DISABLE is not > set even when Set_Slot_Power_Limit was never sent and would be never > sent (as it was not programmed by firmware = in DT)? Per spec, I would expect PCI_EXP_SLTCTL_ASPL_DISABLE to be either: - Hardwired to 0, so writes have no effect and reads always return 0, or - Writable, so a read always returns what was previously written. Here's the relevant text from r6.0, sec 7.5.3.10: Auto Slot Power Limit Disable - When Set, this disables the automatic sending of a Set_Slot_Power_Limit Message when a Link transitions from a non-DL_Up status to a DL_Up status. Downstream ports that don’t support DPC are permitted to hardwire this bit to 0. Default value of this bit is implementation specific. AFAICT, the Slot Power Control mechanism is required for all devices, but without "slot-power-limit-milliwatt", we don't know what limit to use. Apparently the CEM specs specify minimum values, but they depend on the form factor. From r6.0, sec 6.9: For Adapters: - Until and unless a Set_Slot_Power_Limit Message is received indicating a Slot Power Limit value greater than the lowest value specified in the form factor specification for the adapter's form factor, the adapter must not consume more than the lowest value specified. - An adapter must never consume more power than what was specified in the most recently received Set_Slot_Power_Limit Message or the minimum value specified in the corresponding form factor specification, whichever is higher. If PCIE_SSPL_ENABLE is never set, Set_Slot_Power_Limit will never be sent, and the device is not allowed to consume more than the minimum power specified by its form factor spec. I'd say reading PCI_EXP_SLTCTL should return whatever PCI_EXP_SLTCTL_ASPL_DISABLE value was most recently written, but we should set PCIE_SSPL_ENABLE only when we have a "slot-power-limit-milliwatt" property AND PCI_EXP_SLTCTL_ASPL_DISABLE == 0. Does that make sense? Bjorn
On Fri, Feb 25, 2022 at 01:54:07PM +0100, Pali Rohár wrote: > On Thursday 24 February 2022 15:28:11 Bjorn Helgaas wrote: > > On Tue, Feb 22, 2022 at 05:31:57PM +0100, Pali Rohár wrote: > > > This PCIe message is sent automatically by mvebu HW when link changes > > > status from down to up. > > PCIe r6.0, sec 2.2.8.5 and 7.5.3.9, also say Set_Slot_Power_Limit must > > be sent on a config write to Slot Capabilities. I don't really > > understand that, since AFAICS, everything in that register is > > read-only. But there must be some use case for forcing a message. > > I understood it in this way: Capabilities register is read-only hw-init > and so firmware / driver can write initialization values into this > register. And when firmware / driver is doing this write then Root port > should send that Set_Slot_Power_Limit message. Sec 7.5.3.9 describes the behavior of Slot Capabilities in config space, where it must be read-only. Firmware (or the mvebu driver) must use a different mechanism to initialize the values. FWIW, I found this implementation note in PCIe r6.0, sec 6.9 that explains why config writes to this read-only register would be useful: IMPLEMENTATION NOTE: AUTO SLOT POWER LIMIT DISABLE In some environments host software may wish to directly manage the transmission of a Set_Slot_Power_Limit message by performing a Configuration Write to the Slot Capabilities register rather than have the transmission automatically occur when the Link transitions from a non-DL_Up to a DL_Up status. This allows host software to limit power supply surge current by staggering the transition of Endpoints to a higher power state following a Link Down or when multiple Endpoints are simultaneously hot-added due to cable or adapter insertion. Bjorn
On Friday 25 February 2022 10:57:31 Bjorn Helgaas wrote: > On Fri, Feb 25, 2022 at 01:54:07PM +0100, Pali Rohár wrote: > > On Thursday 24 February 2022 15:28:11 Bjorn Helgaas wrote: > > > On Tue, Feb 22, 2022 at 05:31:57PM +0100, Pali Rohár wrote: > > > > This PCIe message is sent automatically by mvebu HW when link changes > > > > status from down to up. > > > > > + * Program Root Complex to automatically sends Set Slot Power Limit > > > > + * PCIe Message when changing status from Dl-Down to Dl-Up and valid > > > > + * slot power limit was specified. > > > > > > s/Root Complex/Root Port/, right? AFAIK the message would be sent by > > > a Downstream Port, i.e., a Root Port in this case. > > > > Yes! > > > > I see that on more places that names "Root Port", "Root Bridge" and > > "Root Complex" used as the one thing. > > > > It is probably because HW has only one Root Port and is integrated into > > same silicon as Root Complex and shares HW registers. And Root Port has > > PCI class code "PCI Bridge", hence Root Bridge. > > > > But I agree that correct name is "Root Port". > > > > Moreover in Armada 38x Functional Specification is this register named > > "Root Complex Set Slot Power Limit" and not Root "Port". > > Haha, yes, sounds like this stems from the knowledge that "of course > this Root Complex only has one Root Port, so there's no real > difference between them." > > So I think it makes sense for #defines for device-specific registers > like PCIE_SSPL_OFF to match the Armada spec, but I think it would be > better if the comments and code structure did not have the assumption > that there's only one Root Port baked into them. For one thing, this > can help make the driver structure more uniform across all the > drivers. Ok! > > > s/sends/send/ > > > s/Set Slot Power Limit/Set_Slot_Power_Limit/ to match spec usage (also > > > below) > > > s/Dl-Down/DL_Down/ to match spec usage > > > s/Dl-Up/DL_Up/ ditto > > > > In Armada 38x Functional Specification spec it is called like I wrote > > and some people told me to use "naming" as written in SoC/HW > > specification to not confuse other people who are writing / developing > > drivers according to official SoC/HW specification. > > > > I see that both has pro and cons. Usage of terminology from PCIe spec is > > what PCIe people expect and terminology from vendor SoC HW spec is what > > people who develop that SoC expect. > > > > I can update and change comments without issue to any variant which you > > prefer. No problem with it. Just I wanted to write why I chose those > > names. > > All these concepts are purely PCIe. There is no Armada-specific > meaning because they have to behave as specified by the PCIe spec so > they work across the Link with non-Armada devices on the other end. > If the Armada spec spells them differently, I claim that's an editing > mistake in that spec. > > My preference is to use the PCIe spec naming except for > Armada-specific things. The Armada spellings don't appear in the PCIe > spec, so it's just an unnecessary irritant when trying to look them > up. Ok! > > > > + case PCI_EXP_SLTCTL: > > > > + if ((mask & PCI_EXP_SLTCTL_ASPL_DISABLE) && > > > > + port->slot_power_limit_value && > > > > + port->slot_power_limit_scale) { > > > > + u32 sspl = mvebu_readl(port, PCIE_SSPL_OFF); > > > > + if (new & PCI_EXP_SLTCTL_ASPL_DISABLE) > > > > + sspl &= ~PCIE_SSPL_ENABLE; > > > > + else > > > > + sspl |= PCIE_SSPL_ENABLE; > > > > + mvebu_writel(port, sspl, PCIE_SSPL_OFF); > > > > > > IIUC, the behavior of PCI_EXP_SLTCTL_ASPL_DISABLE as observed by > > > software that sets it and reads it back will depend on whether the DT > > > contains "slot-power-limit-milliwatt". > > > > > > If there is no DT property, port->slot_power_limit_value will be zero > > > and PCIE_SSPL_ENABLE will never be set. So if I clear > > > PCI_EXP_SLTCTL_ASPL_DISABLE, then read it back, it looks like it will > > > read as being set. > > > > Yes. > > > > > That's not what I would expect from the spec (PCIe r6.0, sec 7.5.3.10). > > > > Ok. What you would expect here? That PCI_EXP_SLTCTL_ASPL_DISABLE is not > > set even when Set_Slot_Power_Limit was never sent and would be never > > sent (as it was not programmed by firmware = in DT)? > > Per spec, I would expect PCI_EXP_SLTCTL_ASPL_DISABLE to be either: > > - Hardwired to 0, so writes have no effect and reads always return > 0, or > > - Writable, so a read always returns what was previously written. > > Here's the relevant text from r6.0, sec 7.5.3.10: > > Auto Slot Power Limit Disable - When Set, this disables the > automatic sending of a Set_Slot_Power_Limit Message when a Link > transitions from a non-DL_Up status to a DL_Up status. > > Downstream ports that don’t support DPC are permitted to hardwire > this bit to 0. > > Default value of this bit is implementation specific. > > AFAICT, the Slot Power Control mechanism is required for all devices, > but without "slot-power-limit-milliwatt", we don't know what limit to > use. Apparently the CEM specs specify minimum values, but they depend > on the form factor. > > From r6.0, sec 6.9: > > For Adapters: > > - Until and unless a Set_Slot_Power_Limit Message is received > indicating a Slot Power Limit value greater than the lowest > value specified in the form factor specification for the > adapter's form factor, the adapter must not consume more than > the lowest value specified. > > - An adapter must never consume more power than what was specified > in the most recently received Set_Slot_Power_Limit Message or > the minimum value specified in the corresponding form factor > specification, whichever is higher. > > If PCIE_SSPL_ENABLE is never set, Set_Slot_Power_Limit will never be > sent, and the device is not allowed to consume more than the minimum > power specified by its form factor spec. > > I'd say reading PCI_EXP_SLTCTL should return whatever > PCI_EXP_SLTCTL_ASPL_DISABLE value was most recently written, but we > should set PCIE_SSPL_ENABLE only when we have a > "slot-power-limit-milliwatt" property AND > PCI_EXP_SLTCTL_ASPL_DISABLE == 0. > > Does that make sense? Yes!
On Friday 25 February 2022 11:02:25 Bjorn Helgaas wrote: > On Fri, Feb 25, 2022 at 01:54:07PM +0100, Pali Rohár wrote: > > On Thursday 24 February 2022 15:28:11 Bjorn Helgaas wrote: > > > On Tue, Feb 22, 2022 at 05:31:57PM +0100, Pali Rohár wrote: > > > > This PCIe message is sent automatically by mvebu HW when link changes > > > > status from down to up. > > > > PCIe r6.0, sec 2.2.8.5 and 7.5.3.9, also say Set_Slot_Power_Limit must > > > be sent on a config write to Slot Capabilities. I don't really > > > understand that, since AFAICS, everything in that register is > > > read-only. But there must be some use case for forcing a message. > > > > I understood it in this way: Capabilities register is read-only hw-init > > and so firmware / driver can write initialization values into this > > register. And when firmware / driver is doing this write then Root port > > should send that Set_Slot_Power_Limit message. > > Sec 7.5.3.9 describes the behavior of Slot Capabilities in config > space, where it must be read-only. Firmware (or the mvebu driver) > must use a different mechanism to initialize the values. > > FWIW, I found this implementation note in PCIe r6.0, sec 6.9 that > explains why config writes to this read-only register would be useful: > > IMPLEMENTATION NOTE: AUTO SLOT POWER LIMIT DISABLE > > In some environments host software may wish to directly manage the > transmission of a Set_Slot_Power_Limit message by performing a > Configuration Write to the Slot Capabilities register rather than > have the transmission automatically occur when the Link transitions > from a non-DL_Up to a DL_Up status. This allows host software to > limit power supply surge current by staggering the transition of > Endpoints to a higher power state following a Link Down or when > multiple Endpoints are simultaneously hot-added due to cable or > adapter insertion. > > Bjorn Hm... I did not understand from this description what should happen when write operation is performed to the slot capabilities register. It is allowed to change content of this register? Or better question, what should write hook in pci-mvebu.c driver for slot capabilities do?
diff --git a/drivers/pci/controller/pci-mvebu.c b/drivers/pci/controller/pci-mvebu.c index a75d2b9196f9..c786feec4d17 100644 --- a/drivers/pci/controller/pci-mvebu.c +++ b/drivers/pci/controller/pci-mvebu.c @@ -66,6 +66,12 @@ #define PCIE_STAT_BUS 0xff00 #define PCIE_STAT_DEV 0x1f0000 #define PCIE_STAT_LINK_DOWN BIT(0) +#define PCIE_SSPL_OFF 0x1a0c +#define PCIE_SSPL_VALUE_SHIFT 0 +#define PCIE_SSPL_VALUE_MASK GENMASK(7, 0) +#define PCIE_SSPL_SCALE_SHIFT 8 +#define PCIE_SSPL_SCALE_MASK GENMASK(9, 8) +#define PCIE_SSPL_ENABLE BIT(16) #define PCIE_RC_RTSTA 0x1a14 #define PCIE_DEBUG_CTRL 0x1a60 #define PCIE_DEBUG_SOFT_RESET BIT(20) @@ -111,6 +117,8 @@ struct mvebu_pcie_port { struct mvebu_pcie_window iowin; u32 saved_pcie_stat; struct resource regs; + u8 slot_power_limit_value; + u8 slot_power_limit_scale; struct irq_domain *intx_irq_domain; raw_spinlock_t irq_lock; int intx_irq; @@ -239,7 +247,7 @@ static void mvebu_pcie_setup_wins(struct mvebu_pcie_port *port) static void mvebu_pcie_setup_hw(struct mvebu_pcie_port *port) { - u32 ctrl, lnkcap, cmd, dev_rev, unmask; + u32 ctrl, lnkcap, cmd, dev_rev, unmask, sspl; /* Setup PCIe controller to Root Complex mode. */ ctrl = mvebu_readl(port, PCIE_CTRL_OFF); @@ -292,6 +300,20 @@ static void mvebu_pcie_setup_hw(struct mvebu_pcie_port *port) /* Point PCIe unit MBUS decode windows to DRAM space. */ mvebu_pcie_setup_wins(port); + /* + * Program Root Complex to automatically sends Set Slot Power Limit + * PCIe Message when changing status from Dl-Down to Dl-Up and valid + * slot power limit was specified. + */ + sspl = mvebu_readl(port, PCIE_SSPL_OFF); + sspl &= ~(PCIE_SSPL_VALUE_MASK | PCIE_SSPL_SCALE_MASK | PCIE_SSPL_ENABLE); + if (port->slot_power_limit_value && port->slot_power_limit_scale) { + sspl |= port->slot_power_limit_value << PCIE_SSPL_VALUE_SHIFT; + sspl |= port->slot_power_limit_scale << PCIE_SSPL_SCALE_SHIFT; + sspl |= PCIE_SSPL_ENABLE; + } + mvebu_writel(port, sspl, PCIE_SSPL_OFF); + /* Mask all interrupt sources. */ mvebu_writel(port, ~PCIE_INT_ALL_MASK, PCIE_INT_UNMASK_OFF); @@ -628,9 +650,16 @@ mvebu_pci_bridge_emul_pcie_conf_read(struct pci_bridge_emul *bridge, (PCI_EXP_LNKSTA_DLLLA << 16) : 0); break; - case PCI_EXP_SLTCTL: - *value = PCI_EXP_SLTSTA_PDS << 16; + case PCI_EXP_SLTCTL: { + u16 slotsta = le16_to_cpu(bridge->pcie_conf.slotsta); + u32 val = 0; + if (!(mvebu_readl(port, PCIE_SSPL_OFF) & PCIE_SSPL_ENABLE)) + val |= PCI_EXP_SLTCTL_ASPL_DISABLE; + /* PCI_EXP_SLTCTL is 32-bit and contains also slot status bits */ + val |= slotsta << 16; + *value = val; break; + } case PCI_EXP_RTSTA: *value = mvebu_readl(port, PCIE_RC_RTSTA); @@ -774,6 +803,19 @@ mvebu_pci_bridge_emul_pcie_conf_write(struct pci_bridge_emul *bridge, mvebu_writel(port, new, PCIE_CAP_PCIEXP + PCI_EXP_LNKCTL); break; + case PCI_EXP_SLTCTL: + if ((mask & PCI_EXP_SLTCTL_ASPL_DISABLE) && + port->slot_power_limit_value && + port->slot_power_limit_scale) { + u32 sspl = mvebu_readl(port, PCIE_SSPL_OFF); + if (new & PCI_EXP_SLTCTL_ASPL_DISABLE) + sspl &= ~PCIE_SSPL_ENABLE; + else + sspl |= PCIE_SSPL_ENABLE; + mvebu_writel(port, sspl, PCIE_SSPL_OFF); + } + break; + case PCI_EXP_RTSTA: /* * PME Status bit in Root Status Register (PCIE_RC_RTSTA) @@ -868,8 +910,26 @@ static int mvebu_pci_bridge_emul_init(struct mvebu_pcie_port *port) /* * Older mvebu hardware provides PCIe Capability structure only in * version 1. New hardware provides it in version 2. + * Enable slot support which is emulated. */ - bridge->pcie_conf.cap = cpu_to_le16(pcie_cap_ver); + bridge->pcie_conf.cap = cpu_to_le16(pcie_cap_ver | PCI_EXP_FLAGS_SLOT); + + /* + * Set Presence Detect State bit permanently as there is no support for + * unplugging PCIe card from the slot. Assume that PCIe card is always + * connected in slot. + * + * Set physical slot number to port+1 as mvebu ports are indexed from + * zero and zero value is reserved for ports within the same silicon + * as Root Port which is not mvebu case. + * + * Also set correct slot power limit. + */ + bridge->pcie_conf.slotcap = cpu_to_le32( + (port->slot_power_limit_value << PCI_EXP_SLTCAP_SPLV_SHIFT) | + (port->slot_power_limit_scale << PCI_EXP_SLTCAP_SPLS_SHIFT) | + ((port->port+1) << PCI_EXP_SLTCAP_PSN_SHIFT)); + bridge->pcie_conf.slotsta = cpu_to_le16(PCI_EXP_SLTSTA_PDS); bridge->subsystem_vendor_id = ssdev_id & 0xffff; bridge->subsystem_id = ssdev_id >> 16; @@ -1191,6 +1251,7 @@ static int mvebu_pcie_parse_port(struct mvebu_pcie *pcie, { struct device *dev = &pcie->pdev->dev; enum of_gpio_flags flags; + u32 slot_power_limit; int reset_gpio, ret; u32 num_lanes; @@ -1291,6 +1352,15 @@ static int mvebu_pcie_parse_port(struct mvebu_pcie *pcie, port->reset_gpio = gpio_to_desc(reset_gpio); } + slot_power_limit = of_pci_get_slot_power_limit(child, + &port->slot_power_limit_value, + &port->slot_power_limit_scale); + if (slot_power_limit) + dev_info(dev, "%s: Slot power limit %u.%uW\n", + port->name, + slot_power_limit / 1000, + (slot_power_limit / 100) % 10); + port->clk = of_clk_get_by_name(child, NULL); if (IS_ERR(port->clk)) { dev_err(dev, "%s: cannot get clock\n", port->name); @@ -1587,7 +1657,7 @@ static int mvebu_pcie_remove(struct platform_device *pdev) { struct mvebu_pcie *pcie = platform_get_drvdata(pdev); struct pci_host_bridge *bridge = pci_host_bridge_from_priv(pcie); - u32 cmd; + u32 cmd, sspl; int i; /* Remove PCI bus with all devices. */ @@ -1624,6 +1694,11 @@ static int mvebu_pcie_remove(struct platform_device *pdev) /* Free config space for emulated root bridge. */ pci_bridge_emul_cleanup(&port->bridge); + /* Disable sending Set Slot Power Limit PCIe Message. */ + sspl = mvebu_readl(port, PCIE_SSPL_OFF); + sspl &= ~(PCIE_SSPL_VALUE_MASK | PCIE_SSPL_SCALE_MASK | PCIE_SSPL_ENABLE); + mvebu_writel(port, sspl, PCIE_SSPL_OFF); + /* Disable and clear BARs and windows. */ mvebu_pcie_disable_wins(port);
This PCIe message is sent automatically by mvebu HW when link changes status from down to up. Slot power limit is read from DT property 'slot-power-limit-milliwatt' and set to mvebu HW. When this DT property is not specified then driver treat it as "Slot Capabilities register has not yet been initialized". Signed-off-by: Pali Rohár <pali@kernel.org> --- drivers/pci/controller/pci-mvebu.c | 85 ++++++++++++++++++++++++++++-- 1 file changed, 80 insertions(+), 5 deletions(-)