Message ID | 20240223-opp_support-v7-7-10b4363d7e71@quicinc.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Manivannan Sadhasivam |
Series | PCI: qcom: Add support for OPP |
On Fri, Feb 23, 2024 at 08:18:04PM +0530, Krishna chaitanya chundru wrote:
> QCOM Resource Power Manager-hardened (RPMh) is a hardware block which
> maintains hardware state of a regulator by performing max aggregation of
> the requests made by all of the clients.
>
> PCIe controller can operate on different RPMh performance state of power
> domain based up on the speed of the link. And this performance state varies
> from target to target.

s/up on/on/ (or "upon" if you prefer) (also below)

I understand changing the performance state based on the link speed,
but I don't understand the variation from target to target.  Do you
mean just that the link speed may vary based on the rates supported by
the downstream device?

> It is manadate to scale the performance state based up on the PCIe speed
> link operates so that SoC can run under optimum power conditions.

It sounds like it's more power efficient, but not actually
*mandatory*.  Maybe something like this?

  The SoC can be more power efficient if we scale the performance
  state based on the aggregate PCIe link speed.

> Add Operating Performance Points(OPP) support to vote for RPMh state based
> upon the speed link is operating.

Space before open paren, e.g., "Points (OPP)".

"... based on the link speed."

> OPP can handle ICC bw voting also, so move ICC bw voting through OPP
> framework if OPP entries are present.
>
> In PCIe certain speeds like GEN1x2 & GEN2x1 or GEN3x2 & GEN4x1 use
> same bw and frequency and thus the OPP entry, so use frequency based
> search to reduce number of entries in the OPP table.

GEN1x2, GEN2x1, etc are not "speeds".  I would say:

  Different link configurations may share the same aggregate speed,
  e.g., a 2.5 GT/s x2 link and a 5.0 GT/s x1 link have the same speed
  and share the same OPP entry.

> Don't initialize ICC if OPP is supported.

Because?  Maybe this should say something about OPP including the ICC
voting?

> +		ret = icc_set_bw(pcie->icc_mem, 0, width * QCOM_PCIE_LINK_SPEED_TO_BW(speed));

Wrap to fit in 80 columns.

> +	 * Use highest OPP here if the OPP table is present. At the end of the probe(),
> +	 * OPP will be updated using qcom_pcie_icc_opp_update().

Wrap to fit in 80 columns.

> +	/* Skip ICC init if OPP is supported as ICC bw vote is handled by OPP framework */

Wrap to fit in 80 columns.
On Tue, Feb 27, 2024 at 05:36:38PM -0600, Bjorn Helgaas wrote:
> On Fri, Feb 23, 2024 at 08:18:04PM +0530, Krishna chaitanya chundru wrote:
> > QCOM Resource Power Manager-hardened (RPMh) is a hardware block which
> > maintains hardware state of a regulator by performing max aggregation of
> > the requests made by all of the clients.

> > It is manadate to scale the performance state based up on the PCIe speed
> > link operates so that SoC can run under optimum power conditions.
>
> It sounds like it's more power efficient, but not actually
> *mandatory*.  Maybe something like this?
>
>   The SoC can be more power efficient if we scale the performance
>   state based on the aggregate PCIe link speed.

Actually, maybe it would be better to say "aggregate PCIe link
bandwidth", because we use "speed" elsewhere (PCIE_SPEED2MBS_ENC(),
etc) to refer specifically to the data rate independent of the width.

> > Add Operating Performance Points(OPP) support to vote for RPMh state based
> > upon the speed link is operating.
>
> "... based on the link speed."

"... based on the aggregate link bandwidth."

> > In PCIe certain speeds like GEN1x2 & GEN2x1 or GEN3x2 & GEN4x1 use
> > same bw and frequency and thus the OPP entry, so use frequency based
> > search to reduce number of entries in the OPP table.
>
> GEN1x2, GEN2x1, etc are not "speeds".  I would say:
>
>   Different link configurations may share the same aggregate speed,
>   e.g., a 2.5 GT/s x2 link and a 5.0 GT/s x1 link have the same speed
>   and share the same OPP entry.

  Different link configurations may share the same aggregate
  bandwidth, e.g., a 2.5 GT/s x2 link and a 5.0 GT/s x1 link
  have the same bandwidth and share the same OPP entry.
On 2/28/2024 5:15 AM, Bjorn Helgaas wrote:
> On Tue, Feb 27, 2024 at 05:36:38PM -0600, Bjorn Helgaas wrote:
>> On Fri, Feb 23, 2024 at 08:18:04PM +0530, Krishna chaitanya chundru wrote:
>>> QCOM Resource Power Manager-hardened (RPMh) is a hardware block which
>>> maintains hardware state of a regulator by performing max aggregation of
>>> the requests made by all of the clients.
>
>>> It is manadate to scale the performance state based up on the PCIe speed
>>> link operates so that SoC can run under optimum power conditions.
>>
>> It sounds like it's more power efficient, but not actually
>> *mandatory*.  Maybe something like this?
>>
>>   The SoC can be more power efficient if we scale the performance
>>   state based on the aggregate PCIe link speed.
>
> Actually, maybe it would be better to say "aggregate PCIe link
> bandwidth", because we use "speed" elsewhere (PCIE_SPEED2MBS_ENC(),
> etc) to refer specifically to the data rate independent of the width.
>
>>> Add Operating Performance Points(OPP) support to vote for RPMh state based
>>> upon the speed link is operating.
>>
>> "... based on the link speed."
>
> "... based on the aggregate link bandwidth."
>
>>> In PCIe certain speeds like GEN1x2 & GEN2x1 or GEN3x2 & GEN4x1 use
>>> same bw and frequency and thus the OPP entry, so use frequency based
>>> search to reduce number of entries in the OPP table.
>>
>> GEN1x2, GEN2x1, etc are not "speeds".  I would say:
>>
>>   Different link configurations may share the same aggregate speed,
>>   e.g., a 2.5 GT/s x2 link and a 5.0 GT/s x1 link have the same speed
>>   and share the same OPP entry.
>
>   Different link configurations may share the same aggregate
>   bandwidth, e.g., a 2.5 GT/s x2 link and a 5.0 GT/s x1 link
>   have the same bandwidth and share the same OPP entry.

- I will update the commit message as suggested in my next series.

- Krishna Chaitanya.
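To make the "aggregate link bandwidth" wording concrete, here is a minimal userspace sketch, not driver code. The enum and function names are hypothetical, and the per-lane figures assume the usual PCIe line encodings (8b/10b at 2.5 and 5.0 GT/s, 128b/130b at 8.0 and 16.0 GT/s) with integer arithmetic.

```c
#include <stdint.h>

/* Hypothetical names for illustration only; not kernel identifiers. */
enum link_speed {
	SPEED_2_5GT,
	SPEED_5_0GT,
	SPEED_8_0GT,
	SPEED_16_0GT,
};

/* Per-lane data rate in Mb/s after line-encoding overhead. */
uint32_t lane_mbps(enum link_speed speed)
{
	switch (speed) {
	case SPEED_2_5GT:
		return 2500 * 8 / 10;		/* 8b/10b encoding */
	case SPEED_5_0GT:
		return 5000 * 8 / 10;		/* 8b/10b encoding */
	case SPEED_8_0GT:
		return 8000 * 128 / 130;	/* 128b/130b encoding */
	case SPEED_16_0GT:
		return 16000 * 128 / 130;	/* 128b/130b encoding */
	}
	return 0;
}

/* Aggregate bandwidth is the per-lane rate multiplied by the lane count. */
uint32_t aggregate_mbps(enum link_speed speed, unsigned int width)
{
	return lane_mbps(speed) * width;
}
```

Under these assumptions, `aggregate_mbps(SPEED_2_5GT, 2)` and `aggregate_mbps(SPEED_5_0GT, 1)` both evaluate to 4000, which is the sense in which the two link configurations have the same bandwidth and can share one OPP entry.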
diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c
index 088ebd2e5865..c608bec8b9cb 100644
--- a/drivers/pci/controller/dwc/pcie-qcom.c
+++ b/drivers/pci/controller/dwc/pcie-qcom.c
@@ -22,6 +22,7 @@
 #include <linux/of.h>
 #include <linux/of_gpio.h>
 #include <linux/pci.h>
+#include <linux/pm_opp.h>
 #include <linux/pm_runtime.h>
 #include <linux/platform_device.h>
 #include <linux/phy/pcie.h>
@@ -244,6 +245,7 @@ struct qcom_pcie {
 	const struct qcom_pcie_cfg *cfg;
 	struct dentry *debugfs;
 	bool suspended;
+	bool opp_supported;
 };
 
 #define to_qcom_pcie(x)		dev_get_drvdata((x)->dev)
@@ -1404,16 +1406,14 @@ static int qcom_pcie_icc_init(struct qcom_pcie *pcie)
 	return 0;
 }
 
-static void qcom_pcie_icc_update(struct qcom_pcie *pcie)
+static void qcom_pcie_icc_opp_update(struct qcom_pcie *pcie)
 {
 	struct dw_pcie *pci = pcie->pci;
-	u32 offset, status;
+	u32 offset, status, freq;
+	struct dev_pm_opp *opp;
 	int speed, width;
 	int ret;
 
-	if (!pcie->icc_mem)
-		return;
-
 	offset = dw_pcie_find_capability(pci, PCI_CAP_ID_EXP);
 	status = readw(pci->dbi_base + offset + PCI_EXP_LNKSTA);
 
@@ -1424,11 +1424,26 @@ static void qcom_pcie_icc_update(struct qcom_pcie *pcie)
 	speed = FIELD_GET(PCI_EXP_LNKSTA_CLS, status);
 	width = FIELD_GET(PCI_EXP_LNKSTA_NLW, status);
 
-	ret = icc_set_bw(pcie->icc_mem, 0, width * QCOM_PCIE_LINK_SPEED_TO_BW(speed));
-	if (ret) {
-		dev_err(pci->dev, "failed to set interconnect bandwidth: %d\n",
-			ret);
+	if (pcie->opp_supported) {
+		freq = PCIE_MBS2FREQ(pcie_link_speed[speed]);
+
+		opp = dev_pm_opp_find_freq_exact(pci->dev, freq * width, true);
+		if (!IS_ERR(opp)) {
+			ret = dev_pm_opp_set_opp(pci->dev, opp);
+			if (ret)
+				dev_err(pci->dev, "Failed to set opp: freq %ld ret %d\n",
+					dev_pm_opp_get_freq(opp), ret);
+			dev_pm_opp_put(opp);
+		}
+	} else {
+		ret = icc_set_bw(pcie->icc_mem, 0, width * QCOM_PCIE_LINK_SPEED_TO_BW(speed));
+		if (ret) {
+			dev_err(pci->dev, "failed to set interconnect bandwidth for pcie-mem: %d\n",
+				ret);
+		}
 	}
+
+	return;
 }
 
 static int qcom_pcie_link_transition_count(struct seq_file *s, void *data)
@@ -1471,8 +1486,10 @@ static void qcom_pcie_init_debugfs(struct qcom_pcie *pcie)
 static int qcom_pcie_probe(struct platform_device *pdev)
 {
 	const struct qcom_pcie_cfg *pcie_cfg;
+	unsigned long max_freq = INT_MAX;
 	struct device *dev = &pdev->dev;
 	struct qcom_pcie *pcie;
+	struct dev_pm_opp *opp;
 	struct dw_pcie_rp *pp;
 	struct resource *res;
 	struct dw_pcie *pci;
@@ -1539,9 +1556,36 @@ static int qcom_pcie_probe(struct platform_device *pdev)
 		goto err_pm_runtime_put;
 	}
 
-	ret = qcom_pcie_icc_init(pcie);
-	if (ret)
+	/* OPP table is optional */
+	ret = devm_pm_opp_of_add_table(dev);
+	if (ret && ret != -ENODEV) {
+		dev_err_probe(dev, ret, "Failed to add OPP table\n");
 		goto err_pm_runtime_put;
+	}
+
+	/*
+	 * Use highest OPP here if the OPP table is present. At the end of the probe(),
+	 * OPP will be updated using qcom_pcie_icc_opp_update().
+	 */
+	if (ret != -ENODEV) {
+		opp = dev_pm_opp_find_freq_floor(dev, &max_freq);
+		if (!IS_ERR(opp)) {
+			ret = dev_pm_opp_set_opp(dev, opp);
+			if (ret)
+				dev_err_probe(pci->dev, ret,
+					      "Failed to set opp: freq %ld\n",
+					      dev_pm_opp_get_freq(opp));
+			dev_pm_opp_put(opp);
+		}
+		pcie->opp_supported = true;
+	}
+
+	/* Skip ICC init if OPP is supported as ICC bw vote is handled by OPP framework */
+	if (!pcie->opp_supported) {
+		ret = qcom_pcie_icc_init(pcie);
+		if (ret)
+			goto err_pm_runtime_put;
+	}
 
 	ret = pcie->cfg->ops->get_resources(pcie);
 	if (ret)
@@ -1561,7 +1605,7 @@ static int qcom_pcie_probe(struct platform_device *pdev)
 		goto err_phy_exit;
 	}
 
-	qcom_pcie_icc_update(pcie);
+	qcom_pcie_icc_opp_update(pcie);
 
 	if (pcie->mhi)
 		qcom_pcie_init_debugfs(pcie);
@@ -1612,7 +1656,7 @@ static int qcom_pcie_suspend_noirq(struct device *dev)
 		pcie->suspended = true;
 	}
 
-	/* Remove cpu path vote after all the register access is done */
+	/* Remove CPU path vote after all the register access is done */
 	ret = icc_disable(pcie->icc_cpu);
 	if (ret) {
 		dev_err(dev, "failed to disable icc path of cpu-pcie: %d\n", ret);
@@ -1624,6 +1668,9 @@ static int qcom_pcie_suspend_noirq(struct device *dev)
 		return ret;
 	}
 
+	if (pcie->opp_supported)
+		dev_pm_opp_set_opp(pcie->pci->dev, NULL);
+
 	return 0;
 }
 
@@ -1646,7 +1693,7 @@ static int qcom_pcie_resume_noirq(struct device *dev)
 		pcie->suspended = false;
 	}
 
-	qcom_pcie_icc_update(pcie);
+	qcom_pcie_icc_opp_update(pcie);
 
 	return 0;
 }
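The probe-time handling of the optional OPP table in the diff above can be modelled in isolation. This is a simplified userspace sketch rather than the driver's code: `choose_vote_path()` and `enum vote_path` are hypothetical names, and `add_table_ret` stands in for the value returned by `devm_pm_opp_of_add_table()`.

```c
#include <errno.h>

/* Hypothetical type for illustration: which framework carries the votes. */
enum vote_path {
	VOTE_VIA_ICC,	/* no OPP table: fall back to direct ICC voting */
	VOTE_VIA_OPP,	/* OPP table present: OPP handles the ICC vote too */
};

/*
 * Returns 0 and stores the chosen path in *path, or returns the negative
 * errno unchanged when adding the table failed for any reason other than
 * the table being absent (-ENODEV).
 */
int choose_vote_path(int add_table_ret, enum vote_path *path)
{
	/* Any failure other than "no table" is a hard probe error. */
	if (add_table_ret && add_table_ret != -ENODEV)
		return add_table_ret;

	*path = (add_table_ret == -ENODEV) ? VOTE_VIA_ICC : VOTE_VIA_OPP;
	return 0;
}
```

The sketch shows why `-ENODEV` is treated differently from other errors: it marks the OPP table as absent rather than broken, so the ICC path is still initialised in that case.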
QCOM Resource Power Manager-hardened (RPMh) is a hardware block which
maintains hardware state of a regulator by performing max aggregation of
the requests made by all of the clients.

PCIe controller can operate on different RPMh performance state of power
domain based up on the speed of the link. And this performance state varies
from target to target.

It is manadate to scale the performance state based up on the PCIe speed
link operates so that SoC can run under optimum power conditions.

Add Operating Performance Points(OPP) support to vote for RPMh state based
upon the speed link is operating.

OPP can handle ICC bw voting also, so move ICC bw voting through OPP
framework if OPP entries are present.

In PCIe certain speeds like GEN1x2 & GEN2x1 or GEN3x2 & GEN4x1 use
same bw and frequency and thus the OPP entry, so use frequency based
search to reduce number of entries in the OPP table.

Don't initialize ICC if OPP is supported.

Signed-off-by: Krishna chaitanya chundru <quic_krichai@quicinc.com>
---
 drivers/pci/controller/dwc/pcie-qcom.c | 75 +++++++++++++++++++++++++++-------
 1 file changed, 61 insertions(+), 14 deletions(-)
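The frequency-based search described in the commit message can be illustrated with a small table lookup. This is a userspace sketch only: `struct opp_row`, `opp_table`, `find_exact()`, the key values and the performance levels are all invented for illustration and do not come from any device tree. The driver itself relies on `dev_pm_opp_find_freq_exact()`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table row: one entry per distinct aggregate key. */
struct opp_row {
	uint64_t key;		/* per-lane rate times lane count, arbitrary units */
	int perf_level;		/* made-up performance level */
};

/* Made-up entries; several link configurations map onto the same row. */
const struct opp_row opp_table[] = {
	{ 2000, 0 },	/* 2.5 GT/s x1 */
	{ 4000, 0 },	/* 2.5 GT/s x2 or 5.0 GT/s x1 */
	{ 8000, 1 },	/* 2.5 GT/s x4 or 5.0 GT/s x2 */
};

/* Exact-match search on the aggregate key; NULL when no row matches. */
const struct opp_row *find_exact(uint64_t per_lane, unsigned int width)
{
	uint64_t key = per_lane * width;
	size_t i;

	for (i = 0; i < sizeof(opp_table) / sizeof(opp_table[0]); i++) {
		if (opp_table[i].key == key)
			return &opp_table[i];
	}
	return NULL;
}
```

Because the lookup is keyed on the product rather than on the (speed, width) pair, `find_exact(2000, 2)` and `find_exact(4000, 1)` return the same row, so the table needs only one entry for both configurations. A key with no matching row, such as `find_exact(2000, 3)` here, yields NULL, mirroring the exact-match semantics of the search.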