Message ID | 1499955692-26556-4-git-send-email-dingtianhong@huawei.com (mailing list archive) |
---|---|
State | New, archived |
Delegated to: | Bjorn Helgaas |
On Thu, Jul 13, 2017 at 7:21 AM, Ding Tianhong <dingtianhong@huawei.com> wrote:
> From: Casey Leedom <leedom@chelsio.com>
>
> cxgb4 Ethernet driver now queries PCIe configuration space to determine
> if it can send TLPs to it with the Relaxed Ordering Attribute set.
>
> Remove the enable_pcie_relaxed_ordering() to avoid enable PCIe Capability
> Device Control[Relaxed Ordering Enable] at probe routine, to make sure
> the driver will not send the Relaxed Ordering TLPs to the Root Complex which
> could not deal the Relaxed Ordering TLPs.
>
> Signed-off-by: Casey Leedom <leedom@chelsio.com>
> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>

Ding,

You can probably just drop this patch. If I am understanding Casey
correctly just the fact that the relaxed ordering enable bit is
cleared in the configuration should be enough to do this for the
device automatically.

- Alex

> ---
>  drivers/net/ethernet/chelsio/cxgb4/cxgb4.h      |  1 +
>  drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c | 23 +++++++++++++++++------
>  drivers/net/ethernet/chelsio/cxgb4/sge.c        |  5 +++--
>  3 files changed, 21 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
> index ef4be78..09ea62e 100644
> --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
> +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
> @@ -529,6 +529,7 @@ enum { /* adapter flags */
>  	USING_SOFT_PARAMS = (1 << 6),
>  	MASTER_PF = (1 << 7),
>  	FW_OFLD_CONN = (1 << 9),
> +	ROOT_NO_RELAXED_ORDERING = (1 << 10),
>  };
>
>  enum {
> diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
> index e403fa1..391e484 100644
> --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
> +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
> @@ -4654,11 +4654,6 @@ static void print_port_info(const struct net_device *dev)
>  		    dev->name, adap->params.vpd.id, adap->name, buf);
>  }
>
> -static void enable_pcie_relaxed_ordering(struct pci_dev *dev)
> -{
> -	pcie_capability_set_word(dev, PCI_EXP_DEVCTL, PCI_EXP_DEVCTL_RELAX_EN);
> -}
> -
>  /*
>   * Free the following resources:
>   * - memory used for tables
> @@ -4908,7 +4903,6 @@ static int init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
>  	}
>
>  	pci_enable_pcie_error_reporting(pdev);
> -	enable_pcie_relaxed_ordering(pdev);
>  	pci_set_master(pdev);
>  	pci_save_state(pdev);
>
> @@ -4947,6 +4941,23 @@ static int init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
>  	adapter->msg_enable = DFLT_MSG_ENABLE;
>  	memset(adapter->chan_map, 0xff, sizeof(adapter->chan_map));
>
> +	/* If possible, we use PCIe Relaxed Ordering Attribute to deliver
> +	 * Ingress Packet Data to Free List Buffers in order to allow for
> +	 * chipset performance optimizations between the Root Complex and
> +	 * Memory Controllers. (Messages to the associated Ingress Queue
> +	 * notifying new Packet Placement in the Free Lists Buffers will be
> +	 * send without the Relaxed Ordering Attribute thus guaranteeing that
> +	 * all preceding PCIe Transaction Layer Packets will be processed
> +	 * first.) But some Root Complexes have various issues with Upstream
> +	 * Transaction Layer Packets with the Relaxed Ordering Attribute set.
> +	 * The PCIe devices which under the Root Complexes will be cleared the
> +	 * Relaxed Ordering bit in the configuration space, So we check our
> +	 * PCIe configuration space to see if it's flagged with advice against
> +	 * using Relaxed Ordering.
> +	 */
> +	if (!pcie_relaxed_ordering_supported(pdev))
> +		adapter->flags |= ROOT_NO_RELAXED_ORDERING;
> +
>  	spin_lock_init(&adapter->stats_lock);
>  	spin_lock_init(&adapter->tid_release_lock);
>  	spin_lock_init(&adapter->win0_lock);
> diff --git a/drivers/net/ethernet/chelsio/cxgb4/sge.c b/drivers/net/ethernet/chelsio/cxgb4/sge.c
> index ede1220..4ef68f6 100644
> --- a/drivers/net/ethernet/chelsio/cxgb4/sge.c
> +++ b/drivers/net/ethernet/chelsio/cxgb4/sge.c
> @@ -2719,6 +2719,7 @@ int t4_sge_alloc_rxq(struct adapter *adap, struct sge_rspq *iq, bool fwevtq,
>  	struct fw_iq_cmd c;
>  	struct sge *s = &adap->sge;
>  	struct port_info *pi = netdev_priv(dev);
> +	int relaxed = !(adap->flags & ROOT_NO_RELAXED_ORDERING);
>
>  	/* Size needs to be multiple of 16, including status entry. */
>  	iq->size = roundup(iq->size, 16);
> @@ -2772,8 +2773,8 @@ int t4_sge_alloc_rxq(struct adapter *adap, struct sge_rspq *iq, bool fwevtq,
>
>  		flsz = fl->size / 8 + s->stat_len / sizeof(struct tx_desc);
>  		c.iqns_to_fl0congen |= htonl(FW_IQ_CMD_FL0PACKEN_F |
> -					     FW_IQ_CMD_FL0FETCHRO_F |
> -					     FW_IQ_CMD_FL0DATARO_F |
> +					     FW_IQ_CMD_FL0FETCHRO_V(relaxed) |
> +					     FW_IQ_CMD_FL0DATARO_V(relaxed) |
>  					     FW_IQ_CMD_FL0PADEN_F);
>  		if (cong >= 0)
>  			c.iqns_to_fl0congen |=
> --
> 1.8.3.1
>
On Thu, Jul 13, 2017 at 11:14 AM, Alexander Duyck <alexander.duyck@gmail.com> wrote:
> On Thu, Jul 13, 2017 at 7:21 AM, Ding Tianhong <dingtianhong@huawei.com> wrote:
>> From: Casey Leedom <leedom@chelsio.com>
>>
>> cxgb4 Ethernet driver now queries PCIe configuration space to determine
>> if it can send TLPs to it with the Relaxed Ordering Attribute set.
>>
>> Remove the enable_pcie_relaxed_ordering() to avoid enable PCIe Capability
>> Device Control[Relaxed Ordering Enable] at probe routine, to make sure
>> the driver will not send the Relaxed Ordering TLPs to the Root Complex which
>> could not deal the Relaxed Ordering TLPs.
>>
>> Signed-off-by: Casey Leedom <leedom@chelsio.com>
>> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
>
> Ding,
>
> You can probably just drop this patch. If I am understanding Casey
> correctly just the fact that the relaxed ordering enable bit is
> cleared in the configuration should be enough to do this for the
> device automatically.
>
> - Alex

Actually I take that back. I hadn't caught the most recent parts of
the thread. If this is good for Casey then this works for me.

- Alex
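For readers following the point about the enable bit: the sketch below is a userspace illustration, not driver code, of what "the relaxed ordering enable bit is cleared" means for a Device Control register value. The mask mirrors the kernel's PCI_EXP_DEVCTL_RELAX_EN (0x0010); the helper names and the sample register value are hypothetical, and the helper added elsewhere in this series is not reproduced here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same value as PCI_EXP_DEVCTL_RELAX_EN in the kernel's pci_regs.h:
 * bit 4 of the PCIe Device Control register, "Enable Relaxed Ordering". */
#define DEVCTL_RELAX_EN 0x0010u

/* Hypothetical helper: report whether a Device Control value still permits
 * the function to set the Relaxed Ordering Attribute on TLPs it initiates. */
static bool devctl_relaxed_ordering_enabled(uint16_t devctl)
{
	return (devctl & DEVCTL_RELAX_EN) != 0;
}

/* Hypothetical helper: the Device Control value after the bit has been
 * cleared, for example for a device below a problematic Root Complex. */
static uint16_t devctl_clear_relaxed_ordering(uint16_t devctl)
{
	return devctl & (uint16_t)~DEVCTL_RELAX_EN;
}
```

With a made-up register value of 0x2810, devctl_relaxed_ordering_enabled() is true; after devctl_clear_relaxed_ordering() the value is 0x2800 and the check is false. A driver can base its own decision on the same bit, as the patch does with the adapter flag.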
[[ Sorry for the Double Send: I forgot to switch to Plain Text. Have I mentioned how much I hate modern Web-based email agents? :-) -- Casey ]]

Yeah, I think this works for now. We'll stumble over what to do when we want to mix upstream TLPs without Relaxed Ordering Attributes directed at problematic Root Complexes, and Peer-to-Peer TLPs with Relaxed Ordering Attributes ... or vice versa depending on which target PCIe Device has issues with Relaxed Ordering.

Thanks for all the work!

Casey
Hi Casey, Alexander:

Thanks for the great efforts from both of you. It looks like we have finally reached a consensus. Could you please add a confirmation tag such as Reviewed-by? Thanks. :)

Ding

On 2017/7/14 2:44, Casey Leedom wrote:
> Yeah, I think this works for now. We'll stumble over what to do when we want to mix upstream TLPs without Relaxed Ordering Attributes directed at problematic Root Complexes, and Peer-to-Peer TLPs with Relaxed Ordering Attributes ... or vice versa depending on which target PCIe Device has issues with Relaxed Ordering.
>
> Thanks for all the work!
>
> Casey
Reviewed-by: Casey Leedom <leedom@chelsio.com>
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
index ef4be78..09ea62e 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
@@ -529,6 +529,7 @@ enum { /* adapter flags */
 	USING_SOFT_PARAMS = (1 << 6),
 	MASTER_PF = (1 << 7),
 	FW_OFLD_CONN = (1 << 9),
+	ROOT_NO_RELAXED_ORDERING = (1 << 10),
 };
 
 enum {
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index e403fa1..391e484 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -4654,11 +4654,6 @@ static void print_port_info(const struct net_device *dev)
 		    dev->name, adap->params.vpd.id, adap->name, buf);
 }
 
-static void enable_pcie_relaxed_ordering(struct pci_dev *dev)
-{
-	pcie_capability_set_word(dev, PCI_EXP_DEVCTL, PCI_EXP_DEVCTL_RELAX_EN);
-}
-
 /*
  * Free the following resources:
  * - memory used for tables
@@ -4908,7 +4903,6 @@ static int init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 	}
 
 	pci_enable_pcie_error_reporting(pdev);
-	enable_pcie_relaxed_ordering(pdev);
 	pci_set_master(pdev);
 	pci_save_state(pdev);
 
@@ -4947,6 +4941,23 @@ static int init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 	adapter->msg_enable = DFLT_MSG_ENABLE;
 	memset(adapter->chan_map, 0xff, sizeof(adapter->chan_map));
 
+	/* If possible, we use PCIe Relaxed Ordering Attribute to deliver
+	 * Ingress Packet Data to Free List Buffers in order to allow for
+	 * chipset performance optimizations between the Root Complex and
+	 * Memory Controllers. (Messages to the associated Ingress Queue
+	 * notifying new Packet Placement in the Free Lists Buffers will be
+	 * send without the Relaxed Ordering Attribute thus guaranteeing that
+	 * all preceding PCIe Transaction Layer Packets will be processed
+	 * first.) But some Root Complexes have various issues with Upstream
+	 * Transaction Layer Packets with the Relaxed Ordering Attribute set.
+	 * The PCIe devices which under the Root Complexes will be cleared the
+	 * Relaxed Ordering bit in the configuration space, So we check our
+	 * PCIe configuration space to see if it's flagged with advice against
+	 * using Relaxed Ordering.
+	 */
+	if (!pcie_relaxed_ordering_supported(pdev))
+		adapter->flags |= ROOT_NO_RELAXED_ORDERING;
+
 	spin_lock_init(&adapter->stats_lock);
 	spin_lock_init(&adapter->tid_release_lock);
 	spin_lock_init(&adapter->win0_lock);
diff --git a/drivers/net/ethernet/chelsio/cxgb4/sge.c b/drivers/net/ethernet/chelsio/cxgb4/sge.c
index ede1220..4ef68f6 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/sge.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/sge.c
@@ -2719,6 +2719,7 @@ int t4_sge_alloc_rxq(struct adapter *adap, struct sge_rspq *iq, bool fwevtq,
 	struct fw_iq_cmd c;
 	struct sge *s = &adap->sge;
 	struct port_info *pi = netdev_priv(dev);
+	int relaxed = !(adap->flags & ROOT_NO_RELAXED_ORDERING);
 
 	/* Size needs to be multiple of 16, including status entry. */
 	iq->size = roundup(iq->size, 16);
@@ -2772,8 +2773,8 @@ int t4_sge_alloc_rxq(struct adapter *adap, struct sge_rspq *iq, bool fwevtq,
 
 		flsz = fl->size / 8 + s->stat_len / sizeof(struct tx_desc);
 		c.iqns_to_fl0congen |= htonl(FW_IQ_CMD_FL0PACKEN_F |
-					     FW_IQ_CMD_FL0FETCHRO_F |
-					     FW_IQ_CMD_FL0DATARO_F |
+					     FW_IQ_CMD_FL0FETCHRO_V(relaxed) |
+					     FW_IQ_CMD_FL0DATARO_V(relaxed) |
 					     FW_IQ_CMD_FL0PADEN_F);
 		if (cong >= 0)
 			c.iqns_to_fl0congen |=
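
A note on the last hunk, for readers unfamiliar with the _F/_V macro convention it relies on: the _F form is the field with value 1 (flag always set), while _V(x) shifts a caller-supplied value into place, so _V(relaxed) sets the flag only when relaxed is 1. The sketch below is a standalone illustration of that pattern, assuming made-up DEMO_ names and made-up bit positions; the real FW_IQ_CMD_* shifts are defined in the driver's firmware API header and are not shown in this thread.

```c
#include <stdint.h>

/* Illustrative field positions only (assumptions, not the real values). */
#define DEMO_FL0FETCHRO_S	6
#define DEMO_FL0FETCHRO_V(x)	((uint32_t)(x) << DEMO_FL0FETCHRO_S)
#define DEMO_FL0FETCHRO_F	DEMO_FL0FETCHRO_V(1U)

#define DEMO_FL0DATARO_S	5
#define DEMO_FL0DATARO_V(x)	((uint32_t)(x) << DEMO_FL0DATARO_S)
#define DEMO_FL0DATARO_F	DEMO_FL0DATARO_V(1U)

/* Stand-in for the adapter flag added by the patch (same bit position). */
#define DEMO_ROOT_NO_RELAXED_ORDERING	(1u << 10)

/* Compute the two Relaxed Ordering bits the way the patched code does:
 * both are set unless the adapter was flagged at probe time. */
static uint32_t demo_fl0_ro_bits(uint32_t adapter_flags)
{
	uint32_t relaxed = !(adapter_flags & DEMO_ROOT_NO_RELAXED_ORDERING);

	return DEMO_FL0FETCHRO_V(relaxed) | DEMO_FL0DATARO_V(relaxed);
}
```

With no flag set, demo_fl0_ro_bits() yields the same bits as the old unconditional _F | _F expression; with the flag set it yields 0, so neither Free List fetches nor data writes request Relaxed Ordering.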