Message ID | 20220930091050.193096-1-chris.chiu@canonical.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Bjorn Helgaas |
Headers | show |
Series | PCI/ASPM: Make SUNIX serial card acceptable latency unlimited | expand |
On Fri, Sep 30, 2022 at 05:10:50PM +0800, Chris Chiu wrote: > SUNIX serial card advertise L1 acceptable L0S exit latency to be > < 2us, L1 < 32us, but the link capability shows they're unlimited. > > It fails the latency check and prohibits the ASPM L1 from being > enabled. The L1 acceptable latency quirk fixes the issue. Hi Chris, help me understand what's going on here. The "Endpoint L1 Acceptable Latency" field in Device Capabilities is described like this (PCIe r6.0, sec 7.5.3.3): This field indicates the acceptable latency that an Endpoint can withstand due to the transition from L1 state to the L0 state. It is essentially an indirect measure of the Endpoint’s internal buffering. Power management software uses the reported L1 Acceptable Latency number to compare against the L1 Exit Latencies reported (see below) by all components comprising the data path from this Endpoint to the Root Complex Root Port to determine whether ASPM L1 entry can be used with no loss of performance. The "L1 Exit Latency" in Link Capabilities: This field indicates the L1 Exit Latency for the given PCI Express Link. The value reported indicates the length of time this Port requires to complete transition from ASPM L1 to L0. Apparently the SUNIX device advertises in Dev Cap that it can tolerate a maximum of 32us of L1 Exit Latency for the entire path from the SUNIX device to the Root Port, and in Link Cap that the SUNIX device itself may take more than 64us to exit L1. If that's accurate, then we should not enable L1 for that device because using L1 may cause buffer overflows, e.g., dropped characters. Per 03038d84ace7 ("PCI/ASPM: Make Intel DG2 L1 acceptable latency unlimited"), the existing users of aspm_l1_acceptable_latency() are graphics devices where I assume there would be little data coming from the device and buffering would not be an issue. It doesn't seem plausible to me that a serial device, where there is a continuous stream of incoming data, could tolerate an *unlimited* exit latency. I could certainly believe that Link Cap advertises "> 64us" of L1 Exit Latency when it really should advertise "< 32us" or something. But I don't know how we could be confident in the correct value without knowledge of the device design. Bjorn > Signed-off-by: Chris Chiu <chris.chiu@canonical.com> > --- > drivers/pci/quirks.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c > index 4944798e75b5..e1663e43846e 100644 > --- a/drivers/pci/quirks.c > +++ b/drivers/pci/quirks.c > @@ -5955,4 +5955,5 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56b0, aspm_l1_acceptable_latency > DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56b1, aspm_l1_acceptable_latency); > DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56c0, aspm_l1_acceptable_latency); > DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56c1, aspm_l1_acceptable_latency); > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_SUNIX, PCI_DEVICE_ID_SUNIX_1999, aspm_l1_acceptable_latency); > #endif > -- > 2.25.1 >
On Fri, Sep 30, 2022 at 11:18 PM Bjorn Helgaas <helgaas@kernel.org> wrote: > > On Fri, Sep 30, 2022 at 05:10:50PM +0800, Chris Chiu wrote: > > SUNIX serial card advertise L1 acceptable L0S exit latency to be > > < 2us, L1 < 32us, but the link capability shows they're unlimited. > > > > It fails the latency check and prohibits the ASPM L1 from being > > enabled. The L1 acceptable latency quirk fixes the issue. > > Hi Chris, help me understand what's going on here. > > The "Endpoint L1 Acceptable Latency" field in Device Capabilities is > described like this (PCIe r6.0, sec 7.5.3.3): > > This field indicates the acceptable latency that an Endpoint can > withstand due to the transition from L1 state to the L0 state. It is > essentially an indirect measure of the Endpoint’s internal > buffering. > > Power management software uses the reported L1 Acceptable Latency > number to compare against the L1 Exit Latencies reported (see below) > by all components comprising the data path from this Endpoint to the > Root Complex Root Port to determine whether ASPM L1 entry can be > used with no loss of performance. > > The "L1 Exit Latency" in Link Capabilities: > > This field indicates the L1 Exit Latency for the given PCI Express > Link. The value reported indicates the length of time this Port > requires to complete transition from ASPM L1 to L0. > > Apparently the SUNIX device advertises in Dev Cap that it can tolerate > a maximum of 32us of L1 Exit Latency for the entire path from the > SUNIX device to the Root Port, and in Link Cap that the SUNIX device > itself may take more than 64us to exit L1. > > If that's accurate, then we should not enable L1 for that device > because using L1 may cause buffer overflows, e.g., dropped characters. > > Per 03038d84ace7 ("PCI/ASPM: Make Intel DG2 L1 acceptable latency > unlimited"), the existing users of aspm_l1_acceptable_latency() are > graphics devices where I assume there would be little data coming from > the device and buffering would not be an issue. > > It doesn't seem plausible to me that a serial device, where there is a > continuous stream of incoming data, could tolerate an *unlimited* exit > latency. > > I could certainly believe that Link Cap advertises "> 64us" of L1 Exit > Latency when it really should advertise "< 32us" or something. But I > don't know how we could be confident in the correct value without > knowledge of the device design. > > Bjorn Hi Bjorn, Thanks for the clear explanation. I understand your concern and I'll try to reach the vendor for their comment about the device design. But the value "unlimited" for L1 exit latency with specified L1 acceptable latency on a self-claimed "ASPM L1" capable device really looks weird to me, I'd rather assume the 32us limit in DevCap is actually for LinkCap (L1 exit latency), and the acceptable latency is actually unlimited. I'll try to ask SUNIX if they intentionally program the latency this way and expect the device is unlikely to enter ASPM L1. Or they just accidentally program it with the incorrect value. Will keep updating here. Thanks Chris > > > Signed-off-by: Chris Chiu <chris.chiu@canonical.com> > > --- > > drivers/pci/quirks.c | 1 + > > 1 file changed, 1 insertion(+) > > > > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c > > index 4944798e75b5..e1663e43846e 100644 > > --- a/drivers/pci/quirks.c > > +++ b/drivers/pci/quirks.c > > @@ -5955,4 +5955,5 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56b0, aspm_l1_acceptable_latency > > DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56b1, aspm_l1_acceptable_latency); > > DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56c0, aspm_l1_acceptable_latency); > > DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56c1, aspm_l1_acceptable_latency); > > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_SUNIX, PCI_DEVICE_ID_SUNIX_1999, aspm_l1_acceptable_latency); > > #endif > > -- > > 2.25.1 > >
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index 4944798e75b5..e1663e43846e 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -5955,4 +5955,5 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56b0, aspm_l1_acceptable_latency DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56b1, aspm_l1_acceptable_latency); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56c0, aspm_l1_acceptable_latency); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56c1, aspm_l1_acceptable_latency); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_SUNIX, PCI_DEVICE_ID_SUNIX_1999, aspm_l1_acceptable_latency); #endif
SUNIX serial card advertise L1 acceptable L0S exit latency to be < 2us, L1 < 32us, but the link capability shows they're unlimited. It fails the latency check and prohibits the ASPM L1 from being enabled. The L1 acceptable latency quirk fixes the issue. Signed-off-by: Chris Chiu <chris.chiu@canonical.com> --- drivers/pci/quirks.c | 1 + 1 file changed, 1 insertion(+)