Message ID | 20241028025337.6372-6-ki.chiang65@gmail.com (mailing list archive) |
---|---|
State | New |
Series | xhci: Some improvement for Etron xHCI host |
Hi,

That's a bug I'm familiar with.

> Unplugging a USB3.0 webcam while streaming results in errors
> like this

Not only unplugging but also any random error due to EMI or bad cable.

> If an error is detected while processing an one-TRB isoc TD,
> the Etron xHC generates two transfer events for the TRB that
> the error was detected on. The first event is "USB Transcation
> Error", and the second event is "Success".

IIRC, it wasn't just Transaction Errors but any sort of error, like
Babble or Bandwidth Overrun. But not sure about Missed Service, etc.

And IIRC I confirmed that it was *not* the case on Short Packet.

Also, I'm 99% sure the problem is not limited to one-TRB TDs, but
it occurs every time there is an error on the last TRB of any TD.

> As a solution, we can set the flag after the first error event
> and don't print the error message after the second event if the
> flag is set.

Yes, but I think it would be better to use error_mid_td instead of
last_td_was_short, so that the TD is only freed on the final event,
not on the first one.

The spec is clear that we should only free TRBs when the xHC is done
with them. Maybe it wouldn't be a problem in this case, and it surely
wouldn't be worse than what happens with Etron today, but IMO it could
be a real (even if rare) problem in other cases when this flag is used,
so I would rather remove the flag and handle short packets as per spec.

Regards,
Michal
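To illustrate the alternative suggested in the reply above (keep the TD queued after the first error event and give it back only on the final event), here is a minimal stand-alone sketch. It is a toy model, not driver code: `struct defer_td`, `defer_handle_event()` and the `more_events_expected` parameter are hypothetical names invented for this example, and the real error_mid_td handling in xhci-ring.c may differ in detail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a transfer descriptor; not the kernel's struct. */
struct defer_td {
	bool error_pending;	/* an error was seen, final event still expected */
	int status;		/* 0 on success, negative once an error was seen */
	bool given_back;	/* true once the TD has been completed and freed */
};

/*
 * Process one transfer event for a TD.
 * is_error:             the event carries an error completion code.
 * more_events_expected: assumption of this sketch, true when the controller
 *                       is known to send another event for the same TD.
 * Returns true when the TD was given back by this call.
 */
static bool defer_handle_event(struct defer_td *td, bool is_error,
			       bool more_events_expected)
{
	assert(td != NULL && !td->given_back);

	if (is_error)
		td->status = -1;	/* remember the error for the final give-back */

	if (is_error && more_events_expected) {
		td->error_pending = true;	/* controller is not done with the TD yet */
		return false;
	}

	/* Final event: only now is it safe to release the TD. */
	td->error_pending = false;
	td->given_back = true;
	return true;
}
```

With this ordering the error status recorded on the first event is what the caller sees when the TD is finally given back, and the TD stays on the list for as long as the controller may still reference it.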
Hi,

Thank you for the review.

Michał Pecio <michal.pecio@gmail.com> wrote on Mon, Oct 28, 2024 at 5:54 PM:
>
> Hi,
>
> That's a bug I'm familiar with.
>
> > Unplugging a USB3.0 webcam while streaming results in errors
> > like this
>
> Not only unplugging but also any random error due to EMI or bad cable.
>
> > If an error is detected while processing an one-TRB isoc TD,
> > the Etron xHC generates two transfer events for the TRB that
> > the error was detected on. The first event is "USB Transcation
> > Error", and the second event is "Success".
>
> IIRC, it wasn't just Transaction Errors but any sort of error, like
> Babble or Bandwidth Overrun. But not sure about Missed Service, etc.
>
> And IIRC I confirmed that it was *not* the case on Short Packet.

Yes, it is not.

>
> Also, I'm 99% sure the problem is not limited to one-TRB TDs, but
> it occurs every time there is an error on the last TRB of any TD.

Yes, this can happen, I didn't account for this scenario.

>
> > As a solution, we can set the flag after the first error event
> > and don't print the error message after the second event if the
> > flag is set.
>
> Yes, but I think it would be better to use error_mid_td instead of
> last_td_was_short, so that the TD is only freed on the final event,
> not on the first one.
>
> The spec is clear that we should only free TRBs when the xHC is done
> with them. Maybe it wouldn't be a problem in this case, and it surely
> wouldn't be worse than what happens with Etron today, but IMO it could
> be a real (even if rare) problem in other cases when this flag is used,
> so I would rather remove the flag and handle short packets as per spec.

Thank you for the explanation and suggestion. Maybe I should start trying
to use error_mid_td to solve this problem.

>
> Regards,
> Michal

Thanks,
Kuangyi Chiang
On 28.10.2024 4.53, Kuangyi Chiang wrote:
> Unplugging a USB3.0 webcam while streaming results in errors
> like this:
>
> [ 132.646387] xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 18 comp_code 13
> [ 132.646446] xhci_hcd 0000:03:00.0: Looking for event-dma 000000002fdf8630 trb-start 000000002fdf8640 trb-end 000000002fdf8650 seg-start 000000002fdf8000 seg-end 000000002fdf8ff0
> [ 132.646560] xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 18 comp_code 13
> [ 132.646568] xhci_hcd 0000:03:00.0: Looking for event-dma 000000002fdf8660 trb-start 000000002fdf8670 trb-end 000000002fdf8670 seg-start 000000002fdf8000 seg-end 000000002fdf8ff0
>
> If an error is detected while processing an one-TRB isoc TD,
> the Etron xHC generates two transfer events for the TRB that
> the error was detected on. The first event is "USB Transcation
> Error", and the second event is "Success".
>
> The xHCI driver will handle the TD after the first event and
> remove it from its internal list, and then print an "Transfer
> event TRB DMA ptr not part of current TD" error message after
> the second event.
>
> As a solution, we can set the flag after the first error event
> and don't print the error message after the second event if
> the flag is set.
>
> Commit ad808333d820 ("Intel xhci: Ignore spurious successful
> event.") implements a similar mechanism that we can reuse to
> solve this problem since short transfer and transfer error
> doesn't occur concurrently. Also, rename the flag to make it
> more meaningful.
>
> Check if the XHCI_ETRON_HOST quirk flag is set before invoking
> the workaround in process_isoc_td().
>
> This patch doesn't affect other host controllers that have the
> XHCI_SPURIOUS_SUCCESS quirk flag applied.
>
> Signed-off-by: Kuangyi Chiang <ki.chiang65@gmail.com>

I'm leaving this out of the series due to both ongoing discussion about this patch,
and because it conflicts with another series touching handle_tx_event()

All other patches in series are added

Thanks
Mathias
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 9e132b08bfde..33fa8a11c934 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -2437,6 +2437,10 @@ static int process_isoc_td(struct xhci_hcd *xhci, struct xhci_virt_ep *ep,
 		sum_trbs_for_length = true;
 		if (ep_trb != td->last_trb)
 			td->error_mid_td = true;
+		if ((xhci->quirks & XHCI_ETRON_HOST) &&
+		    td->urb->dev->speed >= USB_SPEED_SUPER &&
+		    td->first_trb == td->last_trb)
+			ep_ring->spurious_event = true;
 		break;
 	case COMP_STOPPED:
 		sum_trbs_for_length = true;
@@ -2655,8 +2659,8 @@ static int handle_tx_event(struct xhci_hcd *xhci,
 	case COMP_SUCCESS:
 		if (EVENT_TRB_LEN(le32_to_cpu(event->transfer_len)) != 0) {
 			trb_comp_code = COMP_SHORT_PACKET;
-			xhci_dbg(xhci, "Successful completion on short TX for slot %u ep %u with last td short %d\n",
-				 slot_id, ep_index, ep_ring->last_td_was_short);
+			xhci_dbg(xhci, "Successful completion on short TX for slot %u ep %u with spurious event %d\n",
+				 slot_id, ep_index, ep_ring->spurious_event);
 		}
 		break;
 	case COMP_SHORT_PACKET:
@@ -2801,13 +2805,13 @@ static int handle_tx_event(struct xhci_hcd *xhci,
 		if (list_empty(&ep_ring->td_list)) {
 			/*
 			 * Don't print wanings if ring is empty due to a stopped endpoint generating an
-			 * extra completion event if the device was suspended. Or, a event for the last TRB
-			 * of a short TD we already got a short event for. The short TD is already removed
-			 * from the TD list.
+			 * extra completion event if the device was suspended. Or, the spurious event flag
+			 * is set at the last TD of the TD list due to a short transfer or an one-TRB isoc
+			 * TD error, and such TD is already removed from the TD list.
 			 */
 			if (trb_comp_code != COMP_STOPPED &&
 			    trb_comp_code != COMP_STOPPED_LENGTH_INVALID &&
-			    !ep_ring->last_td_was_short) {
+			    !ep_ring->spurious_event) {
 				xhci_warn(xhci, "Event TRB for slot %u ep %u with no TDs queued\n",
 					  slot_id, ep_index);
 			}
@@ -2851,11 +2855,11 @@ static int handle_tx_event(struct xhci_hcd *xhci,

 		/*
 		 * Some hosts give a spurious success event after a short
-		 * transfer. Ignore it.
+		 * transfer or an one-TRB isoc TD error. Ignore it.
 		 */
 		if ((xhci->quirks & XHCI_SPURIOUS_SUCCESS) &&
-		    ep_ring->last_td_was_short) {
-			ep_ring->last_td_was_short = false;
+		    ep_ring->spurious_event) {
+			ep_ring->spurious_event = false;
 			return 0;
 		}

@@ -2884,9 +2888,9 @@ static int handle_tx_event(struct xhci_hcd *xhci,
 	} while (ep->skip);

 	if (trb_comp_code == COMP_SHORT_PACKET)
-		ep_ring->last_td_was_short = true;
+		ep_ring->spurious_event = true;
 	else
-		ep_ring->last_td_was_short = false;
+		ep_ring->spurious_event = false;

 	ep_trb = &ep_seg->trbs[(ep_trb_dma - ep_seg->dma) / sizeof(*ep_trb)];
 	trace_xhci_handle_transfer(ep_ring, (struct xhci_generic_trb *) ep_trb);
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index 4f5b732e8944..dca9091b8134 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -1359,7 +1359,7 @@ struct xhci_ring {
 	unsigned int		num_trbs_free; /* used only by xhci DbC */
 	unsigned int		bounce_buf_len;
 	enum xhci_ring_type	type;
-	bool			last_td_was_short;
+	bool			spurious_event;
 	struct radix_tree_root	*trb_address_map;
 };

Unplugging a USB3.0 webcam while streaming results in errors
like this:

[ 132.646387] xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 18 comp_code 13
[ 132.646446] xhci_hcd 0000:03:00.0: Looking for event-dma 000000002fdf8630 trb-start 000000002fdf8640 trb-end 000000002fdf8650 seg-start 000000002fdf8000 seg-end 000000002fdf8ff0
[ 132.646560] xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 18 comp_code 13
[ 132.646568] xhci_hcd 0000:03:00.0: Looking for event-dma 000000002fdf8660 trb-start 000000002fdf8670 trb-end 000000002fdf8670 seg-start 000000002fdf8000 seg-end 000000002fdf8ff0

If an error is detected while processing a one-TRB isoc TD,
the Etron xHC generates two transfer events for the TRB that
the error was detected on. The first event is "USB Transaction
Error", and the second event is "Success".

The xHCI driver handles the TD after the first event and
removes it from its internal list, and then prints a "Transfer
event TRB DMA ptr not part of current TD" error message after
the second event.

As a solution, we can set a flag after the first error event
and skip printing the error message after the second event if
the flag is set.

Commit ad808333d820 ("Intel xhci: Ignore spurious successful
event.") implements a similar mechanism that we can reuse to
solve this problem, since a short transfer and a transfer error
don't occur concurrently. Also, rename the flag to make it
more meaningful.

Check if the XHCI_ETRON_HOST quirk flag is set before invoking
the workaround in process_isoc_td().

This patch doesn't affect other host controllers that have the
XHCI_SPURIOUS_SUCCESS quirk flag applied.

Signed-off-by: Kuangyi Chiang <ki.chiang65@gmail.com>

---
 drivers/usb/host/xhci-ring.c | 26 +++++++++++++++-----------
 drivers/usb/host/xhci.h      |  2 +-
 2 files changed, 16 insertions(+), 12 deletions(-)
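The flag mechanism described in the commit message can be sketched outside the kernel as a small stand-alone model. This is only an illustration under simplifying assumptions: `struct toy_ring`, `toy_handle_event()` and the `toy_comp_code` values are hypothetical names, TDs are identified by plain integers instead of DMA addresses, and the quirk, speed and one-TRB checks of the actual patch are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical completion codes; the values are arbitrary, not xHCI's. */
enum toy_comp_code { TOY_SUCCESS, TOY_SHORT_PACKET, TOY_TRANSACTION_ERROR };

/* Toy transfer ring: TDs are numbered, head_td is the oldest one still queued. */
struct toy_ring {
	int head_td;		/* id of the oldest queued TD */
	int tail_td;		/* id one past the newest queued TD */
	bool spurious_event;	/* previous event may be followed by a duplicate */
	int warnings;		/* count of "not part of current TD" style errors */
};

/* Returns true if the event completed a TD, false if it was ignored or unmatched. */
static bool toy_handle_event(struct toy_ring *ring, int td_id,
			     enum toy_comp_code code)
{
	assert(ring != NULL);

	bool matches = ring->head_td < ring->tail_td && td_id == ring->head_td;

	if (!matches) {
		/* A Success right after a flagged event is treated as the duplicate. */
		if (code == TOY_SUCCESS && ring->spurious_event) {
			ring->spurious_event = false;
			return false;
		}
		ring->warnings++;	/* genuinely unexpected event */
		return false;
	}

	/* Complete the TD and remember whether a duplicate may follow. */
	ring->head_td++;
	ring->spurious_event = (code != TOY_SUCCESS);
	return true;
}
```

In this model the TD is already removed when the duplicate arrives, which is exactly the property questioned in the review: the flag only suppresses the message, it does not delay the give-back.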