[v6,20/22] usb: dwc2: host: Properly set even/odd frame

Message ID	1454034013-24657-21-git-send-email-dianders@chromium.org (mailing list archive)
State	New, archived
Headers	show Return-Path: <linux-rockchip-bounces+patchwork-linux-rockchip=patchwork.kernel.org@lists.infradead.org> From: Douglas Anderson <dianders@chromium.org> To: John Youn <John.Youn@synopsys.com>, balbi@ti.com, kever.yang@rock-chips.com Subject: [PATCH v6 20/22] usb: dwc2: host: Properly set even/odd frame Date: Thu, 28 Jan 2016 18:20:11 -0800 Message-Id: <1454034013-24657-21-git-send-email-dianders@chromium.org> In-Reply-To: <1454034013-24657-1-git-send-email-dianders@chromium.org> References: <1454034013-24657-1-git-send-email-dianders@chromium.org> Precedence: list Cc: huangtao@rock-chips.com, stefan.wahren@i2se.com, heiko@sntech.de, johnyoun@synopsys.com, gregkh@linuxfoundation.org, ming.lei@canonical.com, linux-usb@vger.kernel.org, Douglas Anderson <dianders@chromium.org>, linux-kernel@vger.kernel.org, linux-rockchip@lists.infradead.org, yousaf.kaukab@intel.com, stern@rowland.harvard.edu, linux-rpi-kernel@lists.infradead.org, gregory.herrero@intel.com, william.wu@rock-chips.com, Julius Werner <jwerner@chromium.org>, dinguyen@opensource.altera.com MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "Linux-rockchip" <linux-rockchip-bounces@lists.infradead.org> Errors-To: linux-rockchip-bounces+patchwork-linux-rockchip=patchwork.kernel.org@lists.infradead.org

Message ID

1454034013-24657-21-git-send-email-dianders@chromium.org (mailing list archive)

State

New, archived

Headers

From: Douglas Anderson <dianders@chromium.org>
To: John Youn <John.Youn@synopsys.com>, balbi@ti.com,
	kever.yang@rock-chips.com
Subject: [PATCH v6 20/22] usb: dwc2: host: Properly set even/odd frame
Date: Thu, 28 Jan 2016 18:20:11 -0800
Message-Id: <1454034013-24657-21-git-send-email-dianders@chromium.org>
In-Reply-To: <1454034013-24657-1-git-send-email-dianders@chromium.org>
References: <1454034013-24657-1-git-send-email-dianders@chromium.org>
Precedence: list
Cc: huangtao@rock-chips.com, stefan.wahren@i2se.com, heiko@sntech.de,
	johnyoun@synopsys.com, gregkh@linuxfoundation.org,
	ming.lei@canonical.com, 
	linux-usb@vger.kernel.org, Douglas Anderson <dianders@chromium.org>, 
	linux-kernel@vger.kernel.org, linux-rockchip@lists.infradead.org,
	yousaf.kaukab@intel.com, stern@rowland.harvard.edu,
	linux-rpi-kernel@lists.infradead.org, gregory.herrero@intel.com,
	william.wu@rock-chips.com, Julius Werner <jwerner@chromium.org>,
	dinguyen@opensource.altera.com
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Sender: "Linux-rockchip" <linux-rockchip-bounces@lists.infradead.org>
Errors-To: linux-rockchip-bounces+patchwork-linux-rockchip=patchwork.kernel.org@lists.infradead.org

Commit Message

Doug Anderson Jan. 29, 2016, 2:20 a.m. UTC

When setting up ISO and INT transfers dwc2 needs to specify whether the
transfer is for an even or an odd frame (or microframe if the controller
is running in high speed mode).

The controller appears to use this as a simple way to figure out if a
transfer should happen right away (in the current microframe) or should
happen at the start of the next microframe.  Said another way:

- If you set "odd" and the current frame number is odd it appears that
  the controller will try to transfer right away.  Same thing if you set
  "even" and the current frame number is even.
- If the oddness you set and the oddness of the frame number are
  _different_, the transfer will be delayed until the frame number
  changes.

As I understand it, the above technique allows you to plan ahead of time
where possible by always working on the next frame.  ...but it still
allows you to properly respond immediately to things that happened in
the previous frame.

The old dwc2_hc_set_even_odd_frame() didn't really handle this concept.
It always looked at the frame number and setup the transfer to happen in
the next frame.  In some cases that meant that certain transactions
would be transferred in the wrong frame.

We'll try our best to set the even / odd to do the transfer in the
scheduled frame.  If that fails then we'll do an ugly "schedule ASAP".
We'll also modify the scheduler code to handle this and not try to
schedule a second transfer for the same frame.

Note that this change relies on the work to redo the microframe
scheduler.  It can work atop ("usb: dwc2: host: Manage frame nums better
in scheduler") but it works even better after ("usb: dwc2: host: Totally
redo the microframe scheduler").

With this change my stressful USB test (USB webcam + USB audio +
keyboards) has less audio crackling than before.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
---
Changes in v6:
- Add Heiko's Tested-by.
- Add Stefan's Tested-by.

Changes in v5: None
Changes in v4:
- Properly set even/odd frame new for v4.

Changes in v3: None
Changes in v2: None

 drivers/usb/dwc2/core.c      | 92 +++++++++++++++++++++++++++++++++++++++++++-
 drivers/usb/dwc2/hcd_queue.c | 11 +++++-
 2 files changed, 100 insertions(+), 3 deletions(-)

Comments

Kever Yang Feb. 2, 2016, 7:46 a.m. UTC | #1

Doug,

On 01/29/2016 10:20 AM, Douglas Anderson wrote:
> When setting up ISO and INT transfers dwc2 needs to specify whether the
> transfer is for an even or an odd frame (or microframe if the controller
> is running in high speed mode).
>
> The controller appears to use this as a simple way to figure out if a
> transfer should happen right away (in the current microframe) or should
> happen at the start of the next microframe.  Said another way:
>
> - If you set "odd" and the current frame number is odd it appears that
>    the controller will try to transfer right away.  Same thing if you set
>    "even" and the current frame number is even.
> - If the oddness you set and the oddness of the frame number are
>    _different_, the transfer will be delayed until the frame number
>    changes.
>
> As I understand it, the above technique allows you to plan ahead of time
> where possible by always working on the next frame.  ...but it still
> allows you to properly respond immediately to things that happened in
> the previous frame.
>
> The old dwc2_hc_set_even_odd_frame() didn't really handle this concept.
> It always looked at the frame number and setup the transfer to happen in
> the next frame.  In some cases that meant that certain transactions
> would be transferred in the wrong frame.
>
> We'll try our best to set the even / odd to do the transfer in the
> scheduled frame.  If that fails then we'll do an ugly "schedule ASAP".
> We'll also modify the scheduler code to handle this and not try to
> schedule a second transfer for the same frame.
>
> Note that this change relies on the work to redo the microframe
> scheduler.  It can work atop ("usb: dwc2: host: Manage frame nums better
> in scheduler") but it works even better after ("usb: dwc2: host: Totally
> redo the microframe scheduler").
>
> With this change my stressful USB test (USB webcam + USB audio +
> keyboards) has less audio crackling than before.
Seems this really help for your case?

Do you check if the transfer can happen right in the current frame? I 
know it's
quite difficult to check it, but this changes what I know for the dwc core
schedule the transaction.

In dwc_otgbook, Interrupt OUT Transactions(also similar for Int IN, Iso 
IN/OUT)
in DMA Mode, the normal Interrupt OUT operation says:
The DWC_otg host attempts to send out the OUT token in the beginning of next
odd frame/microframe.

So I'm confuse about if the dwc core can do the transaction at the same 
frame
of host channel initialized or not.

Thanks,
- Kever

> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> Tested-by: Heiko Stuebner <heiko@sntech.de>
> Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
> ---
> Changes in v6:
> - Add Heiko's Tested-by.
> - Add Stefan's Tested-by.
>
> Changes in v5: None
> Changes in v4:
> - Properly set even/odd frame new for v4.
>
> Changes in v3: None
> Changes in v2: None
>
>   drivers/usb/dwc2/core.c      | 92 +++++++++++++++++++++++++++++++++++++++++++-
>   drivers/usb/dwc2/hcd_queue.c | 11 +++++-
>   2 files changed, 100 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/usb/dwc2/core.c b/drivers/usb/dwc2/core.c
> index a5db20f12ee4..c143f26bd9d9 100644
> --- a/drivers/usb/dwc2/core.c
> +++ b/drivers/usb/dwc2/core.c
> @@ -1703,9 +1703,97 @@ static void dwc2_hc_set_even_odd_frame(struct dwc2_hsotg *hsotg,
>   {
>   	if (chan->ep_type == USB_ENDPOINT_XFER_INT ||
>   	    chan->ep_type == USB_ENDPOINT_XFER_ISOC) {
> -		/* 1 if _next_ frame is odd, 0 if it's even */
> -		if (!(dwc2_hcd_get_frame_number(hsotg) & 0x1))
> +		int host_speed;
> +		int xfer_ns;
> +		int xfer_us;
> +		int bytes_in_fifo;
> +		u16 fifo_space;
> +		u16 frame_number;
> +		u16 wire_frame;
> +
> +		/*
> +		 * Try to figure out if we're an even or odd frame. If we set
> +		 * even and the current frame number is even the the transfer
> +		 * will happen immediately.  Similar if both are odd. If one is
> +		 * even and the other is odd then the transfer will happen when
> +		 * the frame number ticks.
> +		 *
> +		 * There's a bit of a balancing act to get this right.
> +		 * Sometimes we may want to send data in the current frame (AK
> +		 * right away).  We might want to do this if the frame number
> +		 * _just_ ticked, but we might also want to do this in order
> +		 * to continue a split transaction that happened late in a
> +		 * microframe (so we didn't know to queue the next transfer
> +		 * until the frame number had ticked).  The problem is that we
> +		 * need a lot of knowledge to know if there's actually still
> +		 * time to send things or if it would be better to wait until
> +		 * the next frame.
> +		 *
> +		 * We can look at how much time is left in the current frame
> +		 * and make a guess about whether we'll have time to transfer.
> +		 * We'll do that.
> +		 */
> +
> +		/* Get speed host is running at */
> +		host_speed = (chan->speed != USB_SPEED_HIGH &&
> +			      !chan->do_split) ? chan->speed : USB_SPEED_HIGH;
> +
> +		/* See how many bytes are in the periodic FIFO right now */
> +		fifo_space = (dwc2_readl(hsotg->regs + HPTXSTS) &
> +			      TXSTS_FSPCAVAIL_MASK) >> TXSTS_FSPCAVAIL_SHIFT;
> +		bytes_in_fifo = sizeof(u32) *
> +				(hsotg->core_params->host_perio_tx_fifo_size -
> +				 fifo_space);
> +
> +		/*
> +		 * Roughly estimate bus time for everything in the periodic
> +		 * queue + our new transfer.  This is "rough" because we're
> +		 * using a function that makes takes into account IN/OUT
> +		 * and INT/ISO and we're just slamming in one value for all
> +		 * transfers.  This should be an over-estimate and that should
> +		 * be OK, but we can probably tighten it.
> +		 */
> +		xfer_ns = usb_calc_bus_time(host_speed, false, false,
> +					    chan->xfer_len + bytes_in_fifo);
> +		xfer_us = NS_TO_US(xfer_ns);
> +
> +		/* See what frame number we'll be at by the time we finish */
> +		frame_number = dwc2_hcd_get_future_frame_number(hsotg, xfer_us);
> +
> +		/* This is when we were scheduled to be on the wire */
> +		wire_frame = dwc2_frame_num_inc(chan->qh->next_active_frame, 1);
> +
> +		/*
> +		 * If we'd finish _after_ the frame we're scheduled in then
> +		 * it's hopeless.  Just schedule right away and hope for the
> +		 * best.  Note that it _might_ be wise to call back into the
> +		 * scheduler to pick a better frame, but this is better than
> +		 * nothing.
> +		 */
> +		if (dwc2_frame_num_gt(frame_number, wire_frame)) {
> +			dwc2_sch_vdbg(hsotg,
> +				      "QH=%p EO MISS fr=%04x=>%04x (%+d)\n",
> +				      chan->qh, wire_frame, frame_number,
> +				      dwc2_frame_num_dec(frame_number,
> +							 wire_frame));
> +			wire_frame = frame_number;
> +
> +			/*
> +			 * We picked a different frame number; communicate this
> +			 * back to the scheduler so it doesn't try to schedule
> +			 * another in the same frame.
> +			 *
> +			 * Remember that next_active_frame is 1 before the wire
> +			 * frame.
> +			 */
> +			chan->qh->next_active_frame =
> +				dwc2_frame_num_dec(frame_number, 1);
> +		}
> +
> +		if (wire_frame & 1)
>   			*hcchar |= HCCHAR_ODDFRM;
> +		else
> +			*hcchar &= ~HCCHAR_ODDFRM;
>   	}
>   }
>   
> diff --git a/drivers/usb/dwc2/hcd_queue.c b/drivers/usb/dwc2/hcd_queue.c
> index 3abb34a5fc5b..5f909747b5a4 100644
> --- a/drivers/usb/dwc2/hcd_queue.c
> +++ b/drivers/usb/dwc2/hcd_queue.c
> @@ -985,6 +985,14 @@ static int dwc2_next_periodic_start(struct dwc2_hsotg *hsotg,
>   	 *   and next_active_frame are always 1 frame before we want things
>   	 *   to be active and we assume we can still get scheduled in the
>   	 *   current frame number.
> +	 * - It's possible for start_active_frame (now incremented) to be
> +	 *   next_active_frame if we got an EO MISS (even_odd miss) which
> +	 *   basically means that we detected there wasn't enough time for
> +	 *   the last packet and dwc2_hc_set_even_odd_frame() rescheduled us
> +	 *   at the last second.  We want to make sure we don't schedule
> +	 *   another transfer for the same frame.  My test webcam doesn't seem
> +	 *   terribly upset by missing a transfer but really doesn't like when
> +	 *   we do two transfers in the same frame.
>   	 * - Some misses are expected.  Specifically, in order to work
>   	 *   perfectly dwc2 really needs quite spectacular interrupt latency
>   	 *   requirements.  It needs to be able to handle its interrupts
> @@ -995,7 +1003,8 @@ static int dwc2_next_periodic_start(struct dwc2_hsotg *hsotg,
>   	 *   guarantee that a system will have interrupt latency < 125 us, so
>   	 *   we have to be robust to some misses.
>   	 */
> -	if (dwc2_frame_num_gt(prev_frame_number, qh->start_active_frame)) {
> +	if (qh->start_active_frame == qh->next_active_frame ||
> +	    dwc2_frame_num_gt(prev_frame_number, qh->start_active_frame)) {
>   		u16 ideal_start = qh->start_active_frame;
>   
>   		/* Adjust interval as per gcd with plan length. */

Doug Anderson Feb. 2, 2016, 10:47 p.m. UTC | #2

Kever,

On Mon, Feb 1, 2016 at 11:46 PM, Kever Yang <kever.yang@rock-chips.com> wrote:
> Doug,
>
>
> On 01/29/2016 10:20 AM, Douglas Anderson wrote:
>>
>> When setting up ISO and INT transfers dwc2 needs to specify whether the
>> transfer is for an even or an odd frame (or microframe if the controller
>> is running in high speed mode).
>>
>> The controller appears to use this as a simple way to figure out if a
>> transfer should happen right away (in the current microframe) or should
>> happen at the start of the next microframe.  Said another way:
>>
>> - If you set "odd" and the current frame number is odd it appears that
>>    the controller will try to transfer right away.  Same thing if you set
>>    "even" and the current frame number is even.
>> - If the oddness you set and the oddness of the frame number are
>>    _different_, the transfer will be delayed until the frame number
>>    changes.
>>
>> As I understand it, the above technique allows you to plan ahead of time
>> where possible by always working on the next frame.  ...but it still
>> allows you to properly respond immediately to things that happened in
>> the previous frame.
>>
>> The old dwc2_hc_set_even_odd_frame() didn't really handle this concept.
>> It always looked at the frame number and setup the transfer to happen in
>> the next frame.  In some cases that meant that certain transactions
>> would be transferred in the wrong frame.
>>
>> We'll try our best to set the even / odd to do the transfer in the
>> scheduled frame.  If that fails then we'll do an ugly "schedule ASAP".
>> We'll also modify the scheduler code to handle this and not try to
>> schedule a second transfer for the same frame.
>>
>> Note that this change relies on the work to redo the microframe
>> scheduler.  It can work atop ("usb: dwc2: host: Manage frame nums better
>> in scheduler") but it works even better after ("usb: dwc2: host: Totally
>> redo the microframe scheduler").
>>
>> With this change my stressful USB test (USB webcam + USB audio +
>> keyboards) has less audio crackling than before.
>
> Seems this really help for your case?

Yes, I believe it does.  Of course my test case is pretty "black box"
for the most part in that I play music on youtube while having a
webcam open and several USB input devices connected.  I then try to
decide whether I hear more static or less static.  ...clearly a less
subjective test would be better...

* I tried with http://crosreview.com/325451 (see below) and I hear
more static with "use_old = true" than with "use_old = "false".

* I tried with this entire patch reverted and I hear about the same
static as with "use_old = true".

Note that counting reported MISS lines from my logging also shows that
the new code is better...

> Do you check if the transfer can happen right in the current frame? I know
> it's
> quite difficult to check it, but this changes what I know for the dwc core
> schedule the transaction.

Yes.  I just tried again, too.  I coded up
<https://chromium-review.googlesource.com/325451> and included it.  I
then opened up a USB webcam.

With things set to the old way:

  115.355370  QH=dc6ba8c0 next(0) fn=10cb, sch=10ca=>10cb (+1) miss=0
  115.355373  QH=dc6ba8c0 IMM ready fn=10cb, nxt=10cb
  115.355518  QH=dc6ba8c0 next(0) fn=10cc, sch=10cb=>10cc (+1) miss=0
  115.355522  QH=dc6ba8c0 IMM ready fn=10cc, nxt=10cc
  115.355637  QH=dc6ba8c0 next(0) fn=10cd, sch=10cc=>10cd (+1) miss=0
  115.355641  QH=dc6ba8c0 IMM ready fn=10cd, nxt=10cd
  115.355857  QH=dc6ba8c0 next(0) fn=10ce, sch=10cd=>10ce (+1) miss=0
  115.355859  QH=dc6ba8c0 IMM ready fn=10ce, nxt=10ce
  115.355867  QH=dc6ba8c0, wire=10cf, old_wire=10d0, EO diff (use OLD)
  115.355870  QH=dc6ba8c0 EO MISS w/ old (10ce != 10cf)
  115.356037  QH=dc6ba8c0 next(0) fn=10d0, sch=10cf=>10d0 (+1) miss=1 MISS
  115.356039  QH=dc6ba8c0 IMM ready fn=10d0, nxt=10d0
  115.356169  QH=dc6ba8c0 next(0) fn=10d1, sch=10d0=>10d1 (+1) miss=0
  115.356170  QH=dc6ba8c0 IMM ready fn=10d1, nxt=10d1
  115.356269  QH=dc6ba8c0 next(0) fn=10d2, sch=10d1=>10d2 (+1) miss=0
  115.356273  QH=dc6ba8c0 IMM ready fn=10d2, nxt=10d2
  115.356404  QH=dc6ba8c0 next(0) fn=10d3, sch=10d2=>10d3 (+1) miss=0
  115.356407  QH=dc6ba8c0 IMM ready fn=10d3, nxt=10d3

With the new way:

   87.814741  QH=e2fd7880 next(0) fn=32e4, sch=32e3=>32e4 (+1) miss=0
   87.814744  QH=e2fd7880 IMM ready fn=32e4, nxt=32e4
   87.814858  QH=e2fd7880 next(0) fn=32e5, sch=32e4=>32e5 (+1) miss=0
   87.814862  QH=e2fd7880 IMM ready fn=32e5, nxt=32e5
   87.815010  QH=e2fd7880 next(0) fn=32e6, sch=32e5=>32e6 (+1) miss=0
   87.815012  QH=e2fd7880 IMM ready fn=32e6, nxt=32e6
   87.815220  QH=e2fd7880 next(0) fn=32e8, sch=32e6=>32e7 (+1) miss=0
   87.815222  QH=e2fd7880 IMM ready fn=32e8, nxt=32e7
   87.815230  QH=e2fd7880, wire=32e8, old_wire=32e9, EO diff (use NEW)
   87.815278  QH=e2fd7880 next(0) fn=32e8, sch=32e7=>32e8 (+1) miss=0
   87.815280  QH=e2fd7880 IMM ready fn=32e8, nxt=32e8
   87.815390  QH=e2fd7880 next(0) fn=32e9, sch=32e8=>32e9 (+1) miss=0
   87.815391  QH=e2fd7880 IMM ready fn=32e9, nxt=32e9
   87.815491  QH=e2fd7880 next(0) fn=32ea, sch=32e9=>32ea (+1) miss=0
   87.815493  QH=e2fd7880 IMM ready fn=32ea, nxt=32ea
   87.815635  QH=e2fd7880 next(0) fn=32eb, sch=32ea=>32eb (+1) miss=0
   87.815638  QH=e2fd7880 IMM ready fn=32eb, nxt=32eb

Note that with my TEST-ONLY patch the old way is still _slightly_
different in that I still communicate back to the scheduler with:

  chan->qh->next_active_frame = now_frame;

The old code didn't used to do that.  If I don't do that then you
you'll just stay in an inconsistent state for a while where things are
going on the wire 1 frame later than we think they are.

Also note that above you can see that the new way is indeed able to
schedule things in the current microframe.  Looking one line at a
time:

   87.815012  QH=e2fd7880 IMM ready fn=32e6, nxt=32e6

QH e2fd7880 is going straight to the ready queue.  Actual frame number
in hardware is 32e6.  next_active_frame = 32e6 which means we ideally
want to give it to hardware in 32e6 and wire frame is 32e7.

   87.815220  QH=e2fd7880 next(0) fn=32e8, sch=32e6=>32e7 (+1) miss=0
   87.815222  QH=e2fd7880 IMM ready fn=32e8, nxt=32e7

Frame number in hardware is now 32e8.  We'd like to give the next
transfer to hardware in 32e7 to transfer on the wire at 32e8, but
that's obviously impossible.  We will try to give it right away.

   87.815230  QH=e2fd7880, wire=32e8, old_wire=32e9, EO diff (use NEW)

Showing a difference in the old way.  We'll choose "even" to have the
packet go on the wire (expecting 32e8).

   87.815278  QH=e2fd7880 next(0) fn=32e8, sch=32e7=>32e8 (+1) miss=0
   87.815280  QH=e2fd7880 IMM ready fn=32e8, nxt=32e8

We got a response back and are ready to schedule the next transfer and
it's still 32e8!  That means that transfer must have happened (as
expected) in 32e8.  Whew!  Give the next transfer to hardware hoping
for 32e9 wire.

   87.815390  QH=e2fd7880 next(0) fn=32e9, sch=32e8=>32e9 (+1) miss=0

Now at hardware 32e9 and ready to schedule the next...

> In dwc_otgbook, Interrupt OUT Transactions(also similar for Int IN, Iso
> IN/OUT)
> in DMA Mode, the normal Interrupt OUT operation says:
> The DWC_otg host attempts to send out the OUT token in the beginning of next
> odd frame/microframe.
>
> So I'm confuse about if the dwc core can do the transaction at the same
> frame
> of host channel initialized or not.

The docbook is obviously way too terse here, but the above experiment
shows that the hardware is designed in the only sane way that it could
be designed.

Why do I say that this is the only sane way for the hardware to work?
I think all the following is true (please correct any errors):

A) HW only lets you specify even/odd which means you choose between
two frame to send the packet.  Two possible ways HW could be
implemented: "sane" way means you can send a packet in frame "x" and
"x + 1".  "insane" way means you can send a packet in frame "x + 1"
and "x + 2" but not frame "x"

B) In some cases (especially with regards to SPLIT transfers), we need
to use the result of a transfer in uFrame "x" to decide what to do
about uFrame "x + 1".  Specifically for IN transfers I think we can't
know for sure whether we'll get back all of our data in uFrame "x" or
whether we'll only get part of the data and need uFrame "x + 1".

C) It's possible to schedule 100us worth of periodic transfers in one
125us uFrame.

D) We can't know the result of a transfer until that transfer is done.

So above basically means that we might have a periodic transfer where
we get the result of the transfer 100us into a uFrame.  We've now got
to quickly queue up the transfer for the next uFrame.  If hardware was
designed in the "insane" way then we'd need an interrupt latency of <
25 us since once the frame ticked we'd no longer be able to schedule.
If hardware was designed in the "sane" way then we'd "only" need an
interrupt latency of 125 us since we could continue to schedule even
partway through the current frame.

Also note that if there's any chance that a periodic transfer ends
later than 100 us into a frame (like if a non-periodic transfer snuck
in there because we were out of periodic channels) then the above
problem becomes even more extreme.

-Doug

Kever Yang Feb. 3, 2016, 7:47 a.m. UTC | #3

Doug,

Thanks for your detail debug information, pls add my Reviewed-by for 
this patch.

Thanks,
- Kever
On 02/03/2016 06:47 AM, Doug Anderson wrote:
> Kever,
>
> On Mon, Feb 1, 2016 at 11:46 PM, Kever Yang <kever.yang@rock-chips.com> wrote:
>> Doug,
>>
>>
>> On 01/29/2016 10:20 AM, Douglas Anderson wrote:
>>> When setting up ISO and INT transfers dwc2 needs to specify whether the
>>> transfer is for an even or an odd frame (or microframe if the controller
>>> is running in high speed mode).
>>>
>>> The controller appears to use this as a simple way to figure out if a
>>> transfer should happen right away (in the current microframe) or should
>>> happen at the start of the next microframe.  Said another way:
>>>
>>> - If you set "odd" and the current frame number is odd it appears that
>>>     the controller will try to transfer right away.  Same thing if you set
>>>     "even" and the current frame number is even.
>>> - If the oddness you set and the oddness of the frame number are
>>>     _different_, the transfer will be delayed until the frame number
>>>     changes.
>>>
>>> As I understand it, the above technique allows you to plan ahead of time
>>> where possible by always working on the next frame.  ...but it still
>>> allows you to properly respond immediately to things that happened in
>>> the previous frame.
>>>
>>> The old dwc2_hc_set_even_odd_frame() didn't really handle this concept.
>>> It always looked at the frame number and setup the transfer to happen in
>>> the next frame.  In some cases that meant that certain transactions
>>> would be transferred in the wrong frame.
>>>
>>> We'll try our best to set the even / odd to do the transfer in the
>>> scheduled frame.  If that fails then we'll do an ugly "schedule ASAP".
>>> We'll also modify the scheduler code to handle this and not try to
>>> schedule a second transfer for the same frame.
>>>
>>> Note that this change relies on the work to redo the microframe
>>> scheduler.  It can work atop ("usb: dwc2: host: Manage frame nums better
>>> in scheduler") but it works even better after ("usb: dwc2: host: Totally
>>> redo the microframe scheduler").
>>>
>>> With this change my stressful USB test (USB webcam + USB audio +
>>> keyboards) has less audio crackling than before.
>> Seems this really help for your case?
> Yes, I believe it does.  Of course my test case is pretty "black box"
> for the most part in that I play music on youtube while having a
> webcam open and several USB input devices connected.  I then try to
> decide whether I hear more static or less static.  ...clearly a less
> subjective test would be better...
>
> * I tried with http://crosreview.com/325451 (see below) and I hear
> more static with "use_old = true" than with "use_old = "false".
>
> * I tried with this entire patch reverted and I hear about the same
> static as with "use_old = true".
>
> Note that counting reported MISS lines from my logging also shows that
> the new code is better...
>
>
>> Do you check if the transfer can happen right in the current frame? I know
>> it's
>> quite difficult to check it, but this changes what I know for the dwc core
>> schedule the transaction.
> Yes.  I just tried again, too.  I coded up
> <https://chromium-review.googlesource.com/325451> and included it.  I
> then opened up a USB webcam.
>
> With things set to the old way:
>
>    115.355370  QH=dc6ba8c0 next(0) fn=10cb, sch=10ca=>10cb (+1) miss=0
>    115.355373  QH=dc6ba8c0 IMM ready fn=10cb, nxt=10cb
>    115.355518  QH=dc6ba8c0 next(0) fn=10cc, sch=10cb=>10cc (+1) miss=0
>    115.355522  QH=dc6ba8c0 IMM ready fn=10cc, nxt=10cc
>    115.355637  QH=dc6ba8c0 next(0) fn=10cd, sch=10cc=>10cd (+1) miss=0
>    115.355641  QH=dc6ba8c0 IMM ready fn=10cd, nxt=10cd
>    115.355857  QH=dc6ba8c0 next(0) fn=10ce, sch=10cd=>10ce (+1) miss=0
>    115.355859  QH=dc6ba8c0 IMM ready fn=10ce, nxt=10ce
>    115.355867  QH=dc6ba8c0, wire=10cf, old_wire=10d0, EO diff (use OLD)
>    115.355870  QH=dc6ba8c0 EO MISS w/ old (10ce != 10cf)
>    115.356037  QH=dc6ba8c0 next(0) fn=10d0, sch=10cf=>10d0 (+1) miss=1 MISS
>    115.356039  QH=dc6ba8c0 IMM ready fn=10d0, nxt=10d0
>    115.356169  QH=dc6ba8c0 next(0) fn=10d1, sch=10d0=>10d1 (+1) miss=0
>    115.356170  QH=dc6ba8c0 IMM ready fn=10d1, nxt=10d1
>    115.356269  QH=dc6ba8c0 next(0) fn=10d2, sch=10d1=>10d2 (+1) miss=0
>    115.356273  QH=dc6ba8c0 IMM ready fn=10d2, nxt=10d2
>    115.356404  QH=dc6ba8c0 next(0) fn=10d3, sch=10d2=>10d3 (+1) miss=0
>    115.356407  QH=dc6ba8c0 IMM ready fn=10d3, nxt=10d3
>
> With the new way:
>
>     87.814741  QH=e2fd7880 next(0) fn=32e4, sch=32e3=>32e4 (+1) miss=0
>     87.814744  QH=e2fd7880 IMM ready fn=32e4, nxt=32e4
>     87.814858  QH=e2fd7880 next(0) fn=32e5, sch=32e4=>32e5 (+1) miss=0
>     87.814862  QH=e2fd7880 IMM ready fn=32e5, nxt=32e5
>     87.815010  QH=e2fd7880 next(0) fn=32e6, sch=32e5=>32e6 (+1) miss=0
>     87.815012  QH=e2fd7880 IMM ready fn=32e6, nxt=32e6
>     87.815220  QH=e2fd7880 next(0) fn=32e8, sch=32e6=>32e7 (+1) miss=0
>     87.815222  QH=e2fd7880 IMM ready fn=32e8, nxt=32e7
>     87.815230  QH=e2fd7880, wire=32e8, old_wire=32e9, EO diff (use NEW)
>     87.815278  QH=e2fd7880 next(0) fn=32e8, sch=32e7=>32e8 (+1) miss=0
>     87.815280  QH=e2fd7880 IMM ready fn=32e8, nxt=32e8
>     87.815390  QH=e2fd7880 next(0) fn=32e9, sch=32e8=>32e9 (+1) miss=0
>     87.815391  QH=e2fd7880 IMM ready fn=32e9, nxt=32e9
>     87.815491  QH=e2fd7880 next(0) fn=32ea, sch=32e9=>32ea (+1) miss=0
>     87.815493  QH=e2fd7880 IMM ready fn=32ea, nxt=32ea
>     87.815635  QH=e2fd7880 next(0) fn=32eb, sch=32ea=>32eb (+1) miss=0
>     87.815638  QH=e2fd7880 IMM ready fn=32eb, nxt=32eb
>
>
> Note that with my TEST-ONLY patch the old way is still _slightly_
> different in that I still communicate back to the scheduler with:
>
>    chan->qh->next_active_frame = now_frame;
>
> The old code didn't used to do that.  If I don't do that then you
> you'll just stay in an inconsistent state for a while where things are
> going on the wire 1 frame later than we think they are.
>
>
> Also note that above you can see that the new way is indeed able to
> schedule things in the current microframe.  Looking one line at a
> time:
>
>
>     87.815012  QH=e2fd7880 IMM ready fn=32e6, nxt=32e6
>
> QH e2fd7880 is going straight to the ready queue.  Actual frame number
> in hardware is 32e6.  next_active_frame = 32e6 which means we ideally
> want to give it to hardware in 32e6 and wire frame is 32e7.
>
>
>     87.815220  QH=e2fd7880 next(0) fn=32e8, sch=32e6=>32e7 (+1) miss=0
>     87.815222  QH=e2fd7880 IMM ready fn=32e8, nxt=32e7
>
> Frame number in hardware is now 32e8.  We'd like to give the next
> transfer to hardware in 32e7 to transfer on the wire at 32e8, but
> that's obviously impossible.  We will try to give it right away.
>
>
>     87.815230  QH=e2fd7880, wire=32e8, old_wire=32e9, EO diff (use NEW)
>
> Showing a difference in the old way.  We'll choose "even" to have the
> packet go on the wire (expecting 32e8).
>
>
>     87.815278  QH=e2fd7880 next(0) fn=32e8, sch=32e7=>32e8 (+1) miss=0
>     87.815280  QH=e2fd7880 IMM ready fn=32e8, nxt=32e8
>
> We got a response back and are ready to schedule the next transfer and
> it's still 32e8!  That means that transfer must have happened (as
> expected) in 32e8.  Whew!  Give the next transfer to hardware hoping
> for 32e9 wire.
>
>
>     87.815390  QH=e2fd7880 next(0) fn=32e9, sch=32e8=>32e9 (+1) miss=0
>
> Now at hardware 32e9 and ready to schedule the next...
>
>
>
>> In dwc_otgbook, Interrupt OUT Transactions(also similar for Int IN, Iso
>> IN/OUT)
>> in DMA Mode, the normal Interrupt OUT operation says:
>> The DWC_otg host attempts to send out the OUT token in the beginning of next
>> odd frame/microframe.
>>
>> So I'm confuse about if the dwc core can do the transaction at the same
>> frame
>> of host channel initialized or not.
> The docbook is obviously way too terse here, but the above experiment
> shows that the hardware is designed in the only sane way that it could
> be designed.
>
> Why do I say that this is the only sane way for the hardware to work?
> I think all the following is true (please correct any errors):
>
> A) HW only lets you specify even/odd which means you choose between
> two frame to send the packet.  Two possible ways HW could be
> implemented: "sane" way means you can send a packet in frame "x" and
> "x + 1".  "insane" way means you can send a packet in frame "x + 1"
> and "x + 2" but not frame "x"
>
> B) In some cases (especially with regards to SPLIT transfers), we need
> to use the result of a transfer in uFrame "x" to decide what to do
> about uFrame "x + 1".  Specifically for IN transfers I think we can't
> know for sure whether we'll get back all of our data in uFrame "x" or
> whether we'll only get part of the data and need uFrame "x + 1".
>
> C) It's possible to schedule 100us worth of periodic transfers in one
> 125us uFrame.
>
> D) We can't know the result of a transfer until that transfer is done.
>
>
> So above basically means that we might have a periodic transfer where
> we get the result of the transfer 100us into a uFrame.  We've now got
> to quickly queue up the transfer for the next uFrame.  If hardware was
> designed in the "insane" way then we'd need an interrupt latency of <
> 25 us since once the frame ticked we'd no longer be able to schedule.
> If hardware was designed in the "sane" way then we'd "only" need an
> interrupt latency of 125 us since we could continue to schedule even
> partway through the current frame.
>
> Also note that if there's any chance that a periodic transfer ends
> later than 100 us into a frame (like if a non-periodic transfer snuck
> in there because we were out of periodic channels) then the above
> problem becomes even more extreme.
>
>
>
> -Doug
>
> _______________________________________________
> Linux-rockchip mailing list
> Linux-rockchip@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-rockchip
>

diff --git a/drivers/usb/dwc2/core.c b/drivers/usb/dwc2/core.c
index a5db20f12ee4..c143f26bd9d9 100644
--- a/drivers/usb/dwc2/core.c
+++ b/drivers/usb/dwc2/core.c
@@ -1703,9 +1703,97 @@  static void dwc2_hc_set_even_odd_frame(struct dwc2_hsotg *hsotg,
 {
 	if (chan->ep_type == USB_ENDPOINT_XFER_INT ||
 	    chan->ep_type == USB_ENDPOINT_XFER_ISOC) {
-		/* 1 if _next_ frame is odd, 0 if it's even */
-		if (!(dwc2_hcd_get_frame_number(hsotg) & 0x1))
+		int host_speed;
+		int xfer_ns;
+		int xfer_us;
+		int bytes_in_fifo;
+		u16 fifo_space;
+		u16 frame_number;
+		u16 wire_frame;
+
+		/*
+		 * Try to figure out if we're an even or odd frame. If we set
+		 * even and the current frame number is even the the transfer
+		 * will happen immediately.  Similar if both are odd. If one is
+		 * even and the other is odd then the transfer will happen when
+		 * the frame number ticks.
+		 *
+		 * There's a bit of a balancing act to get this right.
+		 * Sometimes we may want to send data in the current frame (AK
+		 * right away).  We might want to do this if the frame number
+		 * _just_ ticked, but we might also want to do this in order
+		 * to continue a split transaction that happened late in a
+		 * microframe (so we didn't know to queue the next transfer
+		 * until the frame number had ticked).  The problem is that we
+		 * need a lot of knowledge to know if there's actually still
+		 * time to send things or if it would be better to wait until
+		 * the next frame.
+		 *
+		 * We can look at how much time is left in the current frame
+		 * and make a guess about whether we'll have time to transfer.
+		 * We'll do that.
+		 */
+
+		/* Get speed host is running at */
+		host_speed = (chan->speed != USB_SPEED_HIGH &&
+			      !chan->do_split) ? chan->speed : USB_SPEED_HIGH;
+
+		/* See how many bytes are in the periodic FIFO right now */
+		fifo_space = (dwc2_readl(hsotg->regs + HPTXSTS) &
+			      TXSTS_FSPCAVAIL_MASK) >> TXSTS_FSPCAVAIL_SHIFT;
+		bytes_in_fifo = sizeof(u32) *
+				(hsotg->core_params->host_perio_tx_fifo_size -
+				 fifo_space);
+
+		/*
+		 * Roughly estimate bus time for everything in the periodic
+		 * queue + our new transfer.  This is "rough" because we're
+		 * using a function that makes takes into account IN/OUT
+		 * and INT/ISO and we're just slamming in one value for all
+		 * transfers.  This should be an over-estimate and that should
+		 * be OK, but we can probably tighten it.
+		 */
+		xfer_ns = usb_calc_bus_time(host_speed, false, false,
+					    chan->xfer_len + bytes_in_fifo);
+		xfer_us = NS_TO_US(xfer_ns);
+
+		/* See what frame number we'll be at by the time we finish */
+		frame_number = dwc2_hcd_get_future_frame_number(hsotg, xfer_us);
+
+		/* This is when we were scheduled to be on the wire */
+		wire_frame = dwc2_frame_num_inc(chan->qh->next_active_frame, 1);
+
+		/*
+		 * If we'd finish _after_ the frame we're scheduled in then
+		 * it's hopeless.  Just schedule right away and hope for the
+		 * best.  Note that it _might_ be wise to call back into the
+		 * scheduler to pick a better frame, but this is better than
+		 * nothing.
+		 */
+		if (dwc2_frame_num_gt(frame_number, wire_frame)) {
+			dwc2_sch_vdbg(hsotg,
+				      "QH=%p EO MISS fr=%04x=>%04x (%+d)\n",
+				      chan->qh, wire_frame, frame_number,
+				      dwc2_frame_num_dec(frame_number,
+							 wire_frame));
+			wire_frame = frame_number;
+
+			/*
+			 * We picked a different frame number; communicate this
+			 * back to the scheduler so it doesn't try to schedule
+			 * another in the same frame.
+			 *
+			 * Remember that next_active_frame is 1 before the wire
+			 * frame.
+			 */
+			chan->qh->next_active_frame =
+				dwc2_frame_num_dec(frame_number, 1);
+		}
+
+		if (wire_frame & 1)
 			*hcchar |= HCCHAR_ODDFRM;
+		else
+			*hcchar &= ~HCCHAR_ODDFRM;
 	}
 }
 
diff --git a/drivers/usb/dwc2/hcd_queue.c b/drivers/usb/dwc2/hcd_queue.c
index 3abb34a5fc5b..5f909747b5a4 100644
--- a/drivers/usb/dwc2/hcd_queue.c
+++ b/drivers/usb/dwc2/hcd_queue.c
@@ -985,6 +985,14 @@  static int dwc2_next_periodic_start(struct dwc2_hsotg *hsotg,
 	 *   and next_active_frame are always 1 frame before we want things
 	 *   to be active and we assume we can still get scheduled in the
 	 *   current frame number.
+	 * - It's possible for start_active_frame (now incremented) to be
+	 *   next_active_frame if we got an EO MISS (even_odd miss) which
+	 *   basically means that we detected there wasn't enough time for
+	 *   the last packet and dwc2_hc_set_even_odd_frame() rescheduled us
+	 *   at the last second.  We want to make sure we don't schedule
+	 *   another transfer for the same frame.  My test webcam doesn't seem
+	 *   terribly upset by missing a transfer but really doesn't like when
+	 *   we do two transfers in the same frame.
 	 * - Some misses are expected.  Specifically, in order to work
 	 *   perfectly dwc2 really needs quite spectacular interrupt latency
 	 *   requirements.  It needs to be able to handle its interrupts
@@ -995,7 +1003,8 @@  static int dwc2_next_periodic_start(struct dwc2_hsotg *hsotg,
 	 *   guarantee that a system will have interrupt latency < 125 us, so
 	 *   we have to be robust to some misses.
 	 */
-	if (dwc2_frame_num_gt(prev_frame_number, qh->start_active_frame)) {
+	if (qh->start_active_frame == qh->next_active_frame ||
+	    dwc2_frame_num_gt(prev_frame_number, qh->start_active_frame)) {
 		u16 ideal_start = qh->start_active_frame;
 
 		/* Adjust interval as per gcd with plan length. */

[v6,20/22] usb: dwc2: host: Properly set even/odd frame

Commit Message

Comments

Patch