diff mbox

USB: EHCI: fix for leaking isochronous data

Message ID 514B3DBB.3060302@web.de (mailing list archive)
State New, archived
Headers show

Commit Message

Soeren Moch March 21, 2013, 5:04 p.m. UTC
On 03/17/13 18:36, Alan Stern wrote:
> On Sun, 17 Mar 2013, Soeren Moch wrote:
>
>> For each device only one isochronous endpoint is used (EP IN4, 1x 940
>> Bytes, Interval 1).
>> When the ENOMEM error occurs, a huge number of iTDs is in the free_list
>> of one stream. This number is much higher than the 2*M entries, which
>> should be there according to your description.
>
> Okay, but how did they get there?  With each URB requiring 9 iTDs, and
> about 5 URBs active at any time, there should be about 5*9 = 45 iTDs in
> use and 2*9 = 18 iTDs on the free list.  By the time each URB
> completes, it should have released all 9 iTDs back to the free list,
> and each time an URB is submitted, it should be able to acquire all 9
> of the iTDs that it needs from the free list -- it shouldn't have to
> allocate any from the DMA pool.
>
> Looks like you'll have to investigate what's going on inside
> itd_urb_transaction().  Print out some useful information whenever the
> size of stream->free_list is above 50, such as the value of num_itds,
> how many of the loop iterations could get an iTD from the free list,
> and the value of itd->frame in the case where the "goto alloc_itd"
> statement is followed.
>
> It might be a good idea also to print out the size of the free list in
> itd_complete(), where it calls ehci_urb_done(), and include the value
> of ehci->now_frame.
>

Now I found out what is going on here:

In itd_urb_transaction() we allocate 9 iTDs for each URB with 
number_of_packets == 64 in my case. The iTDs are added to 
sched->td_list. For a frame-aligned scheduling we need 8 iTDs, the 9th 
one is released back to the front of the streams free_list in 
iso_sched_free(). This iTD was cleared after allocation and has a frame 
number of 0 now. So for each allocation when now_frame == 0 we allocate 
from the dma_pool, not from the free_list. The attached patch 
invalidates the frame number in each iTD before it is sent to the 
scheduler. This fixes the problem without the need to iterate over a iTD 
list.

Signed-off-by: Soeren Moch <smoch@web.de>

Comments

Jason Cooper March 21, 2013, 5:33 p.m. UTC | #1
On Thu, Mar 21, 2013 at 06:04:59PM +0100, Soeren Moch wrote:
> On 03/17/13 18:36, Alan Stern wrote:
> >On Sun, 17 Mar 2013, Soeren Moch wrote:
> >
> >>For each device only one isochronous endpoint is used (EP IN4, 1x 940
> >>Bytes, Interval 1).
> >>When the ENOMEM error occurs, a huge number of iTDs is in the free_list
> >>of one stream. This number is much higher than the 2*M entries, which
> >>should be there according to your description.
> >
> >Okay, but how did they get there?  With each URB requiring 9 iTDs, and
> >about 5 URBs active at any time, there should be about 5*9 = 45 iTDs in
> >use and 2*9 = 18 iTDs on the free list.  By the time each URB
> >completes, it should have released all 9 iTDs back to the free list,
> >and each time an URB is submitted, it should be able to acquire all 9
> >of the iTDs that it needs from the free list -- it shouldn't have to
> >allocate any from the DMA pool.
> >
> >Looks like you'll have to investigate what's going on inside
> >itd_urb_transaction().  Print out some useful information whenever the
> >size of stream->free_list is above 50, such as the value of num_itds,
> >how many of the loop iterations could get an iTD from the free list,
> >and the value of itd->frame in the case where the "goto alloc_itd"
> >statement is followed.
> >
> >It might be a good idea also to print out the size of the free list in
> >itd_complete(), where it calls ehci_urb_done(), and include the value
> >of ehci->now_frame.
> >
> 
> Now I found out what is going on here:
> 
> In itd_urb_transaction() we allocate 9 iTDs for each URB with
> number_of_packets == 64 in my case. The iTDs are added to
> sched->td_list. For a frame-aligned scheduling we need 8 iTDs, the
> 9th one is released back to the front of the streams free_list in
> iso_sched_free(). This iTD was cleared after allocation and has a
> frame number of 0 now. So for each allocation when now_frame == 0 we
> allocate from the dma_pool, not from the free_list. The attached
> patch invalidates the frame number in each iTD before it is sent to
> the scheduler. This fixes the problem without the need to iterate
> over a iTD list.
> 
> Signed-off-by: Soeren Moch <smoch@web.de>

Wow!  Great work Soeren!  Talk about a long road to a small fix.  Thanks
for keeping after it.

thx,

Jason.

> 
> 
> 

> --- linux-3.9.0-rc3-guru/drivers/usb/host/ehci-sched.c.orig	2013-03-21 17:36:21.000000000 +0100
> +++ linux-3.9.0-rc3-guru/drivers/usb/host/ehci-sched.c	2013-03-21 17:38:56.000000000 +0100
> @@ -1214,6 +1214,7 @@ itd_urb_transaction (
>  
>  		memset (itd, 0, sizeof *itd);
>  		itd->itd_dma = itd_dma;
> +		itd->frame = -1;
>  		list_add (&itd->itd_list, &sched->td_list);
>  	}
>  	spin_unlock_irqrestore (&ehci->lock, flags);
> @@ -1915,6 +1916,7 @@ sitd_urb_transaction (
>  
>  		memset (sitd, 0, sizeof *sitd);
>  		sitd->sitd_dma = sitd_dma;
> +		sitd->frame = -1;
>  		list_add (&sitd->sitd_list, &iso_sched->td_list);
>  	}
>
Arnd Bergmann March 21, 2013, 7:10 p.m. UTC | #2
On Thursday 21 March 2013, Jason Cooper wrote:
> On Thu, Mar 21, 2013 at 06:04:59PM +0100, Soeren Moch wrote:

> > 
> > Now I found out what is going on here:
> > 
> > In itd_urb_transaction() we allocate 9 iTDs for each URB with
> > number_of_packets == 64 in my case. The iTDs are added to
> > sched->td_list. For a frame-aligned scheduling we need 8 iTDs, the
> > 9th one is released back to the front of the streams free_list in
> > iso_sched_free(). This iTD was cleared after allocation and has a
> > frame number of 0 now. So for each allocation when now_frame == 0 we
> > allocate from the dma_pool, not from the free_list. The attached
> > patch invalidates the frame number in each iTD before it is sent to
> > the scheduler. This fixes the problem without the need to iterate
> > over a iTD list.
> > 
> > Signed-off-by: Soeren Moch <smoch@web.de>
> 
> Wow!  Great work Soeren!  Talk about a long road to a small fix.  Thanks
> for keeping after it.

+1

I hardly understand half of the description above, but that much sounds
plausible. Is this a bug fix that should get backported to stable kernels?

	Arnd
Michael Nazzareno Trimarchi March 21, 2013, 7:34 p.m. UTC | #3
Hi

On 21/03/13 20:10, Arnd Bergmann wrote:
> On Thursday 21 March 2013, Jason Cooper wrote:
>> On Thu, Mar 21, 2013 at 06:04:59PM +0100, Soeren Moch wrote:
> 
>>>
>>> Now I found out what is going on here:
>>>
>>> In itd_urb_transaction() we allocate 9 iTDs for each URB with
>>> number_of_packets == 64 in my case. The iTDs are added to
>>> sched->td_list. For a frame-aligned scheduling we need 8 iTDs, the
>>> 9th one is released back to the front of the streams free_list in
>>> iso_sched_free(). This iTD was cleared after allocation and has a
>>> frame number of 0 now. So for each allocation when now_frame == 0 we
>>> allocate from the dma_pool, not from the free_list. The attached
>>> patch invalidates the frame number in each iTD before it is sent to
>>> the scheduler. This fixes the problem without the need to iterate
>>> over a iTD list.
>>>
>>> Signed-off-by: Soeren Moch <smoch@web.de>
>>
>> Wow!  Great work Soeren!  Talk about a long road to a small fix.  Thanks
>> for keeping after it.
> 
> +1
> 
> I hardly understand half of the description above, but that much sounds
> plausible. Is this a bug fix that should get backported to stable kernels?
> 

+ 1
and I can test on my device. Just one comment:
I don't know if -1 is the correct way to init it.

Michael

> 	Arnd
>
Alan Stern March 21, 2013, 9:06 p.m. UTC | #4
On Thu, 21 Mar 2013, Soeren Moch wrote:

> Now I found out what is going on here:
> 
> In itd_urb_transaction() we allocate 9 iTDs for each URB with 
> number_of_packets == 64 in my case. The iTDs are added to 
> sched->td_list. For a frame-aligned scheduling we need 8 iTDs, the 9th 
> one is released back to the front of the streams free_list in 
> iso_sched_free(). This iTD was cleared after allocation and has a frame 
> number of 0 now. So for each allocation when now_frame == 0 we allocate 
> from the dma_pool, not from the free_list.

Okay, that is a problem.  But it shouldn't be such a big problem,
because now_frame should not be equal to 0 very often.

>  The attached patch 
> invalidates the frame number in each iTD before it is sent to the 
> scheduler. This fixes the problem without the need to iterate over a iTD 
> list.

The patch looks okay.  However I would like to understand why the 0 
frame value messes things up so much.

Alan Stern
Alan Stern March 21, 2013, 9:12 p.m. UTC | #5
On Thu, 21 Mar 2013, Alan Stern wrote:

> On Thu, 21 Mar 2013, Soeren Moch wrote:
> 
> > Now I found out what is going on here:
> > 
> > In itd_urb_transaction() we allocate 9 iTDs for each URB with 
> > number_of_packets == 64 in my case. The iTDs are added to 
> > sched->td_list. For a frame-aligned scheduling we need 8 iTDs, the 9th 
> > one is released back to the front of the streams free_list in 
> > iso_sched_free(). This iTD was cleared after allocation and has a frame 
> > number of 0 now. So for each allocation when now_frame == 0 we allocate 
> > from the dma_pool, not from the free_list.
> 
> Okay, that is a problem.  But it shouldn't be such a big problem,
> because now_frame should not be equal to 0 very often.

Oh, wait, now I get it.  We never reach a steady state, because the
free list never shrinks, but occasionally it does increase when
now_frame is equal to 0.  Even though that doesn't happen very often,
the effects add up.

Very good; tomorrow I will send your patch in.

Alan Stern
Andrew Lunn March 21, 2013, 9:20 p.m. UTC | #6
On Thu, Mar 21, 2013 at 05:12:01PM -0400, Alan Stern wrote:
> On Thu, 21 Mar 2013, Alan Stern wrote:
> 
> > On Thu, 21 Mar 2013, Soeren Moch wrote:
> > 
> > > Now I found out what is going on here:
> > > 
> > > In itd_urb_transaction() we allocate 9 iTDs for each URB with 
> > > number_of_packets == 64 in my case. The iTDs are added to 
> > > sched->td_list. For a frame-aligned scheduling we need 8 iTDs, the 9th 
> > > one is released back to the front of the streams free_list in 
> > > iso_sched_free(). This iTD was cleared after allocation and has a frame 
> > > number of 0 now. So for each allocation when now_frame == 0 we allocate 
> > > from the dma_pool, not from the free_list.
> > 
> > Okay, that is a problem.  But it shouldn't be such a big problem,
> > because now_frame should not be equal to 0 very often.
> 
> Oh, wait, now I get it.  We never reach a steady state, because the
> free list never shrinks, but occasionally it does increase when
> now_frame is equal to 0.  Even though that doesn't happen very often,
> the effects add up.
> 
> Very good; tomorrow I will send your patch in.

Hi Alan, Soeren

Could you word the description a bit better. If Alan did not get it
without a bit of thought, few others are going to understand it
without a better explanation.

Thanks
	Andrew
Soeren Moch March 21, 2013, 9:45 p.m. UTC | #7
On 03/21/13 22:12, Alan Stern wrote:
> On Thu, 21 Mar 2013, Alan Stern wrote:
>
>> On Thu, 21 Mar 2013, Soeren Moch wrote:
>>
>>> Now I found out what is going on here:
>>>
>>> In itd_urb_transaction() we allocate 9 iTDs for each URB with
>>> number_of_packets == 64 in my case. The iTDs are added to
>>> sched->td_list. For a frame-aligned scheduling we need 8 iTDs, the 9th
>>> one is released back to the front of the streams free_list in
>>> iso_sched_free(). This iTD was cleared after allocation and has a frame
>>> number of 0 now. So for each allocation when now_frame == 0 we allocate
>>> from the dma_pool, not from the free_list.
>>
>> Okay, that is a problem.  But it shouldn't be such a big problem,
>> because now_frame should not be equal to 0 very often.
>
> Oh, wait, now I get it.  We never reach a steady state, because the
> free list never shrinks, but occasionally it does increase when
> now_frame is equal to 0.  Even though that doesn't happen very often,
> the effects add up.

Correct. Approximately twice a second we allocate 9 iTDs from the 
dma_pool in my case (the frame numbers run from 0 to 511 - not up to 
2047, which I thought they should). But since we allocate iTDs every 8 
frames in my case (for each active stream), the pool allocations occur 
only when we hit the allocation sequence frame 0,8,16,... (not one of 
the other 7 opportunities frame 1,9,17,... - frame 7,15,23,...)

> Very good; tomorrow I will send your patch in.
>
> Alan Stern
>
Thank you. And also thanks to all who helped to track down this issue.

  Soeren
Soeren Moch March 21, 2013, 9:52 p.m. UTC | #8
On 21.03.2013 20:10, Arnd Bergmann wrote:
> On Thursday 21 March 2013, Jason Cooper wrote:
>> On Thu, Mar 21, 2013 at 06:04:59PM +0100, Soeren Moch wrote:
>
>>>
>>> Now I found out what is going on here:
>>>
>>> In itd_urb_transaction() we allocate 9 iTDs for each URB with
>>> number_of_packets == 64 in my case. The iTDs are added to
>>> sched->td_list. For a frame-aligned scheduling we need 8 iTDs, the
>>> 9th one is released back to the front of the streams free_list in
>>> iso_sched_free(). This iTD was cleared after allocation and has a
>>> frame number of 0 now. So for each allocation when now_frame == 0 we
>>> allocate from the dma_pool, not from the free_list. The attached
>>> patch invalidates the frame number in each iTD before it is sent to
>>> the scheduler. This fixes the problem without the need to iterate
>>> over a iTD list.
>>>
>>> Signed-off-by: Soeren Moch <smoch@web.de>
>>
>> Wow!  Great work Soeren!  Talk about a long road to a small fix.  Thanks
>> for keeping after it.
>
> +1
>
> I hardly understand half of the description above, but that much sounds
> plausible. Is this a bug fix that should get backported to stable kernels?
>
> 	Arnd
>
I think this patch should go to stable kernels, too. But I'm not an 
expert here...

   Soeren
Soeren Moch March 21, 2013, 10:16 p.m. UTC | #9
On 21.03.2013 22:20, Andrew Lunn wrote:
> On Thu, Mar 21, 2013 at 05:12:01PM -0400, Alan Stern wrote:
>> On Thu, 21 Mar 2013, Alan Stern wrote:
>>
>>> On Thu, 21 Mar 2013, Soeren Moch wrote:
>>>
>>>> Now I found out what is going on here:
>>>>
>>>> In itd_urb_transaction() we allocate 9 iTDs for each URB with
>>>> number_of_packets == 64 in my case. The iTDs are added to
>>>> sched->td_list. For a frame-aligned scheduling we need 8 iTDs, the 9th
>>>> one is released back to the front of the streams free_list in
>>>> iso_sched_free(). This iTD was cleared after allocation and has a frame
>>>> number of 0 now. So for each allocation when now_frame == 0 we allocate
>>>> from the dma_pool, not from the free_list.
>>>
>>> Okay, that is a problem.  But it shouldn't be such a big problem,
>>> because now_frame should not be equal to 0 very often.
>>
>> Oh, wait, now I get it.  We never reach a steady state, because the
>> free list never shrinks, but occasionally it does increase when
>> now_frame is equal to 0.  Even though that doesn't happen very often,
>> the effects add up.
>>
>> Very good; tomorrow I will send your patch in.
>
> Hi Alan, Soeren
>
> Could you word the description a bit better. If Alan did not get it
> without a bit of thought, few others are going to understand it
> without a better explanation.
>
> Thanks
> 	Andrew
>

Alan,

can you come up with a better explanation, please? I think your 
description how it is supposed to work from here
    http://marc.info/?l=linux-usb&m=136345559432055&w=2
is required to understand the problem and the fix.

Thanks,
   Soeren
Alan Stern March 22, 2013, 2:24 p.m. UTC | #10
On Thu, 21 Mar 2013, Soeren Moch wrote:

> > Hi Alan, Soeren
> >
> > Could you word the description a bit better. If Alan did not get it
> > without a bit of thought, few others are going to understand it
> > without a better explanation.
> >
> > Thanks
> > 	Andrew
> >
> 
> Alan,
> 
> can you come up with a better explanation, please? I think your 
> description how it is supposed to work from here
>     http://marc.info/?l=linux-usb&m=136345559432055&w=2
> is required to understand the problem and the fix.

Okay, I will rewrite your patch description.

Alan Stern
diff mbox

Patch

--- linux-3.9.0-rc3-guru/drivers/usb/host/ehci-sched.c.orig	2013-03-21 17:36:21.000000000 +0100
+++ linux-3.9.0-rc3-guru/drivers/usb/host/ehci-sched.c	2013-03-21 17:38:56.000000000 +0100
@@ -1214,6 +1214,7 @@  itd_urb_transaction (
 
 		memset (itd, 0, sizeof *itd);
 		itd->itd_dma = itd_dma;
+		itd->frame = -1;
 		list_add (&itd->itd_list, &sched->td_list);
 	}
 	spin_unlock_irqrestore (&ehci->lock, flags);
@@ -1915,6 +1916,7 @@  sitd_urb_transaction (
 
 		memset (sitd, 0, sizeof *sitd);
 		sitd->sitd_dma = sitd_dma;
+		sitd->frame = -1;
 		list_add (&sitd->sitd_list, &iso_sched->td_list);
 	}