Message ID | 20200122222645.38805-3-john.stultz@linaro.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Avoiding DWC3 transfer stalls/hangs when using adb over f_fs | expand |
Hi, John Stultz <john.stultz@linaro.org> writes: > From: Anurag Kumar Vulisha <anurag.kumar.vulisha@xilinx.com> > > As a process of preparing TRBs usb_gadget_map_request_by_dev() is > called from dwc3_prepare_trbs() for mapping the request. This will > call dma_map_sg() if req->num_sgs are greater than 0. dma_map_sg() > will map the sg entries in sglist and return the number of mapped SGs. > As a part of mapping, some sg entries having contigous memory may be > merged together into a single sg (when IOMMU used). So, the number of > mapped sg entries may not be equal to the number of orginal sg entries > in the request (req->num_sgs). > > As a part of preparing the TRBs, dwc3_prepare_one_trb_sg() iterates over > the sg entries present in the sglist and calls sg_is_last() to identify > whether the sg entry is last and set IOC bit for the last sg entry. The > sg_is_last() determines last sg if SG_END is set in sg->page_link. When > IOMMU used, dma_map_sg() merges 2 or more sgs into a single sg and it > doesn't retain the page_link properties. Because of this reason the > sg_is_last() may not find SG_END and thus resulting in IOC bit never > getting set. > > For example: > > Consider a request having 8 sg entries with each entry having a length of > 4096 bytes. Assume that sg1 & sg2, sg3 & sg4, sg5 & sg6, sg7 & sg8 are > having contigous memory regions. > > Before calling dma_map_sg(): > sg1-->sg2-->sg3-->sg4-->sg6-->sg7-->sg8 > dma_length: 4K 4K 4K 4K 4K 4K 4K > SG_END: False False False False False False True > num_sgs = 8 > num_mapped_sgs = 0 > > The dma_map_sg() merges sg1 & sg2 memory regions into sg1->dma_address. > Similarly sg3 & sg4 into sg2->dma_address, sg5 & sg6 into the > sg3->dma_address and sg6 & sg8 into sg4->dma_address. Here the memory > regions are merged but the page_link properties like SG_END are not > retained into the merged sgs. > > After calling dma_map_sg(); > sg1-->sg2-->sg3-->sg4-->sg6-->sg7-->sg8 > dma_length: 8K 8K 8K 8K 0K 0K 0K > SG_END: False False False False False False True > num_sgs = 8 > num_mapped_sgs = 4 > > After calling dma_map_sg(), sg1,sg2,sg3,sg4 are having dma_length of > 8096 bytes each and remaining sg4,sg5,sg6,sg7 are having 0 bytes of > dma_length. > > After dma_map_sg() is performed dma_perpare_trb_sg() iterates on all sg > entries and sets IOC bit only for the sg8 (since sg_is_last() returns true > only for sg8). But after calling dma_map_sg() the valid data are present > only till sg4 and the IOC bit should be set for sg4 TRB only (which is not > happening in the present code) > > The above mentioned issue can be fixed by determining last sg based on the > req->num_queued_sgs instead of sg_is_last(). If (req->num_queued_sgs + 1) > is equal to req->num_mapped_sgs, then this sg is the last sg. In the above > example, the dwc3 driver has already queued 3 sgs (upto sg3), so the > num_queued_sgs = 3. On preparing the next sg (i.e sg4), check for last sg > (num_queued_sgs + 1) == num_mapped_sgs becomes true. So, the driver sets > IOC bit for sg4. This patch does the same. > > At a practical level, this patch resolves USB transfer stalls > seen with adb on dwc3 based db845c, pixel3 and other qcom > hardware after functionfs gadget added scatter-gather support > around v4.20. > > Cc: Felipe Balbi <felipe.balbi@linux.intel.com> > Cc: Yang Fei <fei.yang@intel.com> > Cc: Thinh Nguyen <thinhn@synopsys.com> > Cc: Tejas Joglekar <tejas.joglekar@synopsys.com> > Cc: Andrzej Pietrasiewicz <andrzej.p@collabora.com> > Cc: Jack Pham <jackp@codeaurora.org> > Cc: Todd Kjos <tkjos@google.com> > Cc: Greg KH <gregkh@linuxfoundation.org> > Cc: Linux USB List <linux-usb@vger.kernel.org> > Cc: stable <stable@vger.kernel.org> > Signed-off-by: Anurag Kumar Vulisha <anurag.kumar.vulisha@xilinx.com> > [jstultz: Add note to end of commit message on specific issue this resovles] > Signed-off-by: John Stultz <john.stultz@linaro.org> > --- > drivers/usb/dwc3/gadget.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c > index 1edce3bbb55c..30a80bc97cfe 100644 > --- a/drivers/usb/dwc3/gadget.c > +++ b/drivers/usb/dwc3/gadget.c > @@ -1068,7 +1068,7 @@ static void dwc3_prepare_one_trb_sg(struct dwc3_ep *dep, > unsigned int rem = length % maxp; > unsigned chain = true; > > - if (sg_is_last(s)) > + if ((req->num_queued_sgs + 1) == req->request.num_mapped_sgs) This is probably a bug on DMA API. If it combines pages from scatter-list, then it should also move the last SG so sg_is_last() continues to work. I had asked author to discuss this with DMA API maintainers. Can you do that?
Hi Felipe, >-----Original Message----- >From: Felipe Balbi <balbif@gmail.com> On Behalf Of Felipe Balbi >Sent: Thursday, January 23, 2020 12:56 PM >To: John Stultz <john.stultz@linaro.org>; lkml <linux-kernel@vger.kernel.org> >Cc: Anurag Kumar Vulisha <anuragku@xilinx.com>; Felipe Balbi ><felipe.balbi@linux.intel.com>; Yang Fei <fei.yang@intel.com>; Thinh Nguyen ><thinhn@synopsys.com>; Tejas Joglekar <tejas.joglekar@synopsys.com>; >Andrzej Pietrasiewicz <andrzej.p@collabora.com>; Jack Pham ><jackp@codeaurora.org>; Todd Kjos <tkjos@google.com>; Greg KH ><gregkh@linuxfoundation.org>; Linux USB List <linux-usb@vger.kernel.org>; >stable <stable@vger.kernel.org>; John Stultz <john.stultz@linaro.org> >Subject: Re: [RFC][PATCH 2/2] usb: dwc3: gadget: Correct the logic for finding >last SG entry > > >Hi, > >John Stultz <john.stultz@linaro.org> writes: >> From: Anurag Kumar Vulisha <anurag.kumar.vulisha@xilinx.com> >> >> As a process of preparing TRBs usb_gadget_map_request_by_dev() is >> called from dwc3_prepare_trbs() for mapping the request. This will >> call dma_map_sg() if req->num_sgs are greater than 0. dma_map_sg() >> will map the sg entries in sglist and return the number of mapped SGs. >> As a part of mapping, some sg entries having contigous memory may be >> merged together into a single sg (when IOMMU used). So, the number of >> mapped sg entries may not be equal to the number of orginal sg entries >> in the request (req->num_sgs). >> >> As a part of preparing the TRBs, dwc3_prepare_one_trb_sg() iterates >> over the sg entries present in the sglist and calls sg_is_last() to >> identify whether the sg entry is last and set IOC bit for the last sg >> entry. The >> sg_is_last() determines last sg if SG_END is set in sg->page_link. >> When IOMMU used, dma_map_sg() merges 2 or more sgs into a single sg >> and it doesn't retain the page_link properties. Because of this reason >> the >> sg_is_last() may not find SG_END and thus resulting in IOC bit never >> getting set. >> >> For example: >> >> Consider a request having 8 sg entries with each entry having a length >> of >> 4096 bytes. Assume that sg1 & sg2, sg3 & sg4, sg5 & sg6, sg7 & sg8 are >> having contigous memory regions. >> >> Before calling dma_map_sg(): >> sg1-->sg2-->sg3-->sg4-->sg6-->sg7-->sg8 >> dma_length: 4K 4K 4K 4K 4K 4K 4K >> SG_END: False False False False False False True >> num_sgs = 8 >> num_mapped_sgs = 0 >> >> The dma_map_sg() merges sg1 & sg2 memory regions into sg1->dma_address. >> Similarly sg3 & sg4 into sg2->dma_address, sg5 & sg6 into the >> sg3->dma_address and sg6 & sg8 into sg4->dma_address. Here the memory >> regions are merged but the page_link properties like SG_END are not >> retained into the merged sgs. >> >> After calling dma_map_sg(); >> sg1-->sg2-->sg3-->sg4-->sg6-->sg7-->sg8 >> dma_length: 8K 8K 8K 8K 0K 0K 0K >> SG_END: False False False False False False True >> num_sgs = 8 >> num_mapped_sgs = 4 >> >> After calling dma_map_sg(), sg1,sg2,sg3,sg4 are having dma_length of >> 8096 bytes each and remaining sg4,sg5,sg6,sg7 are having 0 bytes of >> dma_length. >> >> After dma_map_sg() is performed dma_perpare_trb_sg() iterates on all >> sg entries and sets IOC bit only for the sg8 (since sg_is_last() >> returns true only for sg8). But after calling dma_map_sg() the valid >> data are present only till sg4 and the IOC bit should be set for sg4 >> TRB only (which is not happening in the present code) >> >> The above mentioned issue can be fixed by determining last sg based on >> the >> req->num_queued_sgs instead of sg_is_last(). If (req->num_queued_sgs + >> req->1) >> is equal to req->num_mapped_sgs, then this sg is the last sg. In the >> above example, the dwc3 driver has already queued 3 sgs (upto sg3), so >> the num_queued_sgs = 3. On preparing the next sg (i.e sg4), check for >> last sg (num_queued_sgs + 1) == num_mapped_sgs becomes true. So, the >> driver sets IOC bit for sg4. This patch does the same. >> >> At a practical level, this patch resolves USB transfer stalls seen >> with adb on dwc3 based db845c, pixel3 and other qcom hardware after >> functionfs gadget added scatter-gather support around v4.20. >> >> Cc: Felipe Balbi <felipe.balbi@linux.intel.com> >> Cc: Yang Fei <fei.yang@intel.com> >> Cc: Thinh Nguyen <thinhn@synopsys.com> >> Cc: Tejas Joglekar <tejas.joglekar@synopsys.com> >> Cc: Andrzej Pietrasiewicz <andrzej.p@collabora.com> >> Cc: Jack Pham <jackp@codeaurora.org> >> Cc: Todd Kjos <tkjos@google.com> >> Cc: Greg KH <gregkh@linuxfoundation.org> >> Cc: Linux USB List <linux-usb@vger.kernel.org> >> Cc: stable <stable@vger.kernel.org> >> Signed-off-by: Anurag Kumar Vulisha <anurag.kumar.vulisha@xilinx.com> >> [jstultz: Add note to end of commit message on specific issue this >> resovles] >> Signed-off-by: John Stultz <john.stultz@linaro.org> >> --- >> drivers/usb/dwc3/gadget.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c >> index 1edce3bbb55c..30a80bc97cfe 100644 >> --- a/drivers/usb/dwc3/gadget.c >> +++ b/drivers/usb/dwc3/gadget.c >> @@ -1068,7 +1068,7 @@ static void dwc3_prepare_one_trb_sg(struct >dwc3_ep *dep, >> unsigned int rem = length % maxp; >> unsigned chain = true; >> >> - if (sg_is_last(s)) >> + if ((req->num_queued_sgs + 1) == req- >>request.num_mapped_sgs) > >This is probably a bug on DMA API. If it combines pages from scatter-list, then it >should also move the last SG so sg_is_last() continues to work. > >I had asked author to discuss this with DMA API maintainers. Can you do that? > I was stuck with other tasks, so couldn't discuss with DMA maintainers on this. I am sorry for that. Hi John, Thanks for bringing this patch back . Please let me know if I can help you with anything on this. If you want, I am ready to start working on this. Thanks, Anurag Kumar Vulisha
diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c index 1edce3bbb55c..30a80bc97cfe 100644 --- a/drivers/usb/dwc3/gadget.c +++ b/drivers/usb/dwc3/gadget.c @@ -1068,7 +1068,7 @@ static void dwc3_prepare_one_trb_sg(struct dwc3_ep *dep, unsigned int rem = length % maxp; unsigned chain = true; - if (sg_is_last(s)) + if ((req->num_queued_sgs + 1) == req->request.num_mapped_sgs) chain = false; if (rem && usb_endpoint_dir_out(dep->endpoint.desc) && !chain) {