diff mbox series

[1/2] usb: cdns3: improve handling of unaligned address case

Message ID 20230518204947.3770236-1-Frank.Li@nxp.com (mailing list archive)
State Accepted
Commit 2a1c4639d6d6bcee27f74e38f83ffb43579c4733
Headers show
Series [1/2] usb: cdns3: improve handling of unaligned address case | expand

Commit Message

Frank Li May 18, 2023, 8:49 p.m. UTC
When the address of a request was not aligned with an 8-byte boundary, the
USB DMA was unable to process it, necessitating the use of an internal
bounce buffer.

In these cases, the request->buf had to be copied to/from this bounce
buffer. However, if this unaligned address scenario arises, it is
unnecessary to perform heavy cache maintenance operations like
usb_gadget_map(unmap)_request_by_dev() on the request->buf, as the DMA
does not utilize it at all. it can be skipped at this case.

iperf3 tests on the rndis case:

Transmit speed (TX): Improved from 299Mbps to 440Mbps
Receive speed (RX): Improved from 290Mbps to 500Mbps

Signed-off-by: Frank Li <Frank.Li@nxp.com>
---
 drivers/usb/cdns3/cdns3-gadget.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

Comments

Peter Chen June 4, 2023, 11:12 p.m. UTC | #1
On 23-05-18 16:49:45, Frank Li wrote:
> When the address of a request was not aligned with an 8-byte boundary, the
> USB DMA was unable to process it, necessitating the use of an internal
> bounce buffer.
> 
> In these cases, the request->buf had to be copied to/from this bounce
> buffer. However, if this unaligned address scenario arises, it is
> unnecessary to perform heavy cache maintenance operations like
> usb_gadget_map(unmap)_request_by_dev() on the request->buf, as the DMA
> does not utilize it at all. it can be skipped at this case.
> 
> iperf3 tests on the rndis case:
> 
> Transmit speed (TX): Improved from 299Mbps to 440Mbps
> Receive speed (RX): Improved from 290Mbps to 500Mbps
> 
> Signed-off-by: Frank Li <Frank.Li@nxp.com>
> ---
>  drivers/usb/cdns3/cdns3-gadget.c | 11 +++++++----
>  1 file changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/usb/cdns3/cdns3-gadget.c b/drivers/usb/cdns3/cdns3-gadget.c
> index 1dcadef933e3..09a0882a4e97 100644
> --- a/drivers/usb/cdns3/cdns3-gadget.c
> +++ b/drivers/usb/cdns3/cdns3-gadget.c
> @@ -800,7 +800,8 @@ void cdns3_gadget_giveback(struct cdns3_endpoint *priv_ep,
>  	if (request->status == -EINPROGRESS)
>  		request->status = status;
>  
> -	usb_gadget_unmap_request_by_dev(priv_dev->sysdev, request,
> +	if (likely(!(priv_req->flags & REQUEST_UNALIGNED)))
> +		usb_gadget_unmap_request_by_dev(priv_dev->sysdev, request,
>  					priv_ep->dir);
>  
>  	if ((priv_req->flags & REQUEST_UNALIGNED) &&
> @@ -2543,10 +2544,12 @@ static int __cdns3_gadget_ep_queue(struct usb_ep *ep,
>  	if (ret < 0)
>  		return ret;
>  
> -	ret = usb_gadget_map_request_by_dev(priv_dev->sysdev, request,
> +	if (likely(!(priv_req->flags & REQUEST_UNALIGNED))) {
> +		ret = usb_gadget_map_request_by_dev(priv_dev->sysdev, request,
>  					    usb_endpoint_dir_in(ep->desc));

So, the possible reason for performance drop is	do cache coherency
operation twice for unaligned buffers?

Peter

> -	if (ret)
> -		return ret;
> +		if (ret)
> +			return ret;
> +	}
>  
>  	list_add_tail(&request->list, &priv_ep->deferred_req_list);
>  
> -- 
> 2.34.1
>
diff mbox series

Patch

diff --git a/drivers/usb/cdns3/cdns3-gadget.c b/drivers/usb/cdns3/cdns3-gadget.c
index 1dcadef933e3..09a0882a4e97 100644
--- a/drivers/usb/cdns3/cdns3-gadget.c
+++ b/drivers/usb/cdns3/cdns3-gadget.c
@@ -800,7 +800,8 @@  void cdns3_gadget_giveback(struct cdns3_endpoint *priv_ep,
 	if (request->status == -EINPROGRESS)
 		request->status = status;
 
-	usb_gadget_unmap_request_by_dev(priv_dev->sysdev, request,
+	if (likely(!(priv_req->flags & REQUEST_UNALIGNED)))
+		usb_gadget_unmap_request_by_dev(priv_dev->sysdev, request,
 					priv_ep->dir);
 
 	if ((priv_req->flags & REQUEST_UNALIGNED) &&
@@ -2543,10 +2544,12 @@  static int __cdns3_gadget_ep_queue(struct usb_ep *ep,
 	if (ret < 0)
 		return ret;
 
-	ret = usb_gadget_map_request_by_dev(priv_dev->sysdev, request,
+	if (likely(!(priv_req->flags & REQUEST_UNALIGNED))) {
+		ret = usb_gadget_map_request_by_dev(priv_dev->sysdev, request,
 					    usb_endpoint_dir_in(ep->desc));
-	if (ret)
-		return ret;
+		if (ret)
+			return ret;
+	}
 
 	list_add_tail(&request->list, &priv_ep->deferred_req_list);