From patchwork Tue Apr 9 05:45:43 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: CF Adad X-Patchwork-Id: 2413051 Return-Path: X-Original-To: patchwork-linux-omap@patchwork.kernel.org Delivered-To: patchwork-process-083081@patchwork2.kernel.org Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by patchwork2.kernel.org (Postfix) with ESMTP id F2770DF25A for ; Tue, 9 Apr 2013 05:45:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1760753Ab3DIFpq (ORCPT ); Tue, 9 Apr 2013 01:45:46 -0400 Received: from nm17.bullet.mail.bf1.yahoo.com ([98.139.212.176]:29407 "HELO nm17.bullet.mail.bf1.yahoo.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with SMTP id S1752931Ab3DIFpp convert rfc822-to-8bit (ORCPT ); Tue, 9 Apr 2013 01:45:45 -0400 Received: from [98.139.212.146] by nm17.bullet.mail.bf1.yahoo.com with NNFMP; 09 Apr 2013 05:45:44 -0000 Received: from [98.139.212.248] by tm3.bullet.mail.bf1.yahoo.com with NNFMP; 09 Apr 2013 05:45:44 -0000 Received: from [127.0.0.1] by omp1057.mail.bf1.yahoo.com with NNFMP; 09 Apr 2013 05:45:44 -0000 X-Yahoo-Newman-Property: ymail-3 X-Yahoo-Newman-Id: 234368.67062.bm@omp1057.mail.bf1.yahoo.com Received: (qmail 5465 invoked by uid 60001); 9 Apr 2013 05:45:44 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rocketmail.com; s=s1024; t=1365486344; bh=jvC5vFI2/oSTMQYpPjuahAKJ9biRzJRJKlI5pppJBpo=; h=X-YMail-OSG:Received:X-Rocket-MIMEInfo:X-Mailer:References:Message-ID:Date:From:Reply-To:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type:Content-Transfer-Encoding; b=VIGQKnTb9ByoI9hJ0KZa/WHFYtF9Z+yB6V9HBKXNAp+hzygb75dZp+llrbbPwRlvRMsnaCNZbKeKjdJVrqjwezxqf643vICFH37KrcrwSm+i1uuKXqrV4ffB5stNEF95W5fWQnv3oHcxpewWXR/QKL+lm/HxPTBV5HcR447KqQ8= DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=rocketmail.com; h=X-YMail-OSG:Received:X-Rocket-MIMEInfo:X-Mailer:References:Message-ID:Date:From:Reply-To:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type:Content-Transfer-Encoding; b=6GvLAgh31OCW8Suqy2a5Lx2ZIWc7YIITl+r0PoJjZNJi4uH77AFvGyBlu6fbwaeELOM0WwhVc25qkjb+89rV6NWljCD3ME4/6rzt2+WPRpG3wxGneOLsi9vEttdaF8eI6STMB9aTzrvN3GV9qVD/Qp7XlKiNu1TReRK/WqaYfBM=; X-YMail-OSG: ..eY4q0VM1lxyN.WW8qK9coZpfKpCLaF941Y5Rpm0Gd.N7V .M7BgqrAZpW4aUdPgAWkx17RtF.B6C_4V2h90VBnQbKVdfaFfRxoiCwdwxyt BSBzke2nQum5jKhKoM.m_ikf9jFKm8BohSddfLel_F8fAW0r8f3AVw4M.gQr rz4eCZT9FfiY8zLqAAIkYcOZJO9hTkL_pXxhtB.Z6eXObDWj_sHV1JbeJP.F AvHtbjtoxCMcdoAgx655_ihlux2bI_PiNt_EpncEOqDCjVJWSy1YDjTlIjcU DSwYDzdav8ol18j0JuDJJJeCzIbpeQuywxbLnUsWFb9WQ6UhybL3yfRcez_G ibvV9vcyZmbNXrN0RD3GxYAh4.xv8JG1JbfOy.Q6juAzN95ezD43MKXw648n ZIh9gpoEphAW3vWI5n..9ibed.jDuimU3sB8DTix__8i.ftiY4QeBYw90Px4 mdbxio7powgH0w_F9rYfnkFalToOZFQPcS.Lp Received: from [108.51.70.26] by web160506.mail.bf1.yahoo.com via HTTP; Mon, 08 Apr 2013 22:45:43 PDT X-Rocket-MIMEInfo: 002.001, TXVndW50aGFuIGFuZCBhbGwsCgpXZSBoYXZlIGVzc2VudGlhbGx5IHByb3ZlbiBvdXIgdGhlb3J5LCBidXQgaGF2ZSB5ZXQgdG8gc29sdmUgdGhlIHByb2JsZW0uwqAgT3VyIHRoZW9yeSB3YXMgdGhhdCB0aGUgZHJpdmVyIHdhcyBlc3NlbnRpYWxseSB0YWtpbmcgYSBmZXcgc2Vjb25kIG5hcCBkdXJpbmcgdHJhbnNtaXNzaW9uLCBkZXNwaXRlIHRoZXJlIGJlaW5nIHBsZW50eSBtb3JlIHdvcmsgdG8gZG8uwqAgV2UgZnVydGhlciB0aGVvcml6ZWQgdGhhdCBvdXIgY3VzdG9tIHVzZXJzcGFjZSBhcHBsaWNhdGkBMAEBAQE- X-Mailer: YahooMailWebService/0.8.140.532 References: <1365042051.84279.YahooMailNeo@web160501.mail.bf1.yahoo.com> <515D3A86.6000504@ti.com> <1365085105.58314.YahooMailNeo@web160505.mail.bf1.yahoo.com> <1365105688.76437.YahooMailNeo@web160502.mail.bf1.yahoo.com> Message-ID: <1365486343.94427.YahooMailNeo@web160506.mail.bf1.yahoo.com> Date: Mon, 8 Apr 2013 22:45:43 -0700 (PDT) From: CF Adad Reply-To: CF Adad Subject: Re: AM3517 DaVinci EMAC Ethernet performance issues To: Mugunthan V N Cc: "linux-omap@vger.kernel.org" , "davem@davemloft.net" In-Reply-To: <1365105688.76437.YahooMailNeo@web160502.mail.bf1.yahoo.com> MIME-Version: 1.0 Sender: linux-omap-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-omap@vger.kernel.org Mugunthan and all, We have essentially proven our theory, but have yet to solve the problem.  Our theory was that the driver was essentially taking a few second nap during transmission, despite there being plenty more work to do.  We further theorized that our custom userspace application was somehow preventing the driver from taking this nap, and that's why our RX performance is perfect all the time while our TX is so spotty unless that application is running. Over the last few days, we've went through several test patches and have essentially tracked the issue back to our emac_poll() function not being called.  By inserting the following patch snippet, and snippets like it, into various functions in the emac driver, we were able to prove that from time to time these functions sleep for several seconds, despite iperf certainly having given them something to do.  (Note: the patch has changed a bit since the output pasted below was captured, as we were looking for more detail, but the core functionality is the same.) =================================================================================================== [[ ---- patch snippet ---- ]] =================================================================================================== It appears from our debugging that somehow the emac_poll() function is not being called despite there being work to do.  As before however, when we start our custom application (which presently does very little from a system calls' perspective), our performance stabilizes greatly.  Please see below. Has anyone else seen anything like this with the EMAC, or have any idea why or how this could be happening?  We initially thought it could just be a failure to properly reschedule the emac_wait(), but after playing with various items there for a while, we're convinced that's not it.  Now we're guessing that perhaps softirqs or something within the DMA world is to blame here, but we're certainly not experts in this area.  Again, we've passed in 'nohlt' and have worked to remove power management.  So we don't think that plays a role.  Likewise, we never see a packet error -- even in our Wireshark traces.  We just see the system take a split-second to multi-second break. What specifically causes the emac_poll() and emac_tx_handler() functions to fire, and is there a way to debug that cleanly? Can anyone imagine how a userspace application would be able to improve this behavior so much? Again, thank you so very much for your time. CF Boot log: =================================================================================================== Fresh boot with no network activity, we see this with our debugging in there. =================================================================================================== [   22.815582] ===EMAC: POLL FUNCTION has not been called for > 1/2 second!!! [   22.822937] ===EMAC: TX FUNCTION has not been called for > 1/2 second!!! [   23.817993] ===EMAC: POLL FUNCTION has not been called for > 1/2 second!!! [   23.825378] ===EMAC: TX FUNCTION has not been called for > 1/2 second!!! [   24.817810] ===EMAC: POLL FUNCTION has not been called for > 1/2 second!!! [   24.825225] ===EMAC: TX FUNCTION has not been called for > 1/2 second!!! [   30.793365] ===EMAC: POLL FUNCTION has not been called for > 1/2 second!!! [   30.800811] ===EMAC: TX FUNCTION has not been called for > 1/2 second!!! [   31.427093] ===EMAC: POLL FUNCTION has not been called for > 1/2 second!!! [   31.434661] ===EMAC: TX FUNCTION has not been called for > 1/2 second!!! [   32.426879] ===EMAC: POLL FUNCTION has not been called for > 1/2 second!!! [   32.434387] ===EMAC: TX FUNCTION has not been called for > 1/2 second!!! root@test:~# iperf-i 5 -t 30000 -c 10.22.255.254 [   43.093078] ===EMAC: POLL FUNCTION has not been called for > 1/2 second!!! [   43.100463] ===EMAC: TX FUNCTION has not been called for > 1/2 second!!! =================================================================================================== Starting iperf with 4x instances of stuff running.  See?  It's very happy for a long time. =================================================================================================== ------------------------------------------------------------ Client connecting to 10.22.255.254, TCP port 5001 TCP window size: 19.6 KByte (default) ------------------------------------------------------------ [  3] local 10.22.0.2 port 44102 connected with 10.22.255.254 port 5001 [ ID] Interval       Transfer     Bandwidth [  3]  0.0- 5.0 sec  56.8 MBytes  95.3 Mbits/sec [  3]  5.0-10.0 sec  56.0 MBytes  94.0 Mbits/sec [[ ---- SNIP ---- ]] [  3] 615.0-620.0 sec  56.4 MBytes  94.6 Mbits/sec [  3] 620.0-625.0 sec  55.3 MBytes  92.9 Mbits/sec =================================================================================================== Stopping all 4 instances from running... ===================================================================================================          Stopping 1... [  OK  ] Stopped 1.          Stopping 2... [  OK  ] Stopped 2.          Stopping 3... [  OK  ] Stopped 3. [  3] 625.0-630.0 sec  55.1 MBytes  92.4 Mbits/sec          Stopping 4... [  OK  ] Stopped 4. =================================================================================================== Instantaneously, we see the POLL (and therefore the TX handler) warnings getting called and the performance is slipping and/or timing out all together. =================================================================================================== [  681.469207] ===EMAC: POLL FUNCTION has not been called for > 1/2 second!!! [  681.476501] ===EMAC: TX FUNCTION has not been called for > 1/2 second!!! [  3] 630.0-635.0 sec  8.59 MBytes  14.4 Mbits/sec [  692.489929] ===EMAC: POLL FUNCTION has not been called for > 1/2 second!!! [  692.497222] ===EMAC: TX FUNCTION has not been called for > 1/2 second!!! [  692.504272] ===EMAC: TX queue has been stopped for > 1/2 second!!! [  3] 635.0-640.0 sec  11.4 MBytes  19.1 Mbits/sec [  3] 640.0-645.0 sec  0.00 Bytes  0.00 bits/sec [  3] 645.0-650.0 sec  5.71 MBytes  9.58 Mbits/sec [  3] 650.0-655.0 sec  29.6 MBytes  49.6 Mbits/sec [  3] 655.0-660.0 sec  46.7 MBytes  78.3 Mbits/sec [  714.523315] ===EMAC: POLL FUNCTION has not been called for > 1/2 second!!! [  714.530609] ===EMAC: TX FUNCTION has not been called for > 1/2 second!!! [  3] 660.0-665.0 sec  42.3 MBytes  71.0 Mbits/sec [  3] 665.0-670.0 sec  0.00 Bytes  0.00 bits/sec [  3] 670.0-675.0 sec  37.5 MBytes  62.9 Mbits/sec [  3] 675.0-680.0 sec  53.1 MBytes  89.1 Mbits/sec [  3] 680.0-685.0 sec  44.8 MBytes  75.2 Mbits/sec [  3] 685.0-690.0 sec  49.2 MBytes  82.6 Mbits/sec [  3] 690.0-695.0 sec  49.0 MBytes  82.2 Mbits/sec [  747.616729] ===EMAC: POLL FUNCTION has not been called for > 1/2 second!!! [  747.624023] ===EMAC: TX FUNCTION has not been called for > 1/2 second!!! [  3] 695.0-700.0 sec  40.2 MBytes  67.4 Mbits/sec [  3] 700.0-705.0 sec  5.38 MBytes  9.03 Mbits/sec [  3] 705.0-710.0 sec  51.8 MBytes  86.9 Mbits/sec          Starting 2... [  OK  ] Started 2. [  3] 710.0-715.0 sec  51.9 MBytes  87.1 Mbits/sec [  3] 715.0-720.0 sec  54.1 MBytes  90.8 Mbits/sec [  3] 720.0-725.0 sec  53.6 MBytes  89.9 Mbits/sec [  3] 725.0-730.0 sec  53.3 MBytes  89.4 Mbits/sec [  3] 730.0-735.0 sec  53.2 MBytes  89.3 Mbits/sec [  3] 735.0-740.0 sec  56.1 MBytes  94.1 Mbits/sec [  3] 740.0-745.0 sec  52.0 MBytes  87.3 Mbits/sec [  3] 745.0-750.0 sec  52.5 MBytes  88.1 Mbits/sec [  3] 750.0-755.0 sec  55.9 MBytes  93.8 Mbits/sec [  3] 755.0-760.0 sec  55.9 MBytes  93.9 Mbits/sec [  3] 760.0-765.0 sec  53.3 MBytes  89.5 Mbits/sec ^C[  3]  0.0-765.9 sec  7.93 GBytes  88.9 Mbits/sec =================================================================================================== Shut down iperf and drop back to the default idling =================================================================================================== root@test:~# [  809.822906] ===EMAC: POLL FUNCTION has not been called for > 1/2 second!!! [  809.960937] ===EMAC: TX FUNCTION has not been called for > 1/2 second!!! [  810.822845] ===EMAC: POLL FUNCTION has not been called for > 1/2 second!!! [  810.960937] ===EMAC: TX FUNCTION has not been called for > 1/2 second!!! [  813.143280] ===EMAC: POLL FUNCTION has not been called for > 1/2 second!!! [  813.355712] ===EMAC: TX FUNCTION has not been called for > 1/2 second!!! --- To unsubscribe from this list: send the line "unsubscribe linux-omap" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html =================================================================================================== [[ ---- patch snippet ---- ]] =================================================================================================== --- linux/drivers/net/ethernet/ti/davinci_emac.c    2013-04-08 16:20:19.202550042 -0400 +++ linux_emac_tst/drivers/net/ethernet/ti/davinci_emac.c    2013-04-08 18:45:37.000000000 -0400 @@ -64,6 +64,12 @@    #include "davinci_cpdma.h"   +//// DEBUGGING ONLY!!!! +#include +static unsigned long qstopped_jifs = 0; +static unsigned long poll_func_jifs = 0; +static u32 last_num_rx_pkts = 0, last_num_tx_pkts = 0; +  static int debug_level;  module_param(debug_level, int, 0);  MODULE_PARM_DESC(debug_level, "DaVinci EMAC debug level (NETIF_MSG bits)"); @@ -1055,7 +1061,19 @@       * queue is stopped then start the queue as we have free desc for tx       */      if (unlikely(netif_queue_stopped(ndev))) +    {          netif_wake_queue(ndev); +        if (qstopped_jifs > 0) +        { +            // we've been stopped, measure how long +            if ( time_after_eq(jiffies, (qstopped_jifs + HZ/2)) ) +            { +                // we've been stopped >= 1/2 second!!! +                printk(KERN_ERR "===EMAC: TX queue has been stopped for > 1/2 second!!!\n"); +            } +        } +        qstopped_jifs = 0; +    }          ndev->stats.tx_packets++;      ndev->stats.tx_bytes += len;      dev_kfree_skb_any(skb); @@ -1105,7 +1123,11 @@       * tell the kernel to stop sending us tx frames.       */      if (unlikely(!cpdma_check_free_tx_desc(priv->txchan))) +    {          netif_stop_queue(ndev); +        // we just stopped, record the time +        qstopped_jifs = jiffies; +    }        return NETDEV_TX_OK;   @@ -1374,6 +1396,15 @@      u32 status = 0;      u32 num_tx_pkts = 0, num_rx_pkts = 0;   + +unsigned long tmp = jiffies; +if ( time_after_eq(tmp, (poll_func_jifs + HZ/2)) ) +{ +    // we've been stopped >= 1/2 second!!! +    printk(KERN_ERR "===EMAC: POLL FUNCTION, last RX/TX=[%d][%d]\n", last_num_rx_pkts, last_num_tx_pkts); +} +poll_func_jifs = tmp; +      /* Check interrupt vectors and call packet processing */      status = emac_read(EMAC_MACINVECTOR);   @@ -1385,6 +1416,10 @@      if (status & mask) {          num_tx_pkts = cpdma_chan_process(priv->txchan,                            EMAC_DEF_TX_MAX_SERVICE); +        if (num_tx_pkts < 0) +        { +printk(KERN_ERR "=== ERR!!! num_tx_pkts=%d\n", num_tx_pkts);         +        }      } /* TX processing */        mask = EMAC_DM644X_MAC_IN_VECTOR_RX_INT_VEC; @@ -1394,8 +1429,17 @@        if (status & mask) {          num_rx_pkts = cpdma_chan_process(priv->rxchan, budget); +        if (num_rx_pkts < 0) +        { +printk(KERN_ERR "=== ERR!!! num_rx_pkts=%d\n", num_rx_pkts); +        }      } /* RX processing */   + +last_num_tx_pkts = num_tx_pkts; +last_num_rx_pkts = num_rx_pkts; + +      mask = EMAC_DM644X_MAC_IN_VECTOR_HOST_INT;      if (priv->version == EMAC_VERSION_2)          mask = EMAC_DM646X_MAC_IN_VECTOR_HOST_INT;