From patchwork Tue Apr 25 18:11:09 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stefan Wahren X-Patchwork-Id: 9698899 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 5610D6020A for ; Tue, 25 Apr 2017 18:12:00 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 4420428459 for ; Tue, 25 Apr 2017 18:12:00 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 37D612862A; Tue, 25 Apr 2017 18:12:00 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-1.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID autolearn=ham version=3.3.1 Received: from bombadil.infradead.org (bombadil.infradead.org [65.50.211.133]) (using TLSv1.2 with cipher AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 662A628459 for ; Tue, 25 Apr 2017 18:11:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20170209; h=Sender: Content-Transfer-Encoding:Content-Type:Cc:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:Subject:References: In-Reply-To:Message-ID:To:From:Date:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=5p7pjNhpaKFoDjlzKCBC1hdvAoYZflCW8Q0JfYU7sPI=; b=JzuP4wbtf0FKr5 t8CLacmPxM7pQa4AchGVzC3qOiAWAXuxsRLvW7XKToV5mDXspqpEQ0ONweytFBQ/3F+vlrSRsrDv7 fov+PrMtw+xO/DEcR3/uOdRWe+eD6OWrXpszRL5nC85uKTzIuZOp8Kry5Ak2/xKeuZtWT6aFGspim X4xjOMYvJscdtuf0YJR1aIUiIzY9qt9fc+G5fUnlCrn1dvsoSEIxFmR8/cRT294HYuZMNw88FcEaY ny/gE/8K6qoMbawGHmIqSdHGohMfnyug4tDkvUOBfSoLhbYX74kbGWAOcQ66a0C2gzm9tQWuZnOWT bc6QrpDQ1RqeALYnxhXA==; Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.87 #1 (Red Hat Linux)) id 1d34wQ-0000RL-TV; Tue, 25 Apr 2017 18:11:58 +0000 Received: from mout.kundenserver.de ([217.72.192.73]) by bombadil.infradead.org with esmtps (Exim 4.87 #1 (Red Hat Linux)) id 1d34wM-0000QP-D3; Tue, 25 Apr 2017 18:11:57 +0000 Received: from oxbaltgw38.schlund.de ([172.19.246.46]) by mrelayeu.kundenserver.de (mreue104 [212.227.15.183]) with ESMTPSA (Nemesis) id 0M8zxV-1cqka20CWz-00CPLC; Tue, 25 Apr 2017 20:11:11 +0200 Date: Tue, 25 Apr 2017 20:11:09 +0200 (CEST) From: Stefan Wahren To: Eric Anholt , Doug Anderson Message-ID: <2127594073.298820.1493143869792@email.1und1.de> In-Reply-To: <1998517910.54108.1492894253010@email.1und1.de> References: <1795308650.27171.9a53158f-312d-40ce-80ce-8bf792d8db34.open-xchange@email.1und1.de> <172093673.40121.1492427140661@email.1und1.de> <79b9b35b-0600-771f-4cd2-9e03c5ba3a25@i2se.com> <186569458.91967.1492547106553@email.1und1.de> <212870399.174480.1492633502649@email.1und1.de> <87mvbaykn1.fsf@eliezer.anholt.net> <1998517910.54108.1492894253010@email.1und1.de> Subject: Re: usb: dwc2: NMI watchdog: BUG: soft lockup - CPU#0 stuck for 146s MIME-Version: 1.0 X-Priority: 3 Importance: Medium X-Mailer: Open-Xchange Mailer v7.8.1-Rev34 X-Originating-Client: open-xchange-appsuite X-Provags-ID: V03:K0:dpCrEjalI8espJerhDZceXCqm7HYj1LWyTdh13SDbsvFzwfkqS5 rBFA9h2HltxMOzNQLkm1T8g8IMGF1408QBVE9oOFRkjellFMhSlgzsPGx9K23xgzHjzyiTV 96JSAoo0Sc6k3D2Bj8ekA4bEs2n6iXCb09FTueTmd3HWHfj6QluNItdfQf6CetRoJ+Iqu7Q f5eVnVo57pCtdORllMj2A== X-UI-Out-Filterresults: notjunk:1; V01:K0:mVwMdYZIQm4=:VOJjIp+dp/ohJtn5Id4zMq cv2WnfYZkninXXNxYf29NpiiT6ZGPWybEElHVzF9lWc9kPkgVUNDvjPolHWlWfa7jsNtih7yB /nOPYmrYUOzWHK4KM182eC0yCbL5ApOknFTjacn2CeSg15b+Ro2W1o8OogMHAXXjACASGMjz2 IAEvKhrdraMAdmcRF/7p41wrS2MPF3GNEvj0Vnof7zc3JzfL8I4e5AMav4ML2qFQYw4bjmnPd pVy5BULyRr7GSHtA9BA/o9wAWjenNG7TxZtHfDKE24qhuTK3oUTqFXJOoHv5C6AxMr7JheQGl EcnFEmy4PNIdCXcrtlqUwlC5H1NzUejEOUXtumerdo8qVEgNXSguiyAlswTNu2eEMU2IHhX0N c6WjDEfcuKq7XDveg0SyetoMK3qnW4uo1suoJZKd/k6vkG5lZogw5Y1kVqOEyJ8CAciM6rGg5 vND52LlY0DEW91wc3BXiA4gabHbTvKwptJV1m7U+EI0FRfld/U/805qgP79tlD2xPndi8cxhI 4n3i6gVtO/YiUFjN8X/3dtTOp/g1QXpTo1+N0KEGFHUR7Ed5o+dlquVAHcd5NvS1ZA2cPHvCV 19ac1ggAEL1BB5XkZXQm/UMpYceQmF4PgkGf/l+Lc590bXnRORQmQX+fNUaNN0CWgNTeihSYK yje0Tfpi4iq22+0WObTMeKTMYyZyC8lEMSsAjkNTwVs6uZ1mQ1o44f7dGhVpsj4jsp/DK9hCm kS3FUaaqZdjZvFad X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20170425_111154_789799_D5225932 X-CRM114-Status: GOOD ( 14.87 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: =?UTF-8?Q?Heiko_St=C3=BCbner?= , linux-usb@vger.kernel.org, Johan Hovold , John Youn , John Stultz , linux-rpi-kernel@lists.infradead.org, linux-arm-kernel@lists.infradead.org, Heiner Kallweit Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org X-Virus-Scanned: ClamAV using ClamSMTP > Stefan Wahren hat am 22. April 2017 um 22:50 geschrieben: > > > Hi, > > > Eric Anholt hat am 20. April 2017 um 20:54 geschrieben: > > > > > > Stefan Wahren writes: > > > > > Hi, > > > > > >> Doug Anderson hat am 18. April 2017 um 22:41 geschrieben: > > >> > > >> > > >> It's hard to know for sure that all of this time is really in > > >> urb_enqueue(). Possible we could have task switched out and been > > >> blocked elsewhere. Using ftrace to get more fine-grained timings > > >> would be useful. ktime_get(), ktime_sub(), and ktime_to_us() are your > > >> friends here if you want to use trace_printk. > > > > > > i'm a newbie to ftrace, so i hope this would be helpful. > > > > > > # connect PL2303 to the onboard hub > > > # echo 0 > options/sleep-time > > > # echo 0 > function_profile_enabled > > > # echo 1 > function_profile_enabled > > > # ./usb_test > > > # Waiting for at least 20 seconds and then disconnect PL2303 > > > # echo 0 > function_profile_enabled > > > # cat trace_stat/function0 > > > > > > Function Hit Time Avg s^2 > > > -------- --- ---- --- --- > > > bcm2835_handle_irq 361347 219567633 us 607.636 us 1485199 us > > > __handle_domain_irq 1082482 212639551 us 196.437 us 3642030 us > > > generic_handle_irq 1082482 100592051 us 92.927 us 50511334 us > > > irq_exit 1082482 98197771 us 90.715 us 29649040 us > > > handle_level_irq 1082482 95812379 us 88.511 us 51910093 us > > > > If I'm reading this output right, we're spending half of our interrupt > > processing time in irq_exit(), so even if dwc2's interrupt was free (the > > generic_handle_irq() chain), we'd be eating about half the CPU getting > > back out of the interrupt handler, right? > > > > I don't really know anything about DWC2 or USB, but is there any way we > > could mitigate the interrupt frequency with this hardware? If nothing > > else, could we loop reading gintsts until it reads back 0? > > first of all i updated my kernel to 4.11rc7 and the issue still occures. Today i had some time to investigate this issue further and i made some interesting observations: > > 1. The lockup doesn't occure always after starting usb_test. In rare cases i was able to run the program without lockup. > 2. In case the lockup occured we are always able to "rescue" the BCM2835 from this state by sending some serial data to the PL2303. Based on this scenario i patched the interrupt routine to detect the interrupt storm and normal condition. So i can dump the global and host registers in both situations (bad and goodcase). Here is the diff between both register dumps, maybe someone see something interesting: --- badcase.txt 2017-04-25 18:02:59.000000000 +0000 +++ goodcase.txt 2017-04-25 18:02:59.000000000 +0000 @@ -4,12 +4,12 @@ dwc2_dump_global_registers: GAHBCFG @0xCC850008 : 0x00000031 dwc2_dump_global_registers: GUSBCFG @0xCC85000C : 0x20001700 dwc2_dump_global_registers: GRSTCTL @0xCC850010 : 0x80000000 - dwc2_dump_global_registers: GINTSTS @0xCC850014 : 0x06000021 + dwc2_dump_global_registers: GINTSTS @0xCC850014 : 0x06000029 dwc2_dump_global_registers: GINTMSK @0xCC850018 : 0xF300080E - dwc2_dump_global_registers: GRXSTSR @0xCC85001C : 0x00070044 + dwc2_dump_global_registers: GRXSTSR @0xCC85001C : 0x000F0001 dwc2_dump_global_registers: GRXFSIZ @0xCC850024 : 0x00000306 dwc2_dump_global_registers: GNPTXFSIZ @0xCC850028 : 0x01000306 - dwc2_dump_global_registers: GNPTXSTS @0xCC85002C : 0x1D080100 + dwc2_dump_global_registers: GNPTXSTS @0xCC85002C : 0x18080100 dwc2_dump_global_registers: GI2CCTL @0xCC850030 : 0x00000000 dwc2_dump_global_registers: GPVNDCTL @0xCC850034 : 0x00000000 dwc2_dump_global_registers: GGPIO @0xCC850038 : 0x00000000 @@ -27,64 +27,64 @@ dwc2_dump_host_registers: Host Global Registers dwc2_dump_host_registers: HCFG @0xCC850400 : 0x00000000 dwc2_dump_host_registers: HFIR @0xCC850404 : 0x00001D4B - dwc2_dump_host_registers: HFNUM @0xCC850408 : 0x0BF03CA1 - dwc2_dump_host_registers: HPTXSTS @0xCC850410 : 0x29080200 - dwc2_dump_host_registers: HAINT @0xCC850414 : 0x00000002 - dwc2_dump_host_registers: HAINTMSK @0xCC850418 : 0x00000002 + dwc2_dump_host_registers: HFNUM @0xCC850408 : 0x029F3421 + dwc2_dump_host_registers: HPTXSTS @0xCC850410 : 0x25080200 + dwc2_dump_host_registers: HAINT @0xCC850414 : 0x00000001 + dwc2_dump_host_registers: HAINTMSK @0xCC850418 : 0x00000001 dwc2_dump_host_registers: HPRT0 @0xCC850440 : 0x00001005 dwc2_dump_host_registers: Host Channel 0 Specific Registers - dwc2_dump_host_registers: HCCHAR @0xCC850500 : 0x01189840 - dwc2_dump_host_registers: HCSPLT @0xCC850504 : 0x8001C102 - dwc2_dump_host_registers: HCINT @0xCC850508 : 0x00000000 - dwc2_dump_host_registers: HCINTMSK @0xCC85050C : 0x00000000 - dwc2_dump_host_registers: HCTSIZ @0xCC850510 : 0x00080040 - dwc2_dump_host_registers: HCDMA @0xCC850514 : 0x4A07D300 + dwc2_dump_host_registers: HCCHAR @0xCC850500 : 0x013C880A + dwc2_dump_host_registers: HCSPLT @0xCC850504 : 0x8000C102 + dwc2_dump_host_registers: HCINT @0xCC850508 : 0x00000022 + dwc2_dump_host_registers: HCINTMSK @0xCC85050C : 0x00000006 + dwc2_dump_host_registers: HCTSIZ @0xCC850510 : 0x4008000A + dwc2_dump_host_registers: HCDMA @0xCC850514 : 0x4A093840 dwc2_dump_host_registers: Host Channel 1 Specific Registers dwc2_dump_host_registers: HCCHAR @0xCC850520 : 0x01189840 dwc2_dump_host_registers: HCSPLT @0xCC850524 : 0x8000C102 - dwc2_dump_host_registers: HCINT @0xCC850528 : 0x00000022 - dwc2_dump_host_registers: HCINTMSK @0xCC85052C : 0x00000006 + dwc2_dump_host_registers: HCINT @0xCC850528 : 0x00000000 + dwc2_dump_host_registers: HCINTMSK @0xCC85052C : 0x00000000 dwc2_dump_host_registers: HCTSIZ @0xCC850530 : 0x00080040 - dwc2_dump_host_registers: HCDMA @0xCC850534 : 0x4A07D300 + dwc2_dump_host_registers: HCDMA @0xCC850534 : 0x4A07D500 dwc2_dump_host_registers: Host Channel 2 Specific Registers dwc2_dump_host_registers: HCCHAR @0xCC850540 : 0x01189840 dwc2_dump_host_registers: HCSPLT @0xCC850544 : 0x8001C102 dwc2_dump_host_registers: HCINT @0xCC850548 : 0x00000000 dwc2_dump_host_registers: HCINTMSK @0xCC85054C : 0x00000000 dwc2_dump_host_registers: HCTSIZ @0xCC850550 : 0x00080040 - dwc2_dump_host_registers: HCDMA @0xCC850554 : 0x4A07D300 + dwc2_dump_host_registers: HCDMA @0xCC850554 : 0x4A07D500 dwc2_dump_host_registers: Host Channel 3 Specific Registers - dwc2_dump_host_registers: HCCHAR @0xCC850560 : 0x01189840 - dwc2_dump_host_registers: HCSPLT @0xCC850564 : 0x8001C102 + dwc2_dump_host_registers: HCCHAR @0xCC850560 : 0x00D08040 + dwc2_dump_host_registers: HCSPLT @0xCC850564 : 0x00000000 dwc2_dump_host_registers: HCINT @0xCC850568 : 0x00000000 dwc2_dump_host_registers: HCINTMSK @0xCC85056C : 0x00000000 - dwc2_dump_host_registers: HCTSIZ @0xCC850570 : 0x00080040 - dwc2_dump_host_registers: HCDMA @0xCC850574 : 0x4A07D300 + dwc2_dump_host_registers: HCTSIZ @0xCC850570 : 0x8000003C + dwc2_dump_host_registers: HCDMA @0xCC850574 : 0x4A3B7544 dwc2_dump_host_registers: Host Channel 4 Specific Registers - dwc2_dump_host_registers: HCCHAR @0xCC850580 : 0x01189840 - dwc2_dump_host_registers: HCSPLT @0xCC850584 : 0x8001C102 + dwc2_dump_host_registers: HCCHAR @0xCC850580 : 0x00D00040 + dwc2_dump_host_registers: HCSPLT @0xCC850584 : 0x00000000 dwc2_dump_host_registers: HCINT @0xCC850588 : 0x00000000 dwc2_dump_host_registers: HCINTMSK @0xCC85058C : 0x00000000 - dwc2_dump_host_registers: HCTSIZ @0xCC850590 : 0x00080040 - dwc2_dump_host_registers: HCDMA @0xCC850594 : 0x4A07D300 + dwc2_dump_host_registers: HCTSIZ @0xCC850590 : 0x20000008 + dwc2_dump_host_registers: HCDMA @0xCC850594 : 0x4A3B74C8 dwc2_dump_host_registers: Host Channel 5 Specific Registers dwc2_dump_host_registers: HCCHAR @0xCC8505A0 : 0x01189840 dwc2_dump_host_registers: HCSPLT @0xCC8505A4 : 0x8000C102 dwc2_dump_host_registers: HCINT @0xCC8505A8 : 0x00000000 dwc2_dump_host_registers: HCINTMSK @0xCC8505AC : 0x00000000 dwc2_dump_host_registers: HCTSIZ @0xCC8505B0 : 0x00080040 - dwc2_dump_host_registers: HCDMA @0xCC8505B4 : 0x4A07D300 + dwc2_dump_host_registers: HCDMA @0xCC8505B4 : 0x4A07D500 dwc2_dump_host_registers: Host Channel 6 Specific Registers - dwc2_dump_host_registers: HCCHAR @0xCC8505C0 : 0x01189840 - dwc2_dump_host_registers: HCSPLT @0xCC8505C4 : 0x8001C102 + dwc2_dump_host_registers: HCCHAR @0xCC8505C0 : 0x00D00040 + dwc2_dump_host_registers: HCSPLT @0xCC8505C4 : 0x00000000 dwc2_dump_host_registers: HCINT @0xCC8505C8 : 0x00000000 dwc2_dump_host_registers: HCINTMSK @0xCC8505CC : 0x00000000 - dwc2_dump_host_registers: HCTSIZ @0xCC8505D0 : 0x00080040 - dwc2_dump_host_registers: HCDMA @0xCC8505D4 : 0x4A07D300 + dwc2_dump_host_registers: HCTSIZ @0xCC8505D0 : 0x00000000 + dwc2_dump_host_registers: HCDMA @0xCC8505D4 : 0x4B080000 dwc2_dump_host_registers: Host Channel 7 Specific Registers - dwc2_dump_host_registers: HCCHAR @0xCC8505E0 : 0x01189840 + dwc2_dump_host_registers: HCCHAR @0xCC8505E0 : 0x013C880A dwc2_dump_host_registers: HCSPLT @0xCC8505E4 : 0x8001C102 dwc2_dump_host_registers: HCINT @0xCC8505E8 : 0x00000000 dwc2_dump_host_registers: HCINTMSK @0xCC8505EC : 0x00000000 - dwc2_dump_host_registers: HCTSIZ @0xCC8505F0 : 0x00080040 - dwc2_dump_host_registers: HCDMA @0xCC8505F4 : 0x4A07D300 + dwc2_dump_host_registers: HCTSIZ @0xCC8505F0 : 0x4008000A + dwc2_dump_host_registers: HCDMA @0xCC8505F4 : 0x4A093840