From patchwork Thu May 24 13:35:34 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mathias Nyman X-Patchwork-Id: 10424755 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 4912F6019D for ; Thu, 24 May 2018 13:33:15 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 41668289B4 for ; Thu, 24 May 2018 13:33:15 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 34AB5291AA; Thu, 24 May 2018 13:33:15 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00, MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI,T_TVD_MIME_EPI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7CBD1289B4 for ; Thu, 24 May 2018 13:33:14 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S968746AbeEXNdM (ORCPT ); Thu, 24 May 2018 09:33:12 -0400 Received: from mga03.intel.com ([134.134.136.65]:25774 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S965390AbeEXNdJ (ORCPT ); Thu, 24 May 2018 09:33:09 -0400 X-Amp-Result: UNSCANNABLE X-Amp-File-Uploaded: False Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga103.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 24 May 2018 06:33:08 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.49,436,1520924400"; d="scan'208,223";a="202122036" Received: from mattu-haswell.fi.intel.com (HELO [10.237.72.164]) ([10.237.72.164]) by orsmga004.jf.intel.com with ESMTP; 24 May 2018 06:33:07 -0700 Subject: Re: usb HC busted? To: Sudip Mukherjee Cc: Mathias Nyman , linux-usb@vger.kernel.org, lukaszx.szulc@intel.com References: <20180518100650.kfw6wijpncpvqx7j@debian> <6790b352-add3-5531-115c-15db6c9c744d@intel.com> <20180518130458.v73syr3fltdzdzzi@debian> <881d576b-c7c1-ef74-c6bc-68b81371e7e0@intel.com> <20180523212956.n4ztasdffg2aeaku@debian> From: Mathias Nyman Message-ID: <80eace7a-976d-65a5-a353-54a2b18edd06@linux.intel.com> Date: Thu, 24 May 2018 16:35:34 +0300 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.7.0 MIME-Version: 1.0 In-Reply-To: <20180523212956.n4ztasdffg2aeaku@debian> Content-Language: en-US Sender: linux-usb-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-usb@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Hi On 24.05.2018 00:29, Sudip Mukherjee wrote: > Hi Mathias, > > On Fri, May 18, 2018 at 04:19:02PM +0300, Mathias Nyman wrote: >> On 18.05.2018 16:04, Sudip Mukherjee wrote: >>> Hi Mathias, >>> >>> On Fri, May 18, 2018 at 03:55:04PM +0300, Mathias Nyman wrote: >>>> Hi, >>>> >>>> Looks like event for Transfer block (TRB) at 0x32a21060 was never completed, >>>> or at least not handled by xhci driver. >>>> (either the event was never issued by hw, or something got messed up in the driver along the way) >>>> >>>> HC doesn't look busted, it continues sending transfer completions events. >>>> it is already at event 0x32a211d0, which is 23 TRBS later. (one TRB is 0x10) >>>> >>>> This small log sinppet doesnt' say much about the reasons. >>>> >>>> Can you enable tracing for xhci and send me the output. >>> > We have finally reproduced the error while traces were on. The trace and > the relevant part of the dmesg (when the error starts) are in: > https://drive.google.com/open?id=1PbcYwL1a9ndtHw1MNjE6uVqb0fFX9jV8 > > Will request you to have a look and suggest what might be going wrong here. > Log show two rings having the same TRB segment dma address, this will completely mess up the transfer: While allocating rigs the enque pointers for the two rings are the same: 461.859315: xhci_ring_alloc: ISOC efa4e580: enq 0x0000000033386000(0x0000000033386000) deq 0x0000000033386000(0x0000000033386000) segs 2 stream 0 ...bs 461.859320: xhci_ring_alloc: ISOC f0ce1f00: enq 0x0000000033386000(0x0000000033386000) deq 0x0000000033386000(0x0000000033386000) segs 2 stream 0 ... URBs for ISOC IN transfers are queued on EP3 at enqueue address (33386000 to 33386140) 461.859998: xhci_urb_enqueue: ep3in-isoc: urb f0ec0e00 pipe 4294528 slot 8 length 0/170 sgs 0/0 stream 0 flags 00010302 461.860004: xhci_queue_trb: ISOC: Buffer 000000002b480240 length 17 TD size 0 intr 0 type 'Isoch' flags b:i:I:c:s:I:e:c 461.860006: xhci_inc_enq: ISOC f0ce1f00: enq 0x0000000033386010(0x0000000033386000) deq 0x0000000033386000(0x0000000033386000 Later URBs for ISOC OUT transfers are queued at the same address, this should not happen: 461.901175: xhci_urb_enqueue: ep3out-isoc: urb ecec2600 pipe 100096 slot 8 length 0/51 sgs 0/0 stream 0 flags 00010002 461.901180: xhci_queue_trb: ISOC: Buffer 000000002d9fa805 length 17 TD size 0 intr 0 type 'Isoch' flags b:i:I:c:s:i:e:c 461.901181: xhci_inc_enq: ISOC efa4e580: enq 0x0000000033386010(0x0000000033386000) deq 0x0000000033386000(0x0000000033386000) So something goes really wrong when allocating or setting up the rings in one of these functions: xhci_ring_alloc() xhci_alloc_segments_for_ring() xhci_initialize_ring_info() xhci_segment_alloc() xhci_link_segments() dma_pool_zalloc() To verify and rule out dma_pool_zalloc(), could you apply the attached patch and reproduce with new logs? Thanks -Mathias From 7aee4db28204fddff6cbc1534b8d50f13fd0b141 Mon Sep 17 00:00:00 2001 From: Mathias Nyman Date: Thu, 24 May 2018 15:37:41 +0300 Subject: [PATCH] xhci: testpatch, add custom trace for ring segment alloc for custom debugging only --- drivers/usb/host/xhci-mem.c | 10 ++++++++++ drivers/usb/host/xhci-trace.h | 5 +++++ 2 files changed, 15 insertions(+) diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c index e5ace89..7d343ad 100644 --- a/drivers/usb/host/xhci-mem.c +++ b/drivers/usb/host/xhci-mem.c @@ -44,10 +44,15 @@ static struct xhci_segment *xhci_segment_alloc(struct xhci_hcd *xhci, return NULL; } + xhci_dbg_trace(xhci, trace_xhci_ring_mem_detail, + "MATTU xhci_segment_alloc dma @ %pad", &dma); + if (max_packet) { seg->bounce_buf = kzalloc(max_packet, flags); if (!seg->bounce_buf) { dma_pool_free(xhci->segment_pool, seg->trbs, dma); + xhci_dbg_trace(xhci, trace_xhci_ring_mem_detail, + "MATTU xhci segment free dma @ %pad", &dma); kfree(seg); return NULL; } @@ -58,6 +63,9 @@ static struct xhci_segment *xhci_segment_alloc(struct xhci_hcd *xhci, seg->trbs[i].link.control |= cpu_to_le32(TRB_CYCLE); } seg->dma = dma; + xhci_dbg_trace(xhci, trace_xhci_ring_mem_detail, + "MATTU xhci segment alloc seg->dma @ %pad", &seg->dma); + seg->next = NULL; return seg; @@ -67,6 +75,8 @@ static void xhci_segment_free(struct xhci_hcd *xhci, struct xhci_segment *seg) { if (seg->trbs) { dma_pool_free(xhci->segment_pool, seg->trbs, seg->dma); + xhci_dbg_trace(xhci, trace_xhci_ring_mem_detail, + "MATTU xhci segment free seg->dma @ %p", &seg->dma); seg->trbs = NULL; } kfree(seg->bounce_buf); diff --git a/drivers/usb/host/xhci-trace.h b/drivers/usb/host/xhci-trace.h index 35bdd06..951e371 100644 --- a/drivers/usb/host/xhci-trace.h +++ b/drivers/usb/host/xhci-trace.h @@ -72,6 +72,11 @@ DEFINE_EVENT(xhci_log_msg, xhci_dbg_ring_expansion, TP_ARGS(vaf) ); +DEFINE_EVENT(xhci_log_msg, xhci_ring_mem_detail, + TP_PROTO(struct va_format *vaf), + TP_ARGS(vaf) +); + DECLARE_EVENT_CLASS(xhci_log_ctx, TP_PROTO(struct xhci_hcd *xhci, struct xhci_container_ctx *ctx, unsigned int ep_num), -- 2.7.4