From patchwork Tue Oct 10 08:26:00 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mathias Nyman
X-Patchwork-Id: 9995303
X-Patchwork-Delegate: bhelgaas@google.com
Subject: Re: [bugzilla-daemon@bugzilla.kernel.org: [Bug 197159] New: Xhci host controller not responding starting kernel 4.13]
To: Bjorn Helgaas, Mason
References:
 <20171009170108.GK25517@bhelgaas-glaptop.roam.corp.google.com>
 <50d880c5-f76b-cb15-0faa-af9fa617ea9a@free.fr>
 <20171009233852.GO25517@bhelgaas-glaptop.roam.corp.google.com>
Cc: Niklas, linux-pci, linux-usb, Mathias Nyman, Lukas Wunner,
 Greg Kroah-Hartman, Felipe Balbi, Alan Stern
From: Mathias Nyman
Message-ID: <59DC8418.2050808@linux.intel.com>
Date: Tue, 10 Oct 2017 11:26:00 +0300
In-Reply-To: <20171009233852.GO25517@bhelgaas-glaptop.roam.corp.google.com>
Sender: linux-pci-owner@vger.kernel.org
X-Mailing-List: linux-pci@vger.kernel.org

On 10.10.2017 02:38, Bjorn Helgaas wrote:
> On Mon, Oct 09, 2017 at 10:45:39PM +0200, Mason wrote:
>> On 09/10/2017 19:01, Bjorn Helgaas wrote:
>> ...
>
>>> In that thread, Mason reported a regression that looks similar, but as
>>> far as I can tell, we never identified a root cause.
>>>
>>>   1) The problem Mason reported was on a Tango platform, which has a
>>>      known hardware issue that corrupts data when simultaneous config
>>>      and MMIO accesses occur. You're seeing the problem on a
>>>      different platform, which is very helpful.
>>
>> As mentioned here:
>> https://www.mail-archive.com/linux-usb@vger.kernel.org/msg94020.html
>>
>> When I disable the AER driver, not a single config space access
>> occurs when a USB drive is unplugged. So I'm 99.99% sure that
>> the issue is NOT caused by tango's bad design. (I got the vibe
>> that nobody cared about tango's issue because it was assumed
>> that the design flaw was responsible for it.)
>
> I agree; I don't think this is Tango's fault.
>
> Can you test fe190ed0d602 and d9f11ba9f107 to determine whether
> d9f11ba9f107 is the culprit? If it is the culprit, can you try reverting
> it on a current kernel to see if that fixes it?
>
> If d9f11ba9f107 is not the culprit, can you bisect to discover exactly
> where it broke?
>

If possible, could the bug reporter add the same WARN as Mason did, to see
when xhci reads 0xffffffff, or whether something else triggers xhci_hc_died()?

In the Tango case it was the hub thread clearing a port reset change event.

Mathias

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 82c746e..cd3a420 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -908,6 +908,8 @@ void xhci_hc_died(struct xhci_hcd *xhci)
 {
 	int i, j;
 
+	WARN_ON(1);
+
 	if (xhci->xhc_state & XHCI_STATE_DYING)
 		return;
 

Thanks
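
For anyone following along, here is a rough userspace sketch of why the WARN sits before the XHCI_STATE_DYING check: every caller of xhci_hc_died() gets reported, not only the first one. This is not kernel code. All mock_* names and MOCK_STATE_DYING are made up for illustration, and the all-ones test only reflects the usual behaviour of a read from a device that has stopped responding.

```c
/* Userspace mock, NOT kernel code. All mock_* names are hypothetical. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define MOCK_STATE_DYING 0x1u	/* stands in for XHCI_STATE_DYING */

struct mock_hcd {
	unsigned int state;		/* stands in for xhci->xhc_state */
	unsigned int warn_count;	/* how many times the "WARN" fired */
	unsigned int teardowns;		/* how many times teardown actually ran */
};

/* A read that comes back as all ones usually means the device is gone. */
int mock_read_looks_dead(uint32_t val)
{
	return val == 0xffffffffu;
}

/* Same shape as the patched function: warn first, then the DYING guard. */
void mock_hc_died(struct mock_hcd *hcd, const char *caller)
{
	/* Runs before the guard, so repeated callers are reported too. */
	hcd->warn_count++;
	fprintf(stderr, "WARN: mock_hc_died() called from %s\n", caller);

	if (hcd->state & MOCK_STATE_DYING)
		return;

	hcd->state |= MOCK_STATE_DYING;
	hcd->teardowns++;
}

/* Two paths report a dead controller: a failed register read and another one. */
void mock_demo(struct mock_hcd *hcd)
{
	uint32_t status = 0xffffffffu;	/* simulated failed register read */

	assert(hcd != NULL);
	if (mock_read_looks_dead(status))
		mock_hc_died(hcd, "register read path");
	mock_hc_died(hcd, "some other path");
}
```

Calling mock_demo() on a zeroed struct mock_hcd logs two warnings but runs the teardown once. In the real driver the interesting part is the backtrace that WARN_ON(1) prints, since it shows which path called xhci_hc_died().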