From patchwork Tue Dec 19 22:05:34 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Jon Hunter
X-Patchwork-Id: 10124647
Subject: Re: [PATCH 1/2] drm/nouveau/bar/gf100: fix hang when calling ->fini() before ->init()
From: Jon Hunter
To: Guillaume Tucker, Ben Skeggs
References: <6db622ac-319d-c640-91ab-9248e528b69b@nvidia.com>
Message-ID: <6138a5ee-74e0-350a-ffc0-239bf3275914@nvidia.com>
Date: Tue, 19 Dec 2017 22:05:34 +0000
Cc: David Airlie, linux-tegra@vger.kernel.org, linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org, linux-arm-kernel@lists.infradead.org

On 06/12/17 17:18, Jon Hunter wrote:
>
> On 06/12/17 09:22, Guillaume Tucker wrote:
>> On 05/12/17 18:32, Ben Skeggs wrote:
>>> On Wed, Dec 6, 2017 at 12:30 AM, Jon Hunter wrote:
>>>
>>>>
>>>> On 04/12/17 18:37, Guillaume Tucker wrote:
>>>>> If the firmware fails to load then ->fini() will be called before the
>>>>> device has been initialised, causing the kernel to hang while trying
>>>>> to write to a register.  Add a test in ->fini() to avoid this issue.
>>>>>
>>>>> This fixes a kernel hang on tegra124.
>>>>>
>>>>> Fixes: b17de35a2ebbe ("drm/nouveau/bar: implement bar1 teardown")
>>>>> Signed-off-by: Guillaume Tucker
>>>>> CC: Ben Skeggs
>>>>> ---
>>>>>  drivers/gpu/drm/nouveau/nvkm/subdev/bar/gf100.c | 7 +++++--
>>>>>  1 file changed, 5 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/bar/gf100.c
>>>> b/drivers/gpu/drm/nouveau/nvkm/subdev/bar/gf100.c
>>>>> index a3ba7f50198b..95e2aba64aad 100644
>>>>> --- a/drivers/gpu/drm/nouveau/nvkm/subdev/bar/gf100.c
>>>>> +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/bar/gf100.c
>>>>> @@ -43,9 +43,12 @@ gf100_bar_bar1_wait(struct nvkm_bar *base)
>>>>>  }
>>>>>
>>>>>  void
>>>>> -gf100_bar_bar1_fini(struct nvkm_bar *bar)
>>>>> +gf100_bar_bar1_fini(struct nvkm_bar *base)
>>>>>  {
>>>>> -	nvkm_mask(bar->subdev.device, 0x001704, 0x80000000, 0x00000000);
>>>>> +	struct nvkm_device *device = base->subdev.device;
>>>>> +
>>>>> +	if (base->subdev.oneinit)
>>>>> +		nvkm_mask(device, 0x001704, 0x80000000, 0x00000000);
>>>>>  }
>>>>>
>>>>>  void
>>>>
>>>> I have tested this and it works for me. Thanks for fixing this! Would be
>>>> good to get Ben's ACK, but you can have my ...
>>>>
>>> I'd love to get a good explanation as to why it hangs without this
>>> change,
>>> as, on the surface, it's not immediately obvious as to why it's hanging.
>>
>> To be fair I'm not entirely sure either why this causes a hang, I
>> haven't read the TRM...  The iomem has been mapped at this point,
>> so accessing the register should work.  One clue is when you look
>> at _bar1_init(), the 0x1704 register is initialised with
>> some (device instance?) memory address.  So it's possible that
>> the hardware does something special when you set this to 0 as in
>> _bar1_fini(), which may fail in particular if it was previously
>> not initialised with a valid address.
>>
>> This is merely guesswork, would be interested to find out the
>> real explanation though.
>
> OK, well that's no good. It's a good pointer, but we need to make sure
> we understand the root of this hang. I will see if I have sometime to
> dig into this further, maybe next week.

I spent a bit of time looking at this, but I still do not fully
understand the cause of the hang. It appears to hang initialisation of
the FB subdev and it appears to be around the point where the L2 cache
is flushed. I will see if anyone else has a clue what is happening here.

Ben, in the meantime with the holiday season upon us, should we remove
the bar1 teardown for gk20a?

Cheers
Jon

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/bar/base.c b/drivers/gpu/drm/nouveau/nvkm/subdev/bar/base.c
index 9646adec57cb..243f0a5c8a62 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/bar/base.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/bar/base.c
@@ -73,7 +73,8 @@ struct nvkm_vmm *
 nvkm_bar_fini(struct nvkm_subdev *subdev, bool suspend)
 {
 	struct nvkm_bar *bar = nvkm_bar(subdev);
-	bar->func->bar1.fini(bar);
+	if (bar->func->bar1.fini)
+		bar->func->bar1.fini(bar);
 	return 0;
 }

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/bar/gk20a.c b/drivers/gpu/drm/nouveau/nvkm/subdev/bar/gk20a.c
index b10077d38839..35878fb538f2 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/bar/gk20a.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/bar/gk20a.c
@@ -26,7 +26,6 @@
 	.dtor = gf100_bar_dtor,
 	.oneinit = gf100_bar_oneinit,
 	.bar1.init = gf100_bar_bar1_init,
-	.bar1.fini = gf100_bar_bar1_fini,
 	.bar1.wait = gf100_bar_bar1_wait,
 	.bar1.vmm = gf100_bar_bar1_vmm,
 	.flush = g84_bar_flush,
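
For anyone following along outside the tree, the idea in the diff above
(make the bar1 teardown hook optional and skip it when a chip does not
provide one) can be sketched in standalone C. This is only an
illustration: every demo_* name below is a hypothetical stand-in and not
a real nvkm type or function, and the call counter exists purely so the
behaviour can be observed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver structures; NOT the real nvkm types. */
struct demo_bar;

struct demo_bar_func {
	void (*bar1_init)(struct demo_bar *bar);
	void (*bar1_fini)(struct demo_bar *bar);	/* optional, may be NULL */
};

struct demo_bar {
	const struct demo_bar_func *func;
	int fini_calls;	/* counts teardown calls, for illustration only */
};

/* Stand-in for a teardown hook; the real one writes to a register. */
static void demo_bar1_fini(struct demo_bar *bar)
{
	bar->fini_calls++;
}

/* A chip that provides a teardown hook. */
static const struct demo_bar_func demo_func_with_fini = {
	.bar1_fini = demo_bar1_fini,
};

/* A chip that leaves the hook unset, as the gk20a hunk does. */
static const struct demo_bar_func demo_func_without_fini = {
	.bar1_fini = NULL,
};

/* Same shape as the guarded call in the base.c hunk. */
static int demo_bar_fini(struct demo_bar *bar)
{
	assert(bar != NULL && bar->func != NULL);

	/* Only call the hook if this chip actually provides one. */
	if (bar->func->bar1_fini)
		bar->func->bar1_fini(bar);
	return 0;
}
```

Calling demo_bar_fini() on an instance that uses demo_func_without_fini
returns 0 and leaves fini_calls untouched, while the other variant bumps
the counter once per call. Note this only avoids the teardown write on
chips that opt out; it says nothing about why the write hangs.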