From patchwork Wed Jan 25 13:46:28 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kalle Valo
X-Patchwork-Id: 9537073
X-Patchwork-Delegate: kvalo@adurom.com
From: "Valo, Kalle"
To: "Shajakhan, Mohammed Shafi (Mohammed Shafi)"
Cc: "mohammed@codeaurora.org", "linux-wireless@vger.kernel.org",
 "ath10k@lists.infradead.org"
Subject: Re: [PATCH v3] ath10k: Fix crash during rmmod when probe firmware fails
Date: Wed, 25 Jan 2017 13:46:28 +0000
Message-ID: <871svr8d83.fsf@kamboji.qca.qualcomm.com>
References: <1482221351-24029-1-git-send-email-mohammed@qca.qualcomm.com>
 <8760l38dz0.fsf@kamboji.qca.qualcomm.com>
In-Reply-To: <8760l38dz0.fsf@kamboji.qca.qualcomm.com> (Kalle Valo's message
 of "Wed, 25 Jan 2017 15:29:39 +0200")

Kalle Valo writes:

> Mohammed Shafi Shajakhan writes:
>
>> From: Mohammed Shafi Shajakhan
>>
>> This fixes the below crash when ath10k probe firmware fails,
>> NAPI polling tries to access a rx ring resource which was never
>> allocated, fix this by disabling NAPI right away once the probe
>> firmware fails by calling 'ath10k_hif_stop'. It's good to note
>> that the error is never propagated to 'ath10k_pci_probe' when
>> ath10k_core_register fails, so calling 'ath10k_hif_stop' to cleanup
>> PCI related things seems to be ok
>>
>> BUG: unable to handle kernel NULL pointer dereference at (null)
>> IP: __ath10k_htt_rx_ring_fill_n+0x19/0x230 [ath10k_core]
>> __ath10k_htt_rx_ring_fill_n+0x19/0x230 [ath10k_core]
>>
>> Call Trace:
>>
>> [] ath10k_htt_rx_msdu_buff_replenish+0x42/0x90
>> [ath10k_core]
>> [] ath10k_htt_txrx_compl_task+0x433/0x17d0
>> [ath10k_core]
>> [] ? __wake_up_common+0x4d/0x80
>> [] ? cpu_load_update+0xdc/0x150
>> [] ? ath10k_pci_read32+0xd/0x10 [ath10k_pci]
>> [] ath10k_pci_napi_poll+0x47/0x110 [ath10k_pci]
>> [] net_rx_action+0x20f/0x370
>>
>> Reported-by: Ben Greear
>> Fixes: 3c97f5de1f28 ("ath10k: implement NAPI support")
>> Signed-off-by: Mohammed Shafi Shajakhan
>
> Is there an easy way to reproduce this bug? I don't see it on my x86
> laptop with qca988x and I call rmmod all the time. I would like to test
> this myself.
>
>> --- a/drivers/net/wireless/ath/ath10k/core.c
>> +++ b/drivers/net/wireless/ath/ath10k/core.c
>> @@ -2164,6 +2164,7 @@ static int ath10k_core_probe_fw(struct ath10k *ar)
>>  	ath10k_core_free_firmware_files(ar);
>>
>>  err_power_down:
>> +	ath10k_hif_stop(ar);
>>  	ath10k_hif_power_down(ar);
>>
>>  	return ret;
>
> This breaks the symmetry, we should not be calling ath10k_hif_stop() if
> we haven't called ath10k_hif_start() from the same function. This can
> just create a bigger mess later, for example with other bus support like
> sdio or usb. In theory it should be enough that we call
> ath10k_hif_power_down() and pci.c does the rest correctly "behind the
> scenes".
>
> I investigated this a bit and I think the real cause is that we call
> napi_enable() from ath10k_pci_hif_power_up() and napi_disable() from
> ath10k_pci_hif_stop(). Does anyone remember why?
>
> I was expecting that we would call napi_enable()/napi_disable() either
> in ath10k_hif_power_up/down() or ath10k_hif_start()/stop(), but not
> mixed like it is currently.

So below is something I was thinking of: now napi_enable() is called
from ath10k_hif_start() and napi_disable() from ath10k_hif_stop(). Would
that work?

--- a/drivers/net/wireless/ath/ath10k/pci.c
+++ b/drivers/net/wireless/ath/ath10k/pci.c
@@ -1648,6 +1648,8 @@ static int ath10k_pci_hif_start(struct ath10k *ar)
 	ath10k_dbg(ar, ATH10K_DBG_BOOT, "boot hif start\n");
 
+	napi_enable(&ar->napi);
+
 	ath10k_pci_irq_enable(ar);
 	ath10k_pci_rx_post(ar);
 
@@ -2532,7 +2534,6 @@ static int ath10k_pci_hif_power_up(struct ath10k *ar)
 		ath10k_err(ar, "could not wake up target CPU: %d\n", ret);
 		goto err_ce;
 	}
-	napi_enable(&ar->napi);
 
 	return 0;
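
To make the pairing argument concrete, here is a small userspace toy
model, not kernel code. Everything in it (struct toy_dev and the toy_*
functions) is a hypothetical stand-in for the hif callbacks, and the
counters only stand in for napi_enable()/napi_disable() calls. It
assumes the failed probe path runs power_up followed by power_down
without ever reaching start/stop, as discussed above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy state; not the real struct ath10k. */
struct toy_dev {
	bool legacy;        /* true: enable in power_up, as the code does today */
	int napi_enables;   /* stands in for napi_enable() calls */
	int napi_disables;  /* stands in for napi_disable() calls */
};

void toy_power_up(struct toy_dev *d)
{
	if (d->legacy)
		d->napi_enables++;
}

void toy_start(struct toy_dev *d)
{
	if (!d->legacy)
		d->napi_enables++;  /* proposed placement */
}

void toy_stop(struct toy_dev *d)
{
	d->napi_disables++;     /* disable lives in stop in both layouts */
}

void toy_power_down(struct toy_dev *d)
{
	(void)d;                /* power_down never touches NAPI in this model */
}

/* Probe fails after power_up: start/stop are never called.
 * Returns the number of enables left without a matching disable. */
int toy_failed_probe(bool legacy)
{
	struct toy_dev d = { legacy, 0, 0 };

	toy_power_up(&d);
	toy_power_down(&d);
	return d.napi_enables - d.napi_disables;
}

/* Full cycle: power_up, start, stop, power_down. */
int toy_normal_cycle(bool legacy)
{
	struct toy_dev d = { legacy, 0, 0 };

	toy_power_up(&d);
	toy_start(&d);
	toy_stop(&d);
	toy_power_down(&d);
	return d.napi_enables - d.napi_disables;
}
```

In this model toy_failed_probe(true) leaves one unmatched enable, while
toy_failed_probe(false) leaves none, and the normal cycle is balanced
in both layouts. Whether the real driver has other paths that rely on
NAPI being enabled before hif start is not something this sketch can
answer.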