diff mbox series

[v3] wifi: ath12k: Add firmware coredump collection support

Message ID 20240325183414.4016663-1-quic_ssreeela@quicinc.com (mailing list archive)
State Changes Requested
Delegated to: Kalle Valo
Headers show
Series [v3] wifi: ath12k: Add firmware coredump collection support | expand

Commit Message

Sowmiya Sree Elavalagan March 25, 2024, 6:34 p.m. UTC
In case of firmware assert snapshot of firmware memory is essential for
debugging. Add firmware coredump collection support for PCI bus.
Collect RDDM and firmware paging dumps from MHI and pack them in TLV
format and also pack various memory shared during QMI phase in separate
TLVs.  Add necessary header and share the dumps to user space using dev
coredump framework. Coredump collection is disabled by default and can
be enabled using menuconfig. Dump collected for a radio is 55 MB
approximately.

Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.2.1-00201-QCAHKSWPL_SILICONZ-1

Signed-off-by: Sowmiya Sree Elavalagan <quic_ssreeela@quicinc.com>
---
v2:
  - Fixed errors shown by ath12k-check
v3:
  - Fixed SPDX comment style for coredump.c file
    Changed Kconfig description.
---
 drivers/net/wireless/ath/ath12k/Kconfig    |  10 ++
 drivers/net/wireless/ath/ath12k/Makefile   |   1 +
 drivers/net/wireless/ath/ath12k/core.c     |   2 +
 drivers/net/wireless/ath/ath12k/core.h     |   5 +
 drivers/net/wireless/ath/ath12k/coredump.c |  51 ++++++
 drivers/net/wireless/ath/ath12k/coredump.h |  80 +++++++++
 drivers/net/wireless/ath/ath12k/hif.h      |   9 +-
 drivers/net/wireless/ath/ath12k/hw.c       |   4 +-
 drivers/net/wireless/ath/ath12k/mhi.c      |   5 +
 drivers/net/wireless/ath/ath12k/mhi.h      |   4 +-
 drivers/net/wireless/ath/ath12k/pci.c      | 185 +++++++++++++++++++++
 11 files changed, 351 insertions(+), 5 deletions(-)
 create mode 100644 drivers/net/wireless/ath/ath12k/coredump.c
 create mode 100644 drivers/net/wireless/ath/ath12k/coredump.h

Comments

Jeff Johnson March 25, 2024, 8:15 p.m. UTC | #1
On 3/25/2024 11:34 AM, Sowmiya Sree Elavalagan wrote:
> In case of firmware assert snapshot of firmware memory is essential for
> debugging. Add firmware coredump collection support for PCI bus.
> Collect RDDM and firmware paging dumps from MHI and pack them in TLV
> format and also pack various memory shared during QMI phase in separate
> TLVs.  Add necessary header and share the dumps to user space using dev
> coredump framework. Coredump collection is disabled by default and can
> be enabled using menuconfig. Dump collected for a radio is 55 MB
> approximately.
> 
> Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.2.1-00201-QCAHKSWPL_SILICONZ-1
> 
> Signed-off-by: Sowmiya Sree Elavalagan <quic_ssreeela@quicinc.com>

Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Kalle Valo June 20, 2024, 2:08 p.m. UTC | #2
Sowmiya Sree Elavalagan <quic_ssreeela@quicinc.com> wrote:

> In case of firmware assert snapshot of firmware memory is essential for
> debugging. Add firmware coredump collection support for PCI bus.
> Collect RDDM and firmware paging dumps from MHI and pack them in TLV
> format and also pack various memory shared during QMI phase in separate
> TLVs.  Add necessary header and share the dumps to user space using dev
> coredump framework. Coredump collection is disabled by default and can
> be enabled using menuconfig. Dump collected for a radio is 55 MB
> approximately.
> 
> Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.2.1-00201-QCAHKSWPL_SILICONZ-1
> 
> Signed-off-by: Sowmiya Sree Elavalagan <quic_ssreeela@quicinc.com>
> Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
> Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>

This didn't compile for me, I added this to pci.c:

+#include <linux/vmalloc.h>

Also in the pending branch I made some whitespace in struct ath12k_dump_file_data:

https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git/commit/?h=pending&id=44ae07628b68375f476895f4fc1e89a570790ac0

Any tips how to test this until we have the debugfs interface to crash the firmware?
Kalle Valo June 26, 2024, 3:09 p.m. UTC | #3
Kalle Valo <kvalo@kernel.org> writes:

> Sowmiya Sree Elavalagan <quic_ssreeela@quicinc.com> wrote:
>
>> In case of firmware assert snapshot of firmware memory is essential for
>> debugging. Add firmware coredump collection support for PCI bus.
>> Collect RDDM and firmware paging dumps from MHI and pack them in TLV
>> format and also pack various memory shared during QMI phase in separate
>> TLVs.  Add necessary header and share the dumps to user space using dev
>> coredump framework. Coredump collection is disabled by default and can
>> be enabled using menuconfig. Dump collected for a radio is 55 MB
>> approximately.
>> 
>> Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.2.1-00201-QCAHKSWPL_SILICONZ-1
>> 
>> Signed-off-by: Sowmiya Sree Elavalagan <quic_ssreeela@quicinc.com>
>> Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
>> Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
>
> This didn't compile for me, I added this to pci.c:
>
> +#include <linux/vmalloc.h>
>
> Also in the pending branch I made some whitespace in struct ath12k_dump_file_data:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git/commit/?h=pending&id=44ae07628b68375f476895f4fc1e89a570790ac0
>
> Any tips how to test this until we have the debugfs interface to crash the firmware?

I was able to get patch 'wifi: ath12k: Add support to simulate firmware
crash' (not yet public) and did a quick test with it. There seems to be
a KASAN warning but I can't debug this further at this time.

[ 8091.304272] ath12k_pci 0000:06:00.0: simulating firmware assert crash
[ 8091.722245] ==================================================================
[ 8091.722329] BUG: KASAN: vmalloc-out-of-bounds in ath12k_pci_coredump_download+0x1071/0x1330 [ath12k]
[ 8091.722433] Write of size 4 at addr ffffc9000644b28c by task kworker/u32:0/11
[ 8091.722517] 
[ 8091.722552] CPU: 0 PID: 11 Comm: kworker/u32:0 Not tainted 6.10.0-rc4-wt-ath+ #1663
[ 8091.722604] Hardware name: Intel(R) Client Systems NUC8i7HVK/NUC8i7HVB, BIOS HNKBLi70.86A.0067.2021.0528.1339 05/28/2021
[ 8091.722670] Workqueue: ath12k_aux_wq ath12k_core_reset [ath12k]
[ 8091.722742] Call Trace:
[ 8091.722778]  <TASK>
[ 8091.722832]  dump_stack_lvl+0x7d/0xe0
[ 8091.722920]  print_address_description.constprop.0+0x33/0x3a0
[ 8091.722999]  print_report+0xb5/0x260
[ 8091.723069]  ? kasan_addr_to_slab+0xd/0x80
[ 8091.723146]  kasan_report+0xd8/0x110
[ 8091.723217]  ? ath12k_pci_coredump_download+0x1071/0x1330 [ath12k]
[ 8091.723301]  ? ath12k_pci_coredump_download+0x1071/0x1330 [ath12k]
[ 8091.723386]  __asan_report_store_n_noabort+0x12/0x20
[ 8091.723461]  ath12k_pci_coredump_download+0x1071/0x1330 [ath12k]
[ 8091.723563]  ? ath12k_pci_coredump_calculate_size+0x730/0x730 [ath12k]
[ 8091.723632]  ? __this_cpu_preempt_check+0x13/0x20
[ 8091.723677]  ath12k_coredump_collect+0x60/0x73 [ath12k]
[ 8091.724276]  ath12k_core_reset+0x1b1/0x880 [ath12k]
[ 8091.724921]  ? _raw_spin_unlock_irq+0x22/0x50
[ 8091.725503]  ? __this_cpu_preempt_check+0x13/0x20
[ 8091.726126]  process_one_work+0x8d7/0x19f0
[ 8091.726718]  ? pwq_dec_nr_in_flight+0x580/0x580
[ 8091.727346]  ? move_linked_works+0x128/0x2c0
[ 8091.727998]  ? assign_work+0x15e/0x270
[ 8091.728601]  worker_thread+0x715/0x1270
[ 8091.729244]  ? rescuer_thread+0xdb0/0xdb0
[ 8091.729905]  kthread+0x2fa/0x3f0
[ 8091.730520]  ? kthread_insert_work_sanity_check+0xd0/0xd0
[ 8091.731192]  ret_from_fork+0x31/0x70
[ 8091.731856]  ? kthread_insert_work_sanity_check+0xd0/0xd0
[ 8091.732525]  ret_from_fork_asm+0x11/0x20
[ 8091.733212]  </TASK>
[ 8091.733909] 
[ 8091.734559] The buggy address belongs to the virtual mapping at#012[ 8091.734559]  [ffffc9000500b000, ffffc9000644d000) created by:#012[ 8091.734559]  ath12k_pci_coredump_download+0x147/0x1330 [ath12k]
[ 8091.736558] 
[ 8091.737272] The buggy address belongs to the physical page:
[ 8091.738016] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x15a485
[ 8091.738730] flags: 0x200000000000000(node=0|zone=2)
[ 8091.739481] raw: 0200000000000000 0000000000000000 dead000000000122 0000000000000000
[ 8091.740256] raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
[ 8091.741043] page dumped because: kasan: bad access detected
[ 8091.741786] 
[ 8091.742529] Memory state around the buggy address:
[ 8091.743296]  ffffc9000644b180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 8091.744087]  ffffc9000644b200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 8091.744834] >ffffc9000644b280: 00 04 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
[ 8091.745598]                       ^
[ 8091.746359]  ffffc9000644b300: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
[ 8091.747152]  ffffc9000644b380: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
[ 8091.747932] ==================================================================
[ 8091.748688] Disabling lock debugging due to kernel taint
[ 8091.749699] ath12k_pci 0000:06:00.0: Uploading coredump
Sowmiya Sree Elavalagan June 27, 2024, 3:17 p.m. UTC | #4
On 6/26/2024 8:39 PM, Kalle Valo wrote:
> Kalle Valo <kvalo@kernel.org> writes:
> 
>> Sowmiya Sree Elavalagan <quic_ssreeela@quicinc.com> wrote:
>>
>>> In case of firmware assert snapshot of firmware memory is essential for
>>> debugging. Add firmware coredump collection support for PCI bus.
>>> Collect RDDM and firmware paging dumps from MHI and pack them in TLV
>>> format and also pack various memory shared during QMI phase in separate
>>> TLVs.  Add necessary header and share the dumps to user space using dev
>>> coredump framework. Coredump collection is disabled by default and can
>>> be enabled using menuconfig. Dump collected for a radio is 55 MB
>>> approximately.
>>>
>>> Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.2.1-00201-QCAHKSWPL_SILICONZ-1
>>>
>>> Signed-off-by: Sowmiya Sree Elavalagan <quic_ssreeela@quicinc.com>
>>> Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
>>> Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
>>
>> This didn't compile for me, I added this to pci.c:
>>
>> +#include <linux/vmalloc.h>
>>
>> Also in the pending branch I made some whitespace in struct ath12k_dump_file_data:
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git/commit/?h=pending&id=44ae07628b68375f476895f4fc1e89a570790ac0
>>
>> Any tips how to test this until we have the debugfs interface to crash the firmware?
> 
> I was able to get patch 'wifi: ath12k: Add support to simulate firmware
> crash' (not yet public) and did a quick test with it. There seems to be
> a KASAN warning but I can't debug this further at this time.
> 
> [ 8091.304272] ath12k_pci 0000:06:00.0: simulating firmware assert crash
> [ 8091.722245] ==================================================================
> [ 8091.722329] BUG: KASAN: vmalloc-out-of-bounds in ath12k_pci_coredump_download+0x1071/0x1330 [ath12k]
> [ 8091.722433] Write of size 4 at addr ffffc9000644b28c by task kworker/u32:0/11
> [ 8091.722517] 
> [ 8091.722552] CPU: 0 PID: 11 Comm: kworker/u32:0 Not tainted 6.10.0-rc4-wt-ath+ #1663
> [ 8091.722604] Hardware name: Intel(R) Client Systems NUC8i7HVK/NUC8i7HVB, BIOS HNKBLi70.86A.0067.2021.0528.1339 05/28/2021
> [ 8091.722670] Workqueue: ath12k_aux_wq ath12k_core_reset [ath12k]
> [ 8091.722742] Call Trace:
> [ 8091.722778]  <TASK>
> [ 8091.722832]  dump_stack_lvl+0x7d/0xe0
> [ 8091.722920]  print_address_description.constprop.0+0x33/0x3a0
> [ 8091.722999]  print_report+0xb5/0x260
> [ 8091.723069]  ? kasan_addr_to_slab+0xd/0x80
> [ 8091.723146]  kasan_report+0xd8/0x110
> [ 8091.723217]  ? ath12k_pci_coredump_download+0x1071/0x1330 [ath12k]
> [ 8091.723301]  ? ath12k_pci_coredump_download+0x1071/0x1330 [ath12k]
> [ 8091.723386]  __asan_report_store_n_noabort+0x12/0x20
> [ 8091.723461]  ath12k_pci_coredump_download+0x1071/0x1330 [ath12k]
> [ 8091.723563]  ? ath12k_pci_coredump_calculate_size+0x730/0x730 [ath12k]
> [ 8091.723632]  ? __this_cpu_preempt_check+0x13/0x20
> [ 8091.723677]  ath12k_coredump_collect+0x60/0x73 [ath12k]
> [ 8091.724276]  ath12k_core_reset+0x1b1/0x880 [ath12k]
> [ 8091.724921]  ? _raw_spin_unlock_irq+0x22/0x50
> [ 8091.725503]  ? __this_cpu_preempt_check+0x13/0x20
> [ 8091.726126]  process_one_work+0x8d7/0x19f0
> [ 8091.726718]  ? pwq_dec_nr_in_flight+0x580/0x580
> [ 8091.727346]  ? move_linked_works+0x128/0x2c0
> [ 8091.727998]  ? assign_work+0x15e/0x270
> [ 8091.728601]  worker_thread+0x715/0x1270
> [ 8091.729244]  ? rescuer_thread+0xdb0/0xdb0
> [ 8091.729905]  kthread+0x2fa/0x3f0
> [ 8091.730520]  ? kthread_insert_work_sanity_check+0xd0/0xd0
> [ 8091.731192]  ret_from_fork+0x31/0x70
> [ 8091.731856]  ? kthread_insert_work_sanity_check+0xd0/0xd0
> [ 8091.732525]  ret_from_fork_asm+0x11/0x20
> [ 8091.733212]  </TASK>
> [ 8091.733909] 
> [ 8091.734559] The buggy address belongs to the virtual mapping at#012[ 8091.734559]  [ffffc9000500b000, ffffc9000644d000) created by:#012[ 8091.734559]  ath12k_pci_coredump_download+0x147/0x1330 [ath12k]
> [ 8091.736558] 
> [ 8091.737272] The buggy address belongs to the physical page:
> [ 8091.738016] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x15a485
> [ 8091.738730] flags: 0x200000000000000(node=0|zone=2)
> [ 8091.739481] raw: 0200000000000000 0000000000000000 dead000000000122 0000000000000000
> [ 8091.740256] raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
> [ 8091.741043] page dumped because: kasan: bad access detected
> [ 8091.741786] 
> [ 8091.742529] Memory state around the buggy address:
> [ 8091.743296]  ffffc9000644b180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [ 8091.744087]  ffffc9000644b200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [ 8091.744834] >ffffc9000644b280: 00 04 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
> [ 8091.745598]                       ^
> [ 8091.746359]  ffffc9000644b300: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
> [ 8091.747152]  ffffc9000644b380: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
> [ 8091.747932] ==================================================================
> [ 8091.748688] Disabling lock debugging due to kernel taint
> [ 8091.749699] ath12k_pci 0000:06:00.0: Uploading coredump
> 
Hi Kalle,

I will check and update you on this.

Thanks,
Sowmiya Sree
Sidhanta Sahu June 27, 2024, 3:46 p.m. UTC | #5
On 3/25/2024 11:34 AM, Sowmiya Sree Elavalagan wrote:
> In case of firmware assert snapshot of firmware memory is essential for
> debugging. Add firmware coredump collection support for PCI bus.
> Collect RDDM and firmware paging dumps from MHI and pack them in TLV
> format and also pack various memory shared during QMI phase in separate
> TLVs.  Add necessary header and share the dumps to user space using dev
> coredump framework. Coredump collection is disabled by default and can
> be enabled using menuconfig. Dump collected for a radio is 55 MB
> approximately.
> 
> Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.2.1-00201-QCAHKSWPL_SILICONZ-1
> 
> Signed-off-by: Sowmiya Sree Elavalagan <quic_ssreeela@quicinc.com>
> ---
> v2:
>    - Fixed errors shown by ath12k-check
> v3:
>    - Fixed SPDX comment style for coredump.c file
>      Changed Kconfig description.
> ---
>   drivers/net/wireless/ath/ath12k/Kconfig    |  10 ++
>   drivers/net/wireless/ath/ath12k/Makefile   |   1 +
>   drivers/net/wireless/ath/ath12k/core.c     |   2 +
>   drivers/net/wireless/ath/ath12k/core.h     |   5 +
>   drivers/net/wireless/ath/ath12k/coredump.c |  51 ++++++
>   drivers/net/wireless/ath/ath12k/coredump.h |  80 +++++++++
>   drivers/net/wireless/ath/ath12k/hif.h      |   9 +-
>   drivers/net/wireless/ath/ath12k/hw.c       |   4 +-
>   drivers/net/wireless/ath/ath12k/mhi.c      |   5 +
>   drivers/net/wireless/ath/ath12k/mhi.h      |   4 +-
>   drivers/net/wireless/ath/ath12k/pci.c      | 185 +++++++++++++++++++++
>   11 files changed, 351 insertions(+), 5 deletions(-)
>   create mode 100644 drivers/net/wireless/ath/ath12k/coredump.c
>   create mode 100644 drivers/net/wireless/ath/ath12k/coredump.h
> 


> diff --git a/drivers/net/wireless/ath/ath12k/core.c b/drivers/net/wireless/ath/ath12k/core.c
> index 391b6fb2bd42..f0cc4959faf5 100644
> --- a/drivers/net/wireless/ath/ath12k/core.c
> +++ b/drivers/net/wireless/ath/ath12k/core.c
> @@ -1121,6 +1121,7 @@ static void ath12k_core_reset(struct work_struct *work)
>   	reinit_completion(&ab->recovery_start);
>   	atomic_set(&ab->recovery_count, 0);
>   
> +	ath12k_coredump_collect(ab);

1. Can we confirm if the operation guarantees that the coredump has been 
successfully uploaded upon its completion? It would be helpful to 
understand the expected behavior of the system in the event of a 
successful or unsuccessful upload.

2. Is it safe to proceed with memory cleanup operations immediately 
after this call? Understanding the dependencies and potential risks 
associated with memory cleanup post-upload would be beneficial for 
maintaining system stability and avoiding potential issues.

>   	ath12k_core_pre_reconfigure_recovery(ab);
>   
>   	reinit_completion(&ab->reconfigure_complete);
> @@ -1220,6 +1221,7 @@ struct ath12k_base *ath12k_core_alloc(struct device *dev, size_t priv_size,
>   	INIT_WORK(&ab->restart_work, ath12k_core_restart);
>   	INIT_WORK(&ab->reset_work, ath12k_core_reset);
>   	INIT_WORK(&ab->rfkill_work, ath12k_rfkill_work);
> +	INIT_WORK(&ab->dump_work, ath12k_coredump_upload);
>   
>   	timer_setup(&ab->rx_replenish_retry, ath12k_ce_rx_replenish_retry, 0);
>   	init_completion(&ab->htc_suspend);
Sowmiya Sree Elavalagan July 1, 2024, 1:06 p.m. UTC | #6
On 6/27/2024 9:16 PM, Sidhanta Sahu wrote:
> 
> 
> On 3/25/2024 11:34 AM, Sowmiya Sree Elavalagan wrote:
>> In case of firmware assert snapshot of firmware memory is essential for
>> debugging. Add firmware coredump collection support for PCI bus.
>> Collect RDDM and firmware paging dumps from MHI and pack them in TLV
>> format and also pack various memory shared during QMI phase in separate
>> TLVs.  Add necessary header and share the dumps to user space using dev
>> coredump framework. Coredump collection is disabled by default and can
>> be enabled using menuconfig. Dump collected for a radio is 55 MB
>> approximately.
>>
>> Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.2.1-00201-QCAHKSWPL_SILICONZ-1
>>
>> Signed-off-by: Sowmiya Sree Elavalagan <quic_ssreeela@quicinc.com>
>> ---
>> v2:
>>    - Fixed errors shown by ath12k-check
>> v3:
>>    - Fixed SPDX comment style for coredump.c file
>>      Changed Kconfig description.
>> ---
>>   drivers/net/wireless/ath/ath12k/Kconfig    |  10 ++
>>   drivers/net/wireless/ath/ath12k/Makefile   |   1 +
>>   drivers/net/wireless/ath/ath12k/core.c     |   2 +
>>   drivers/net/wireless/ath/ath12k/core.h     |   5 +
>>   drivers/net/wireless/ath/ath12k/coredump.c |  51 ++++++
>>   drivers/net/wireless/ath/ath12k/coredump.h |  80 +++++++++
>>   drivers/net/wireless/ath/ath12k/hif.h      |   9 +-
>>   drivers/net/wireless/ath/ath12k/hw.c       |   4 +-
>>   drivers/net/wireless/ath/ath12k/mhi.c      |   5 +
>>   drivers/net/wireless/ath/ath12k/mhi.h      |   4 +-
>>   drivers/net/wireless/ath/ath12k/pci.c      | 185 +++++++++++++++++++++
>>   11 files changed, 351 insertions(+), 5 deletions(-)
>>   create mode 100644 drivers/net/wireless/ath/ath12k/coredump.c
>>   create mode 100644 drivers/net/wireless/ath/ath12k/coredump.h
>>
> 
> 
>> diff --git a/drivers/net/wireless/ath/ath12k/core.c b/drivers/net/wireless/ath/ath12k/core.c
>> index 391b6fb2bd42..f0cc4959faf5 100644
>> --- a/drivers/net/wireless/ath/ath12k/core.c
>> +++ b/drivers/net/wireless/ath/ath12k/core.c
>> @@ -1121,6 +1121,7 @@ static void ath12k_core_reset(struct work_struct *work)
>>       reinit_completion(&ab->recovery_start);
>>       atomic_set(&ab->recovery_count, 0);
>>   +    ath12k_coredump_collect(ab);
> 
> 1. Can we confirm if the operation guarantees that the coredump has been successfully uploaded upon its completion? It would be helpful to understand the expected behavior of the system in the event of a successful or unsuccessful upload.
> 
  You would see the print "Uploading coredump" on the console upon successful upload of the coredump. In case if the dump collection fails on the driver side itself you would not see the debug message.

> 2. Is it safe to proceed with memory cleanup operations immediately after this call? Understanding the dependencies and potential risks associated with memory cleanup post-upload would be beneficial for maintaining system stability and avoiding potential issues.
> 
  If you see the implementation completely, we take a copy of the memory to be dumped in the above call and queue the work of uploading to user space. All this happens before we proceed with clean up.  We cannot wait for clean up until coredump gets uploaded as it may delay the recovery process which is not accepted.

>>       ath12k_core_pre_reconfigure_recovery(ab);
>>         reinit_completion(&ab->reconfigure_complete);
>> @@ -1220,6 +1221,7 @@ struct ath12k_base *ath12k_core_alloc(struct device *dev, size_t priv_size,
>>       INIT_WORK(&ab->restart_work, ath12k_core_restart);
>>       INIT_WORK(&ab->reset_work, ath12k_core_reset);
>>       INIT_WORK(&ab->rfkill_work, ath12k_rfkill_work);
>> +    INIT_WORK(&ab->dump_work, ath12k_coredump_upload);
>>         timer_setup(&ab->rx_replenish_retry, ath12k_ce_rx_replenish_retry, 0);
>>       init_completion(&ab->htc_suspend);

Hi Sidhanta,

Please find my response inline.

Thanks,
Sowmiya Sree
Sowmiya Sree Elavalagan July 3, 2024, 12:18 p.m. UTC | #7
On 6/26/2024 8:39 PM, Kalle Valo wrote:
> Kalle Valo <kvalo@kernel.org> writes:
> 
>> Sowmiya Sree Elavalagan <quic_ssreeela@quicinc.com> wrote:
>>
>>> In case of firmware assert snapshot of firmware memory is essential for
>>> debugging. Add firmware coredump collection support for PCI bus.
>>> Collect RDDM and firmware paging dumps from MHI and pack them in TLV
>>> format and also pack various memory shared during QMI phase in separate
>>> TLVs.  Add necessary header and share the dumps to user space using dev
>>> coredump framework. Coredump collection is disabled by default and can
>>> be enabled using menuconfig. Dump collected for a radio is 55 MB
>>> approximately.
>>>
>>> Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.2.1-00201-QCAHKSWPL_SILICONZ-1
>>>
>>> Signed-off-by: Sowmiya Sree Elavalagan <quic_ssreeela@quicinc.com>
>>> Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
>>> Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
>>
>> This didn't compile for me, I added this to pci.c:
>>
>> +#include <linux/vmalloc.h>
>>
>> Also in the pending branch I made some whitespace in struct ath12k_dump_file_data:
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git/commit/?h=pending&id=44ae07628b68375f476895f4fc1e89a570790ac0
>>
>> Any tips how to test this until we have the debugfs interface to crash the firmware?
> 
> I was able to get patch 'wifi: ath12k: Add support to simulate firmware
> crash' (not yet public) and did a quick test with it. There seems to be
> a KASAN warning but I can't debug this further at this time.
> 
> [ 8091.304272] ath12k_pci 0000:06:00.0: simulating firmware assert crash
> [ 8091.722245] ==================================================================
> [ 8091.722329] BUG: KASAN: vmalloc-out-of-bounds in ath12k_pci_coredump_download+0x1071/0x1330 [ath12k]
> [ 8091.722433] Write of size 4 at addr ffffc9000644b28c by task kworker/u32:0/11
> [ 8091.722517] 
> [ 8091.722552] CPU: 0 PID: 11 Comm: kworker/u32:0 Not tainted 6.10.0-rc4-wt-ath+ #1663
> [ 8091.722604] Hardware name: Intel(R) Client Systems NUC8i7HVK/NUC8i7HVB, BIOS HNKBLi70.86A.0067.2021.0528.1339 05/28/2021
> [ 8091.722670] Workqueue: ath12k_aux_wq ath12k_core_reset [ath12k]
> [ 8091.722742] Call Trace:
> [ 8091.722778]  <TASK>
> [ 8091.722832]  dump_stack_lvl+0x7d/0xe0
> [ 8091.722920]  print_address_description.constprop.0+0x33/0x3a0
> [ 8091.722999]  print_report+0xb5/0x260
> [ 8091.723069]  ? kasan_addr_to_slab+0xd/0x80
> [ 8091.723146]  kasan_report+0xd8/0x110
> [ 8091.723217]  ? ath12k_pci_coredump_download+0x1071/0x1330 [ath12k]
> [ 8091.723301]  ? ath12k_pci_coredump_download+0x1071/0x1330 [ath12k]
> [ 8091.723386]  __asan_report_store_n_noabort+0x12/0x20
> [ 8091.723461]  ath12k_pci_coredump_download+0x1071/0x1330 [ath12k]
> [ 8091.723563]  ? ath12k_pci_coredump_calculate_size+0x730/0x730 [ath12k]
> [ 8091.723632]  ? __this_cpu_preempt_check+0x13/0x20
> [ 8091.723677]  ath12k_coredump_collect+0x60/0x73 [ath12k]
> [ 8091.724276]  ath12k_core_reset+0x1b1/0x880 [ath12k]
> [ 8091.724921]  ? _raw_spin_unlock_irq+0x22/0x50
> [ 8091.725503]  ? __this_cpu_preempt_check+0x13/0x20
> [ 8091.726126]  process_one_work+0x8d7/0x19f0
> [ 8091.726718]  ? pwq_dec_nr_in_flight+0x580/0x580
> [ 8091.727346]  ? move_linked_works+0x128/0x2c0
> [ 8091.727998]  ? assign_work+0x15e/0x270
> [ 8091.728601]  worker_thread+0x715/0x1270
> [ 8091.729244]  ? rescuer_thread+0xdb0/0xdb0
> [ 8091.729905]  kthread+0x2fa/0x3f0
> [ 8091.730520]  ? kthread_insert_work_sanity_check+0xd0/0xd0
> [ 8091.731192]  ret_from_fork+0x31/0x70
> [ 8091.731856]  ? kthread_insert_work_sanity_check+0xd0/0xd0
> [ 8091.732525]  ret_from_fork_asm+0x11/0x20
> [ 8091.733212]  </TASK>
> [ 8091.733909] 
> [ 8091.734559] The buggy address belongs to the virtual mapping at#012[ 8091.734559]  [ffffc9000500b000, ffffc9000644d000) created by:#012[ 8091.734559]  ath12k_pci_coredump_download+0x147/0x1330 [ath12k]
> [ 8091.736558] 
> [ 8091.737272] The buggy address belongs to the physical page:
> [ 8091.738016] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x15a485
> [ 8091.738730] flags: 0x200000000000000(node=0|zone=2)
> [ 8091.739481] raw: 0200000000000000 0000000000000000 dead000000000122 0000000000000000
> [ 8091.740256] raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
> [ 8091.741043] page dumped because: kasan: bad access detected
> [ 8091.741786] 
> [ 8091.742529] Memory state around the buggy address:
> [ 8091.743296]  ffffc9000644b180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [ 8091.744087]  ffffc9000644b200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [ 8091.744834] >ffffc9000644b280: 00 04 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
> [ 8091.745598]                       ^
> [ 8091.746359]  ffffc9000644b300: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
> [ 8091.747152]  ffffc9000644b380: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
> [ 8091.747932] ==================================================================
> [ 8091.748688] Disabling lock debugging due to kernel taint
> [ 8091.749699] ath12k_pci 0000:06:00.0: Uploading coredump
> 

Hi Kalle,

Can you please share the configs to reproduce the above KASAN issue in my local setup. I tried with below KASAN configs enabled, but I couldn't reproduce the above warning in my x86 setup.

CONFIG_HAVE_ARCH_KASAN=y
CONFIG_HAVE_ARCH_KASAN_VMALLOC=y
CONFIG_CC_HAS_KASAN_GENERIC=y
CONFIG_KASAN=y
CONFIG_KASAN_GENERIC=y
CONFIG_KASAN_INLINE=y
CONFIG_KASAN_STACK=y
CONFIG_KASAN_VMALLOC=y
CONFIG_KASAN_EXTRA_INFO=y

Thanks,
Sowmiya Sree
diff mbox series

Patch

diff --git a/drivers/net/wireless/ath/ath12k/Kconfig b/drivers/net/wireless/ath/ath12k/Kconfig
index e135d2b1b61d..22d9634807f7 100644
--- a/drivers/net/wireless/ath/ath12k/Kconfig
+++ b/drivers/net/wireless/ath/ath12k/Kconfig
@@ -32,3 +32,13 @@  config ATH12K_TRACING
 
 	  If unsure, say Y to make it easier to debug problems. But if
 	  you want optimal performance choose N.
+
+config ATH12K_COREDUMP
+	bool "ath12k coredump"
+	depends on ATH12K
+	select WANT_DEV_COREDUMP
+	help
+	  Enable ath12k coredump collection
+
+	  If unsure, say Y to make it easier to debug problems. But if
+	  dump collection not required choose N.
diff --git a/drivers/net/wireless/ath/ath12k/Makefile b/drivers/net/wireless/ath/ath12k/Makefile
index 71669f94ff75..41c584acc582 100644
--- a/drivers/net/wireless/ath/ath12k/Makefile
+++ b/drivers/net/wireless/ath/ath12k/Makefile
@@ -24,6 +24,7 @@  ath12k-y += core.o \
 	    p2p.o
 
 ath12k-$(CONFIG_ATH12K_TRACING) += trace.o
+ath12k-$(CONFIG_ATH12K_COREDUMP) += coredump.o
 
 # for tracing framework to find trace.h
 CFLAGS_trace.o := -I$(src)
diff --git a/drivers/net/wireless/ath/ath12k/core.c b/drivers/net/wireless/ath/ath12k/core.c
index 391b6fb2bd42..f0cc4959faf5 100644
--- a/drivers/net/wireless/ath/ath12k/core.c
+++ b/drivers/net/wireless/ath/ath12k/core.c
@@ -1121,6 +1121,7 @@  static void ath12k_core_reset(struct work_struct *work)
 	reinit_completion(&ab->recovery_start);
 	atomic_set(&ab->recovery_count, 0);
 
+	ath12k_coredump_collect(ab);
 	ath12k_core_pre_reconfigure_recovery(ab);
 
 	reinit_completion(&ab->reconfigure_complete);
@@ -1220,6 +1221,7 @@  struct ath12k_base *ath12k_core_alloc(struct device *dev, size_t priv_size,
 	INIT_WORK(&ab->restart_work, ath12k_core_restart);
 	INIT_WORK(&ab->reset_work, ath12k_core_reset);
 	INIT_WORK(&ab->rfkill_work, ath12k_rfkill_work);
+	INIT_WORK(&ab->dump_work, ath12k_coredump_upload);
 
 	timer_setup(&ab->rx_replenish_retry, ath12k_ce_rx_replenish_retry, 0);
 	init_completion(&ab->htc_suspend);
diff --git a/drivers/net/wireless/ath/ath12k/core.h b/drivers/net/wireless/ath/ath12k/core.h
index 97e5a0ccd233..736958989794 100644
--- a/drivers/net/wireless/ath/ath12k/core.h
+++ b/drivers/net/wireless/ath/ath12k/core.h
@@ -26,6 +26,7 @@ 
 #include "reg.h"
 #include "dbring.h"
 #include "fw.h"
+#include "coredump.h"
 
 #define SM(_v, _f) (((_v) << _f##_LSB) & _f##_MASK)
 
@@ -700,6 +701,10 @@  struct ath12k_base {
 	/* HW channel counters frequency value in hertz common to all MACs */
 	u32 cc_freq_hz;
 
+	struct ath12k_dump_file_data *dump_data;
+	size_t ath12k_coredump_len;
+	struct work_struct dump_work;
+
 	struct ath12k_htc htc;
 
 	struct ath12k_dp dp;
diff --git a/drivers/net/wireless/ath/ath12k/coredump.c b/drivers/net/wireless/ath/ath12k/coredump.c
new file mode 100644
index 000000000000..72d675d15e64
--- /dev/null
+++ b/drivers/net/wireless/ath/ath12k/coredump.c
@@ -0,0 +1,51 @@ 
+// SPDX-License-Identifier: BSD-3-Clause-Clear
+/*
+ * Copyright (c) 2020 The Linux Foundation. All rights reserved.
+ * Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+#include <linux/devcoredump.h>
+#include "hif.h"
+#include "coredump.h"
+#include "debug.h"
+
+enum
+ath12k_fw_crash_dump_type ath12k_coredump_get_dump_type(enum ath12k_qmi_target_mem type)
+{
+	enum ath12k_fw_crash_dump_type dump_type;
+
+	switch (type) {
+	case HOST_DDR_REGION_TYPE:
+		dump_type = FW_CRASH_DUMP_REMOTE_MEM_DATA;
+		break;
+	case M3_DUMP_REGION_TYPE:
+		dump_type = FW_CRASH_DUMP_M3_DUMP;
+		break;
+	case PAGEABLE_MEM_REGION_TYPE:
+		dump_type = FW_CRASH_DUMP_PAGEABLE_DATA;
+		break;
+	case BDF_MEM_REGION_TYPE:
+	case CALDB_MEM_REGION_TYPE:
+		dump_type = FW_CRASH_DUMP_NONE;
+		break;
+	default:
+		dump_type = FW_CRASH_DUMP_TYPE_MAX;
+		break;
+	}
+
+	return dump_type;
+}
+
+void ath12k_coredump_upload(struct work_struct *work)
+{
+	struct ath12k_base *ab = container_of(work, struct ath12k_base, dump_work);
+
+	ath12k_info(ab, "Uploading coredump\n");
+	/* dev_coredumpv() takes ownership of the buffer */
+	dev_coredumpv(ab->dev, ab->dump_data, ab->ath12k_coredump_len, GFP_KERNEL);
+	ab->dump_data = NULL;
+}
+
+void ath12k_coredump_collect(struct ath12k_base *ab)
+{
+	ath12k_hif_coredump_download(ab);
+}
diff --git a/drivers/net/wireless/ath/ath12k/coredump.h b/drivers/net/wireless/ath/ath12k/coredump.h
new file mode 100644
index 000000000000..5d6003b1c12d
--- /dev/null
+++ b/drivers/net/wireless/ath/ath12k/coredump.h
@@ -0,0 +1,80 @@ 
+/* SPDX-License-Identifier: BSD-3-Clause-Clear */
+/*
+ * Copyright (c) 2020 The Linux Foundation. All rights reserved.
+ * Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+#ifndef _ATH12K_COREDUMP_H_
+#define _ATH12K_COREDUMP_H_
+
+#define ATH12K_FW_CRASH_DUMP_V2      2
+
+enum ath12k_fw_crash_dump_type {
+	FW_CRASH_DUMP_PAGING_DATA,
+	FW_CRASH_DUMP_RDDM_DATA,
+	FW_CRASH_DUMP_REMOTE_MEM_DATA,
+	FW_CRASH_DUMP_PAGEABLE_DATA,
+	FW_CRASH_DUMP_M3_DUMP,
+	FW_CRASH_DUMP_NONE,
+
+	/* keep last */
+	FW_CRASH_DUMP_TYPE_MAX,
+};
+
+#define COREDUMP_TLV_HDR_SIZE 8
+
+struct ath12k_tlv_dump_data {
+	/* see ath11k_fw_crash_dump_type above */
+	__le32 type;
+
+	/* in bytes */
+	__le32 tlv_len;
+
+	/* pad to 32-bit boundaries as needed */
+	u8 tlv_data[];
+} __packed;
+
+struct ath12k_dump_file_data {
+	/* "ATH12K-FW-DUMP" */
+	char df_magic[16];
+	/* total dump len in bytes */
+	__le32 len;
+	/* file dump version */
+	__le32 version;
+	/* pci device id */
+	__le32 chip_id;
+	/* qrtr instance id */
+	__le32 qrtr_id;
+	/* pci domain id */
+	__le32 bus_id;
+	guid_t guid;
+	/* time-of-day stamp */
+	__le64 tv_sec;
+	/* time-of-day stamp, nano-seconds */
+	__le64 tv_nsec;
+	/* room for growth w/out changing binary format */
+	u8 unused[128];
+	u8 data[];
+} __packed;
+
+#ifdef CONFIG_ATH12K_COREDUMP
+enum ath12k_fw_crash_dump_type ath12k_coredump_get_dump_type
+						(enum ath12k_qmi_target_mem type);
+void ath12k_coredump_upload(struct work_struct *work);
+void ath12k_coredump_collect(struct ath12k_base *ab);
+#else
+static inline enum ath12k_fw_crash_dump_type ath12k_coredump_get_dump_type
+							(enum ath12k_qmi_target_mem type)
+{
+	return FW_CRASH_DUMP_TYPE_MAX;
+}
+
+static inline void ath12k_coredump_upload(struct work_struct *work)
+{
+}
+
+static inline void ath12k_coredump_collect(struct ath12k_base *ab)
+{
+}
+#endif
+
+#endif
diff --git a/drivers/net/wireless/ath/ath12k/hif.h b/drivers/net/wireless/ath/ath12k/hif.h
index c653ca1f59b2..94fb8a9dbdf3 100644
--- a/drivers/net/wireless/ath/ath12k/hif.h
+++ b/drivers/net/wireless/ath/ath12k/hif.h
@@ -1,7 +1,7 @@ 
 /* SPDX-License-Identifier: BSD-3-Clause-Clear */
 /*
  * Copyright (c) 2019-2021 The Linux Foundation. All rights reserved.
- * Copyright (c) 2021-2023 Qualcomm Innovation Center, Inc. All rights reserved.
+ * Copyright (c) 2021-2024 Qualcomm Innovation Center, Inc. All rights reserved.
  */
 
 #ifndef ATH12K_HIF_H
@@ -30,6 +30,7 @@  struct ath12k_hif_ops {
 	void (*ce_irq_enable)(struct ath12k_base *ab);
 	void (*ce_irq_disable)(struct ath12k_base *ab);
 	void (*get_ce_msi_idx)(struct ath12k_base *ab, u32 ce_id, u32 *msi_idx);
+	void (*coredump_download)(struct ath12k_base *ab);
 };
 
 static inline int ath12k_hif_map_service_to_pipe(struct ath12k_base *ab, u16 service_id,
@@ -141,4 +142,10 @@  static inline void ath12k_hif_power_down(struct ath12k_base *ab)
 	ab->hif.ops->power_down(ab);
 }
 
+static inline void ath12k_hif_coredump_download(struct ath12k_base *ab)
+{
+	if (ab->hif.ops->coredump_download)
+		ab->hif.ops->coredump_download(ab);
+}
+
 #endif /* ATH12K_HIF_H */
diff --git a/drivers/net/wireless/ath/ath12k/hw.c b/drivers/net/wireless/ath/ath12k/hw.c
index 0b17dfd47856..6be5ce39ff8f 100644
--- a/drivers/net/wireless/ath/ath12k/hw.c
+++ b/drivers/net/wireless/ath/ath12k/hw.c
@@ -912,7 +912,7 @@  static const struct ath12k_hw_params ath12k_hw_params[] = {
 		.rfkill_cfg = 0,
 		.rfkill_on_level = 0,
 
-		.rddm_size = 0,
+		.rddm_size = 0x600000,
 
 		.def_num_link = 0,
 		.max_mlo_peer = 256,
@@ -1053,7 +1053,7 @@  static const struct ath12k_hw_params ath12k_hw_params[] = {
 		.rfkill_cfg = 0,
 		.rfkill_on_level = 0,
 
-		.rddm_size = 0,
+		.rddm_size = 0x600000,
 
 		.def_num_link = 0,
 		.max_mlo_peer = 256,
diff --git a/drivers/net/wireless/ath/ath12k/mhi.c b/drivers/net/wireless/ath/ath12k/mhi.c
index adb8c3ec1950..2642eae40cf2 100644
--- a/drivers/net/wireless/ath/ath12k/mhi.c
+++ b/drivers/net/wireless/ath/ath12k/mhi.c
@@ -668,3 +668,8 @@  void ath12k_mhi_resume(struct ath12k_pci *ab_pci)
 {
 	ath12k_mhi_set_state(ab_pci, ATH12K_MHI_RESUME);
 }
+
+void ath12k_mhi_coredump(struct mhi_controller *mhi_ctrl, bool in_panic)
+{
+	mhi_download_rddm_image(mhi_ctrl, in_panic);
+}
diff --git a/drivers/net/wireless/ath/ath12k/mhi.h b/drivers/net/wireless/ath/ath12k/mhi.h
index ebc23640ce7a..3c9268824ca5 100644
--- a/drivers/net/wireless/ath/ath12k/mhi.h
+++ b/drivers/net/wireless/ath/ath12k/mhi.h
@@ -1,7 +1,7 @@ 
 /* SPDX-License-Identifier: BSD-3-Clause-Clear */
 /*
  * Copyright (c) 2020-2021 The Linux Foundation. All rights reserved.
- * Copyright (c) 2021-2022 Qualcomm Innovation Center, Inc. All rights reserved.
+ * Copyright (c) 2021-2022, 2024 Qualcomm Innovation Center, Inc. All rights reserved.
  */
 #ifndef _ATH12K_MHI_H
 #define _ATH12K_MHI_H
@@ -42,5 +42,5 @@  void ath12k_mhi_clear_vector(struct ath12k_base *ab);
 
 void ath12k_mhi_suspend(struct ath12k_pci *ar_pci);
 void ath12k_mhi_resume(struct ath12k_pci *ar_pci);
-
+void ath12k_mhi_coredump(struct mhi_controller *mhi_ctrl, bool in_panic);
 #endif
diff --git a/drivers/net/wireless/ath/ath12k/pci.c b/drivers/net/wireless/ath/ath12k/pci.c
index 14954bc05144..5f45c77fda38 100644
--- a/drivers/net/wireless/ath/ath12k/pci.c
+++ b/drivers/net/wireless/ath/ath12k/pci.c
@@ -7,6 +7,7 @@ 
 #include <linux/module.h>
 #include <linux/msi.h>
 #include <linux/pci.h>
+#include <linux/time.h>
 
 #include "pci.h"
 #include "core.h"
@@ -1240,6 +1241,186 @@  void ath12k_pci_write32(struct ath12k_base *ab, u32 offset, u32 value)
 		ab_pci->pci_ops->release(ab);
 }
 
+#ifdef CONFIG_ATH12K_COREDUMP
+static int ath12k_pci_coredump_calculate_size(struct ath12k_base *ab, u32 *dump_seg_sz)
+{
+	struct ath12k_pci *ab_pci = ath12k_pci_priv(ab);
+	struct mhi_controller *mhi_ctrl = ab_pci->mhi_ctrl;
+	struct image_info *rddm_img, *fw_img;
+	struct ath12k_tlv_dump_data *dump_tlv;
+	enum ath12k_fw_crash_dump_type mem_type;
+	u32 len = 0, rddm_tlv_sz = 0, paging_tlv_sz = 0;
+	struct ath12k_dump_file_data *file_data;
+	int i;
+
+	rddm_img = mhi_ctrl->rddm_image;
+	if (!rddm_img) {
+		ath12k_err(ab, "No RDDM dump found\n");
+		return 0;
+	}
+
+	fw_img = mhi_ctrl->fbc_image;
+
+	for (i = 0; i < fw_img->entries ; i++) {
+		if (!fw_img->mhi_buf[i].buf)
+			continue;
+
+		paging_tlv_sz += fw_img->mhi_buf[i].len;
+	}
+	dump_seg_sz[FW_CRASH_DUMP_PAGING_DATA] = paging_tlv_sz;
+
+	for (i = 0; i < rddm_img->entries; i++) {
+		if (!rddm_img->mhi_buf[i].buf)
+			continue;
+
+		rddm_tlv_sz += rddm_img->mhi_buf[i].len;
+	}
+	dump_seg_sz[FW_CRASH_DUMP_RDDM_DATA] = rddm_tlv_sz;
+
+	for (i = 0; i < ab->qmi.mem_seg_count; i++) {
+		mem_type = ath12k_coredump_get_dump_type(ab->qmi.target_mem[i].type);
+
+		if (mem_type == FW_CRASH_DUMP_NONE)
+			continue;
+
+		if (mem_type == FW_CRASH_DUMP_TYPE_MAX) {
+			ath12k_dbg(ab, ATH12K_DBG_PCI,
+				   "target mem region type %d not supported",
+				   ab->qmi.target_mem[i].type);
+			continue;
+		}
+
+		if (!ab->qmi.target_mem[i].paddr)
+			continue;
+
+		dump_seg_sz[mem_type] += ab->qmi.target_mem[i].size;
+	}
+
+	for (i = 0; i < FW_CRASH_DUMP_TYPE_MAX; i++) {
+		if (!dump_seg_sz[i])
+			continue;
+
+		len += sizeof(*dump_tlv) + dump_seg_sz[i];
+	}
+
+	if (len)
+		len += sizeof(*file_data);
+
+	return len;
+}
+
+static void ath12k_pci_coredump_download(struct ath12k_base *ab)
+{
+	struct ath12k_pci *ab_pci = ath12k_pci_priv(ab);
+	struct mhi_controller *mhi_ctrl = ab_pci->mhi_ctrl;
+	struct image_info *rddm_img, *fw_img;
+	struct timespec64 timestamp;
+	int i, len, mem_idx;
+	enum ath12k_fw_crash_dump_type mem_type;
+	struct ath12k_dump_file_data *file_data;
+	struct ath12k_tlv_dump_data *dump_tlv;
+	size_t hdr_len = sizeof(*file_data);
+	void *buf;
+	u32 dump_seg_sz[FW_CRASH_DUMP_TYPE_MAX] = { 0 };
+
+	ath12k_mhi_coredump(mhi_ctrl, false);
+
+	len = ath12k_pci_coredump_calculate_size(ab, dump_seg_sz);
+	if (!len) {
+		ath12k_warn(ab, "No crash dump data found for devcoredump");
+		return;
+	}
+
+	rddm_img = mhi_ctrl->rddm_image;
+	fw_img = mhi_ctrl->fbc_image;
+
+	/* dev_coredumpv() requires vmalloc data */
+	buf = vzalloc(len);
+	if (!buf)
+		return;
+
+	ab->dump_data = buf;
+	ab->ath12k_coredump_len = len;
+	file_data = ab->dump_data;
+	strscpy(file_data->df_magic, "ATH12K-FW-DUMP", sizeof(file_data->df_magic));
+	file_data->len = cpu_to_le32(len);
+	file_data->version = cpu_to_le32(ATH12K_FW_CRASH_DUMP_V2);
+	file_data->chip_id = cpu_to_le32(ab_pci->dev_id);
+	file_data->qrtr_id = cpu_to_le32(ab_pci->ab->qmi.service_ins_id);
+	file_data->bus_id = cpu_to_le32(pci_domain_nr(ab_pci->pdev->bus));
+	guid_gen(&file_data->guid);
+	ktime_get_real_ts64(&timestamp);
+	file_data->tv_sec = cpu_to_le64(timestamp.tv_sec);
+	file_data->tv_nsec = cpu_to_le64(timestamp.tv_nsec);
+	buf += hdr_len;
+	dump_tlv = buf;
+	dump_tlv->type = cpu_to_le32(FW_CRASH_DUMP_PAGING_DATA);
+	dump_tlv->tlv_len = cpu_to_le32(dump_seg_sz[FW_CRASH_DUMP_PAGING_DATA]);
+	buf += COREDUMP_TLV_HDR_SIZE;
+
+	/* append all segments together as they are all part of a single contiguous
+	 * block of memory
+	 */
+	for (i = 0; i < fw_img->entries ; i++) {
+		if (!fw_img->mhi_buf[i].buf)
+			continue;
+
+		memcpy_fromio(buf, (void const __iomem *)fw_img->mhi_buf[i].buf,
+			      fw_img->mhi_buf[i].len);
+		buf += fw_img->mhi_buf[i].len;
+	}
+
+	dump_tlv = buf;
+	dump_tlv->type = cpu_to_le32(FW_CRASH_DUMP_RDDM_DATA);
+	dump_tlv->tlv_len = cpu_to_le32(dump_seg_sz[FW_CRASH_DUMP_RDDM_DATA]);
+	buf += COREDUMP_TLV_HDR_SIZE;
+
+	/* append all segments together as they are all part of a single contiguous
+	 * block of memory
+	 */
+	for (i = 0; i < rddm_img->entries; i++) {
+		if (!rddm_img->mhi_buf[i].buf)
+			continue;
+
+		memcpy_fromio(buf, (void const __iomem *)rddm_img->mhi_buf[i].buf,
+			      rddm_img->mhi_buf[i].len);
+		buf += rddm_img->mhi_buf[i].len;
+	}
+
+	mem_idx = FW_CRASH_DUMP_REMOTE_MEM_DATA;
+	for (; mem_idx < FW_CRASH_DUMP_TYPE_MAX; mem_idx++) {
+		if (mem_idx == FW_CRASH_DUMP_NONE)
+			continue;
+
+		dump_tlv = buf;
+		dump_tlv->type = cpu_to_le32(mem_idx);
+		dump_tlv->tlv_len = cpu_to_le32(dump_seg_sz[mem_idx]);
+		buf += COREDUMP_TLV_HDR_SIZE;
+
+		for (i = 0; i < ab->qmi.mem_seg_count; i++) {
+			mem_type = ath12k_coredump_get_dump_type
+							(ab->qmi.target_mem[i].type);
+
+			if (mem_type != mem_idx)
+				continue;
+
+			if (!ab->qmi.target_mem[i].paddr) {
+				ath12k_dbg(ab, ATH12K_DBG_PCI,
+					   "Skipping mem region type %d",
+					   ab->qmi.target_mem[i].type);
+				continue;
+			}
+
+			memcpy_fromio(buf, ab->qmi.target_mem[i].v.ioaddr,
+				      ab->qmi.target_mem[i].size);
+			buf += ab->qmi.target_mem[i].size;
+		}
+	}
+
+	queue_work(ab->workqueue, &ab->dump_work);
+}
+#endif
+
 int ath12k_pci_power_up(struct ath12k_base *ab)
 {
 	struct ath12k_pci *ab_pci = ath12k_pci_priv(ab);
@@ -1302,6 +1483,9 @@  static const struct ath12k_hif_ops ath12k_pci_hif_ops = {
 	.ce_irq_enable = ath12k_pci_hif_ce_irq_enable,
 	.ce_irq_disable = ath12k_pci_hif_ce_irq_disable,
 	.get_ce_msi_idx = ath12k_pci_get_ce_msi_idx,
+#ifdef CONFIG_ATH12K_COREDUMP
+	.coredump_download = ath12k_pci_coredump_download,
+#endif
 };
 
 static
@@ -1511,6 +1695,7 @@  static void ath12k_pci_remove(struct pci_dev *pdev)
 	set_bit(ATH12K_FLAG_UNREGISTERING, &ab->dev_flags);
 
 	cancel_work_sync(&ab->reset_work);
+	cancel_work_sync(&ab->dump_work);
 	ath12k_core_deinit(ab);
 
 qmi_fail: