Message ID | 1676978713-7394-1-git-send-email-quic_mojha@quicinc.com (mailing list archive) |
---|---|
Series | Add basic Minidump kernel driver support |
On Tue, Feb 21, 2023 at 04:55:07PM +0530, Mukesh Ojha wrote:
> Minidump is a best effort mechanism to collect useful and predefined data
> for first level of debugging on end user devices running on Qualcomm SoCs.
> It is built on the premise that a System on Chip (SoC), or a subsystem that
> is part of the SoC, crashes due to a range of hardware and software bugs.
> Hence, the ability to collect accurate data is only a best-effort. The data
> collected could be invalid or corrupted, data collection itself could fail,
> and so on.
>
> Qualcomm devices in engineering mode provide a mechanism for generating
> full system ramdumps for post mortem debugging. However, in some cases it
> is not feasible to capture the entire content of RAM. The minidump
> mechanism provides the means for selecting which snippets should be
> included in the ramdump.
>
> The core of the minidump feature is part of Qualcomm's boot firmware code.
> It initializes shared memory (SMEM), which is a part of DDR, and
> allocates a small section of SMEM to the minidump table, also called the
> global table of contents (G-ToC). Each subsystem (APSS, ADSP, ...) has
> its own table of segments to be included in the minidump, and all get
> their reference from the G-ToC. Each segment/region has some details like
> name, physical address and its size, and it could be scattered anywhere
> in the DDR.
>
> The existing upstream Qualcomm remoteproc driver[1] already supports the
> minidump feature for remoteproc instances like ADSP, MODEM, ... where
> predefined selective segments of a subsystem region can be dumped as part
> of coredump collection, which generates smaller artifacts compared to a
> complete coredump of the subsystem on crash.
>
> [1]
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/remoteproc/qcom_common.c#n142
>
> In addition to managing and querying the APSS minidump description,
> the Linux driver maintains an ELF header in a segment. This segment
> gets updated with section/program header whenever a new entry gets
> registered.

I'd like to test this series plus your series that sets the multiple
download modes. Can you include documentation about how to actually use
this new feature? Also the information that you provided above is really
useful. I think that should also go in the documentation file as well.

I already have a reliable way to make a board go BOOM and go into
ramdump mode.

Brian
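As a reading aid for the cover letter quoted above: the G-ToC can be pictured as a table with one slot per subsystem, each slot holding that subsystem's list of regions (name, physical address, size). The following is a minimal userspace C sketch of that shape only. All names (`md_region`, `md_subsystem`, `md_global_toc`, `md_add_region`, `md_subsystem_bytes`) and the three capacity limits are hypothetical and chosen for illustration; the real layout is defined by the boot firmware and by the structures in the series, and it is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MD_MAX_NAME    16 /* hypothetical limit on region name length */
#define MD_MAX_REGIONS 8  /* hypothetical per-subsystem capacity */
#define MD_MAX_SUBSYS  4  /* hypothetical number of subsystem slots */

/* One segment to include in the minidump: name, physical address, size. */
struct md_region {
    char     name[MD_MAX_NAME];
    uint64_t phys_addr;
    uint64_t size;
};

/* Per-subsystem table of segments (for example APSS or ADSP). */
struct md_subsystem {
    uint32_t         region_count;
    struct md_region regions[MD_MAX_REGIONS];
};

/* Global table of contents: one slot per subsystem. */
struct md_global_toc {
    struct md_subsystem subsystems[MD_MAX_SUBSYS];
};

/*
 * Append a region to subsystem `ss`.
 * Returns 0 on success, -1 if `ss` is out of range, `size` is zero,
 * or the subsystem table is already full.
 */
int md_add_region(struct md_global_toc *toc, unsigned int ss,
                  const char *name, uint64_t phys_addr, uint64_t size)
{
    struct md_subsystem *sub;
    struct md_region *r;

    assert(toc != NULL && name != NULL);

    if (ss >= MD_MAX_SUBSYS || size == 0)
        return -1;

    sub = &toc->subsystems[ss];
    if (sub->region_count >= MD_MAX_REGIONS)
        return -1;

    r = &sub->regions[sub->region_count++];
    strncpy(r->name, name, MD_MAX_NAME - 1);
    r->name[MD_MAX_NAME - 1] = '\0'; /* always NUL-terminate */
    r->phys_addr = phys_addr;
    r->size = size;
    return 0;
}

/* Sum of the sizes of all regions registered for subsystem `ss`. */
uint64_t md_subsystem_bytes(const struct md_global_toc *toc, unsigned int ss)
{
    uint64_t total = 0;
    uint32_t i;

    assert(toc != NULL);

    if (ss >= MD_MAX_SUBSYS)
        return 0;

    for (i = 0; i < toc->subsystems[ss].region_count; i++)
        total += toc->subsystems[ss].regions[i].size;
    return total;
}
```

A caller would zero a `struct md_global_toc`, register regions with `md_add_region()`, and use `md_subsystem_bytes()` to see how much memory a dump of that subsystem would cover. Fixed-size arrays keep the sketch simple; per the cover letter, the actual table lives in SMEM and the regions it points at can be scattered anywhere in DDR.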
Thanks Brian for your interest in this series.

On 2/23/2023 6:07 PM, Brian Masney wrote:
> On Tue, Feb 21, 2023 at 04:55:07PM +0530, Mukesh Ojha wrote:
>> Minidump is a best effort mechanism to collect useful and predefined data
>> for first level of debugging on end user devices running on Qualcomm SoCs.
>> [...]
>
> I'd like to test this series plus your series that sets the multiple
> download modes.

Sure, you are welcome, but for that you need a device running a Qualcomm
SoC that has upstream support.

Also, testing this patch needs some minimal out-of-tree patches, and
I can help you with that.

> Can you include documentation about how to actually use
> this new feature?

Will surely do. Since this is still an RFC, I am unsure where it should
live in the Documentation directory.

> Also the information that you provided above is really
> useful. I think that should also go in the documentation file as well.
>
> I already have a reliable way to make a board go BOOM and go into
> ramdump mode.

That's very nice to hear, but could you specify your target?

-Mukesh

>
> Brian
>
On 2/24/2023 2:40 AM, Mukesh Ojha wrote:
> Thanks Brian for your interest in this series.
>
> On 2/23/2023 6:07 PM, Brian Masney wrote:
>> On Tue, Feb 21, 2023 at 04:55:07PM +0530, Mukesh Ojha wrote:
>>> Minidump is a best effort mechanism to collect useful and predefined
>>> data
>>> for first level of debugging on end user devices running on Qualcomm
>>> SoCs.
>>> [...]
>>
>> I'd like to test this series plus your series that sets the multiple
>> download modes.
>
> Sure, you are welcome, but for that you need a device running with
> Qualcomm SoC and if it has a upstream support.
>
> Also, testing of this patch needs some minimal out of tree patches and
> i can help you with that.
>
>> Can you include documentation about how to actually use
>> this new feature?
>
> Will surely do, Since this is still RFC, and i am doubtful on the path
> of it in documentation directory.

This is an RFC anyway, so you can start with whichever directory you think
fits best. The point is to have the documentation file, not to settle
the path. You can start with Documentation/features/debug and let's see
whether others have any suggestions.

Please add a file in your next revision without worrying about the path
for now.

---Trilok Soni
Hi Mukesh,

On Fri, Feb 24, 2023 at 04:10:42PM +0530, Mukesh Ojha wrote:
> On 2/23/2023 6:07 PM, Brian Masney wrote:
> > I'd like to test this series plus your series that sets the multiple
> > download modes.
>
> Sure, you are welcome, but for that you need a device running with Qualcomm
> SoC and if it has a upstream support.

I will be testing this series on a sa8540p (QDrive3 Automotive
Development Board), which has the sc8280xp SoC with good upstream
support. This is also the same board that I have a reliable way to
make the board crash due to a known firmware bug.

> Also, testing of this patch needs some minimal out of tree patches and
> i can help you with that.

Yup, that's fine. Hopefully we can also work to get those dependencies
merged upstream as well.

Brian
On 2/25/2023 12:36 AM, Brian Masney wrote:
> Hi Mukesh,
>
> On Fri, Feb 24, 2023 at 04:10:42PM +0530, Mukesh Ojha wrote:
>> On 2/23/2023 6:07 PM, Brian Masney wrote:
>>> I'd like to test this series plus your series that sets the multiple
>>> download modes.
>>
>> Sure, you are welcome, but for that you need a device running with Qualcomm
>> SoC and if it has a upstream support.
>
> I will be testing this series on a sa8540p (QDrive3 Automotive
> Development Board), which has the sc8280xp SoC with good upstream
> support. This is also the same board that I have a reliable way to
> make the board crash due to a known firmware bug.
>

Can you try the patch below, which just selects the minidump download
mode, and then make the device crash?

--------------------------------------->8-------------------------------

diff --git a/arch/arm64/boot/dts/qcom/sc8280xp.dtsi b/arch/arm64/boot/dts/qcom/sc8280xp.dtsi
index 0d02599..bd8e1a8 100644
--- a/arch/arm64/boot/dts/qcom/sc8280xp.dtsi
+++ b/arch/arm64/boot/dts/qcom/sc8280xp.dtsi
@@ -280,6 +280,7 @@
 	firmware {
 		scm: scm {
 			compatible = "qcom,scm-sc8280xp", "qcom,scm";
+			qcom,dload-mode = <&tcsr 0x13000>;
 		};
 	};

diff --git a/drivers/firmware/qcom_scm.c b/drivers/firmware/qcom_scm.c
index cdbfe54..e1539a2 100644
--- a/drivers/firmware/qcom_scm.c
+++ b/drivers/firmware/qcom_scm.c
@@ -20,7 +20,7 @@

 #include "qcom_scm.h"

-static bool download_mode = IS_ENABLED(CONFIG_QCOM_SCM_DOWNLOAD_MODE_DEFAULT);
+static bool download_mode = true;
 module_param(download_mode, bool, 0);

 #define SCM_HAS_CORE_CLK	BIT(0)
@@ -427,7 +427,7 @@ static void qcom_scm_set_download_mode(bool enable)
 		ret = __qcom_scm_set_dload_mode(__scm->dev, enable);
 	} else if (__scm->dload_mode_addr) {
 		ret = qcom_scm_io_writel(__scm->dload_mode_addr,
-				enable ? QCOM_SCM_BOOT_SET_DLOAD_MODE : 0);
+				enable ? 0x20 : 0);
 	} else {
 		dev_err(__scm->dev,
 			"No available mechanism for setting download mode\n");

>> Also, testing of this patch needs some minimal out of tree patches and
>> i can help you with that.
>
> Yup, that's fine. Hopefully we can also work to get those dependencies
> merged upstream as well.
>
> Brian
>
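The test patch above replaces `QCOM_SCM_BOOT_SET_DLOAD_MODE` with a literal `0x20` to select minidump mode. A hedged C sketch of that selection follows. The enum, the function names, and the plain pointer standing in for the TCSR register are hypothetical and not part of the series; `0x20` for minidump comes from the patch, while `0x10` for a full dump is an assumption about the value of `QCOM_SCM_BOOT_SET_DLOAD_MODE` and should be checked against the tree being tested.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mode selector; not part of the series. */
enum dload_mode {
    DLOAD_OFF,
    DLOAD_FULLDUMP,
    DLOAD_MINIDUMP,
};

#define DLOAD_VAL_FULLDUMP 0x10u /* assumption: QCOM_SCM_BOOT_SET_DLOAD_MODE */
#define DLOAD_VAL_MINIDUMP 0x20u /* literal used in the test patch */

/* Map a mode to the value written to the download-mode register. */
uint32_t dload_mode_value(enum dload_mode mode)
{
    switch (mode) {
    case DLOAD_FULLDUMP:
        return DLOAD_VAL_FULLDUMP;
    case DLOAD_MINIDUMP:
        return DLOAD_VAL_MINIDUMP;
    case DLOAD_OFF:
    default:
        return 0;
    }
}

/* Write the mode to a register; `reg` stands in for the TCSR address. */
void dload_mode_set(volatile uint32_t *reg, enum dload_mode mode)
{
    assert(reg != NULL);
    *reg = dload_mode_value(mode);
}
```

Naming the modes instead of hard-coding `0x20` makes it possible to switch between a full ramdump and a minidump without editing the write site, which is roughly what a series adding multiple download modes would need to provide.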
Friendly review reminder.

-Mukesh

On 2/21/2023 4:55 PM, Mukesh Ojha wrote:
> Minidump is a best effort mechanism to collect useful and predefined data
> for first level of debugging on end user devices running on Qualcomm SoCs.
> [...]
>
> Patch 1/6 is a very trivial change.
> Patch 2/6 moves the minidump-specific data structures and macros to
>  qcom_minidump.h so that the minidump driver (3/6) can use them.
> Patch 3/6 implements the Qualcomm minidump kernel driver and exports
>  symbols which other minidump kernel clients can use.
> Patch 4/6 enables the Qualcomm minidump driver.
> Patch 5/6 uses the exported symbol from the minidump driver in qcom_common
>  for querying the minidump descriptor for a subsystem.
> Patch 6/6 registers the pstore region with minidump.
>
> Testing of the patches has been done on the sm8450 target with the help
> of an out-of-tree patch that sets the download mode and storage type
> (on which the dump will be saved), for which I will send a separate series.
>
> Mukesh Ojha (6):
>   remoteproc: qcom: Expand MD_* as MINIDUMP_*
>   remoteproc: qcom: Move minidump specific data to qcom_minidump.h
>   soc: qcom: Add Qualcomm minidump kernel driver
>   arm64: defconfig: Enable Qualcomm minidump driver
>   remoterproc: qcom: refactor to leverage exported minidump symbol
>   pstore/ram: Register context with minidump
>
>  arch/arm64/configs/defconfig     |   1 +
>  drivers/remoteproc/qcom_common.c |  75 +-----
>  drivers/soc/qcom/Kconfig         |  14 ++
>  drivers/soc/qcom/Makefile        |   1 +
>  drivers/soc/qcom/qcom_minidump.c | 490 +++++++++++++++++++++++++++++++++++++++++
>  fs/pstore/ram.c                  |  77 ++++++
>  include/soc/qcom/minidump.h      |  40 ++++
>  include/soc/qcom/qcom_minidump.h |  88 +++++++
>  8 files changed, 717 insertions(+), 69 deletions(-)
>  create mode 100644 drivers/soc/qcom/qcom_minidump.c
>  create mode 100644 include/soc/qcom/minidump.h
>  create mode 100644 include/soc/qcom/qcom_minidump.h
>
On Mon, Mar 06, 2023 at 08:58:04PM +0530, Mukesh Ojha wrote:
> Friendly review reminder..
It is a few hours after the merge window closed, please be patient.
And to help out, please review other submissions to reduce the review
load on maintainers. To not do that is just asking for others to do
work for you without any help, right?
thanks,
greg k-h
On Mon, Feb 27, 2023 at 03:45:31PM +0530, Mukesh Ojha wrote:
> On 2/25/2023 12:36 AM, Brian Masney wrote:
> > I will be testing this series on a sa8540p (QDrive3 Automotive
> > Development Board), which has the sc8280xp SoC with good upstream
> > support. This is also the same board that I have a reliable way to
> > make the board crash due to a known firmware bug.
>
> Can you try below patch to just select minidump download mode and make the
> device crash ?
>
> [...]

Hi Mukesh,

I tried to test this series, but I don't know how to actually use the
minidump feature that it adds. Some more documentation is needed.

I added this series, plus your other series that adds the download modes
to the SCM driver, to my tree, along with your changes above. I
downgraded the firmware on my sa8540p and I have my reproducible crash.
Linux immediately loses control and the board firmware takes over.

I assumed that I'd need to do a warm reboot so that the DDR contents are
still present and Linux can grab the memory contents on the next boot.
However, 'fastboot devices' shows no devices, so I can't reboot that way.
I can do a cold boot, but the DDR contents will be lost.

Also, this series needs to be rebased against 6.3-rc1.

Brian