Message ID | 57ff1e429a7fc12300a3cca0c5e9a637beed0d3a.1518619058.git.msuchanek@suse.de (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On February 14, 2018 6:38:58 AM PST, Michal Suchanek <msuchanek@suse.de> wrote: >The previous patch does reset during hardware error so make the reset >progress more visible. Based on your previous email it looks like this can happen quite frequently so we might be spamming the kernel log with such reset messages. Turning this into a debug print would not be great either, how about a custom sysfs attribute counting the number of times a reset was done? We should ideally root cause this but I am sure we can.
On February 14, 2018 6:38:58 AM PST, Michal Suchanek <msuchanek@suse.de> wrote: >The previous patch does reset during hardware error so make the reset >progress more visible. Based on your previous email it looks like this can happen quite frequently so we might be spamming the kernel log with such reset messages. Turning this into a debug print would not be great either, how about a custom sysfs attribute counting the number of times a reset was done? We should ideally root cause this but I am sure we can.
On Wed, 14 Feb 2018 08:50:16 -0800 Florian Fainelli <f.fainelli@gmail.com> wrote: > On February 14, 2018 6:38:58 AM PST, Michal Suchanek > <msuchanek@suse.de> wrote: > >The previous patch does reset during hardware error so make the reset > >progress more visible. > > Based on your previous email it looks like this can happen quite > frequently so we might be spamming the kernel log with such reset > messages. Turning this into a debug print would not be great either, > how about a custom sysfs attribute counting the number of times a > reset was done? Since every such message happens when the system stalls for like half a minute I don't think there will be that many until somebody notices something is amiss. It might be also helpful in diagnosing if other cards lock up in different way - for me the DMA shutdown is short so I guess it's the mmc host that is locked up and the DMA engine is fine. It might look differently on different systems, though. I understand that adding messages it somewhat controversial so I added them in separate patch. Thanks Michal
Hi Michal, > Michal Suchánek <msuchanek@suse.de> hat am 14. Februar 2018 um 20:47 geschrieben: > > > On Wed, 14 Feb 2018 08:50:16 -0800 > Florian Fainelli <f.fainelli@gmail.com> wrote: > > > On February 14, 2018 6:38:58 AM PST, Michal Suchanek > > <msuchanek@suse.de> wrote: > > >The previous patch does reset during hardware error so make the reset > > >progress more visible. > > > > Based on your previous email it looks like this can happen quite > > frequently so we might be spamming the kernel log with such reset > > messages. Turning this into a debug print would not be great either, > > how about a custom sysfs attribute counting the number of times a > > reset was done? > > Since every such message happens when the system stalls for like half a > minute I don't think there will be that many until somebody notices > something is amiss. It might be also helpful in diagnosing if other > cards lock up in different way - for me the DMA shutdown is short so I > guess it's the mmc host that is locked up and the DMA engine is fine. > It might look differently on different systems, though. FWIW according to your dmesg your RPi doesn't use the DMA engine: [ 5.004609] sdhost-bcm2835 3f202000.sdhost: unable to initialise DMA channel. Falling back to PIO [ 5.154518] sdhost-bcm2835 3f202000.sdhost: loaded - DMA disabled For me it's a chicken and egg problem if the DMA driver is build as a kernel module. Stefan > > I understand that adding messages it somewhat controversial so I added > them in separate patch. > > Thanks > > Michal
On Thu, 15 Feb 2018 19:22:00 +0100 (CET) Stefan Wahren <stefan.wahren@i2se.com> wrote: > Hi Michal, > > > Michal Suchánek <msuchanek@suse.de> hat am 14. Februar 2018 um > > 20:47 geschrieben: > > > > > > On Wed, 14 Feb 2018 08:50:16 -0800 > > Florian Fainelli <f.fainelli@gmail.com> wrote: > > > > > On February 14, 2018 6:38:58 AM PST, Michal Suchanek > > > <msuchanek@suse.de> wrote: > > > >The previous patch does reset during hardware error so make the > > > >reset progress more visible. > > > > > > Based on your previous email it looks like this can happen quite > > > frequently so we might be spamming the kernel log with such reset > > > messages. Turning this into a debug print would not be great > > > either, how about a custom sysfs attribute counting the number of > > > times a reset was done? > > > > Since every such message happens when the system stalls for like > > half a minute I don't think there will be that many until somebody > > notices something is amiss. It might be also helpful in diagnosing > > if other cards lock up in different way - for me the DMA shutdown > > is short so I guess it's the mmc host that is locked up and the DMA > > engine is fine. It might look differently on different systems, > > though. > > FWIW according to your dmesg your RPi doesn't use the DMA engine: > > [ 5.004609] sdhost-bcm2835 3f202000.sdhost: unable to initialise DMA > channel. Falling back to PIO [ 5.154518] sdhost-bcm2835 > 3f202000.sdhost: loaded - DMA disabled > > For me it's a chicken and egg problem if the DMA driver is build as a > kernel module. It can be included in the ramdisk but somebody would have to add it to the list of required modules because the dependency is non-obvious. Thanks Michal
diff --git a/drivers/mmc/host/bcm2835.c b/drivers/mmc/host/bcm2835.c index ce05fe72f865..4dde8b2b62a9 100644 --- a/drivers/mmc/host/bcm2835.c +++ b/drivers/mmc/host/bcm2835.c @@ -283,10 +283,14 @@ static void bcm2835_reset_internal(struct bcm2835_host *host) static void bcm2835_reset(struct mmc_host *mmc) { struct bcm2835_host *host = mmc_priv(mmc); + struct device *dev = &host->pdev->dev; - if (host->dma_chan) + if (host->dma_chan) { + dev_info(dev, "tearing down dma"); dmaengine_terminate_sync(host->dma_chan); + } host->dma_chan = NULL; + dev_info(dev, "resetting"); bcm2835_reset_internal(host); }
The previous patch does reset during hardware error so make the reset progress more visible. Signed-off-by: Michal Suchanek <msuchanek@suse.de> --- drivers/mmc/host/bcm2835.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)