mbox series

[v2,0/3] Improve error handling in the rcar-vin driver

Message ID 20220628180024.451258-1-mrodin@de.adit-jv.com (mailing list archive)
Headers show
Series Improve error handling in the rcar-vin driver | expand

Message

Michael Rodin June 28, 2022, 6 p.m. UTC
Hello,

this series is a followup to the other series [1] started by Niklas Söderlund
where only the first patch has been merged. The overall idea is to be more
compliant with the Renesas hardware manual which requires a reset or stop
of capture in the VIN module before reset of CSI2. Another goal is to be
more resilient with respect to non-critical CSI2 errors so the driver does
not end in an endless restart loop. Compared to the previous version [2] of
this series the patch 3 is replaced based on the conclusion in [3] so now
userspace has to take care of figuring out if a transfer error was harmless
or unrecoverable. Other patches are adapted accordingly so no assumptions
about criticality of transfer errors are made in the kernel and the
decision is left up to userspace.

[1] https://lore.kernel.org/linux-renesas-soc/20211108160220.767586-1-niklas.soderlund+renesas@ragnatech.se/
[2] https://lore.kernel.org/all/1652983210-1194-1-git-send-email-mrodin@de.adit-jv.com/
[3] https://lore.kernel.org/all/YqEO3%2FKekkZhVjW+@oden.dyn.berto.se/

Comments

Niklas Söderlund July 5, 2022, 9:46 a.m. UTC | #1
Hi Michael,

Thanks for your persistent work with this series.

On 2022-06-28 20:00:19 +0200, Michael Rodin wrote:
> Hello,
> 
> this series is a followup to the other series [1] started by Niklas Söderlund
> where only the first patch has been merged. The overall idea is to be more
> compliant with the Renesas hardware manual which requires a reset or stop
> of capture in the VIN module before reset of CSI2. Another goal is to be
> more resilient with respect to non-critical CSI2 errors so the driver does
> not end in an endless restart loop. Compared to the previous version [2] of
> this series the patch 3 is replaced based on the conclusion in [3] so now
> userspace has to take care of figuring out if a transfer error was harmless
> or unrecoverable. Other patches are adapted accordingly so no assumptions
> about criticality of transfer errors are made in the kernel and the
> decision is left up to userspace.

I like this solution as it truly pushes the decision to user-space. What 
bugs me a little bit is that we don't have a way to communicate errors 
that we know are unrecoverable (it was for this case the work in this 
area started) and ones that could be recoverable (the use-case added on 
top).

I would also like to hear what Hans thinks as he had good suggestions 
for how to handle the cases we know can't be recovers in [4].

> 
> [1] https://lore.kernel.org/linux-renesas-soc/20211108160220.767586-1-niklas.soderlund+renesas@ragnatech.se/
> [2] https://lore.kernel.org/all/1652983210-1194-1-git-send-email-mrodin@de.adit-jv.com/
> [3] https://lore.kernel.org/all/YqEO3%2FKekkZhVjW+@oden.dyn.berto.se/

4. https://lore.kernel.org/all/1fddc966-5a23-63b4-185e-c17aa6d65b54@xs4all.nl/
Michael Rodin July 15, 2022, 1:42 p.m. UTC | #2
Hi Niklas, Hans,

On Tue, Jul 05, 2022 at 11:46:22AM +0200, Niklas Söderlund wrote:
> Hi Michael,
> 
> Thanks for your persistent work with this series.

Thank you for the feedback!

> On 2022-06-28 20:00:19 +0200, Michael Rodin wrote:
> > Hello,
> > 
> > this series is a followup to the other series [1] started by Niklas Söderlund
> > where only the first patch has been merged. The overall idea is to be more
> > compliant with the Renesas hardware manual which requires a reset or stop
> > of capture in the VIN module before reset of CSI2. Another goal is to be
> > more resilient with respect to non-critical CSI2 errors so the driver does
> > not end in an endless restart loop. Compared to the previous version [2] of
> > this series the patch 3 is replaced based on the conclusion in [3] so now
> > userspace has to take care of figuring out if a transfer error was harmless
> > or unrecoverable. Other patches are adapted accordingly so no assumptions
> > about criticality of transfer errors are made in the kernel and the
> > decision is left up to userspace.
> 
> I like this solution as it truly pushes the decision to user-space. What 
> bugs me a little bit is that we don't have a way to communicate errors 
> that we know are unrecoverable (it was for this case the work in this 
> area started) and ones that could be recoverable (the use-case added on 
> top).

Yes, it's not nice that V4L2_EVENT_XFER_ERROR does not tell userspace
whether an error is recoverable (i.e. the event can be ignored) or not
(i.e. a restart of streaming is required) but the other possible option
would be (as concluded in [3]) to implement a frame timeout monitoring
thread in v4l2 core. I am not sure if it is possible to implement this
second option cleanly...

> I would also like to hear what Hans thinks as he had good suggestions 
> for how to handle the cases we know can't be recovers in [4].

A a new function vb2_queue_error_with_event() suggested by Hans seems to be
redundant now, since it would not be used by rcar-vin (unless we implement
frame timeout monitoring in the v4l2 core). Or do you have an idea, which
drivers could be the first users of it, e.g. staging/media/imx I mentioned
before?

> > 
> > [1] https://urldefense.proofpoint.com/v2/url?u=https-3A__lore.kernel.org_linux-2Drenesas-2Dsoc_20211108160220.767586-2D1-2Dniklas.soderlund-2Brenesas-40ragnatech.se_&d=DwIDAw&c=euGZstcaTDllvimEN8b7jXrwqOf-v5A_CdpgnVfiiMM&r=sWsgk3pKkv5GeIDM2RZlPY8TjNFU2D0oBeOj6QNBadE&m=ecX7IwfatUO7SNPiyQ6x_8K9t2eWJf3y8GNuNHJ_0W0&s=Cli6jADEgMmCOLVoFekRRXzmty9WBXtoSF9utZJNMXY&e=
> > [2] https://urldefense.proofpoint.com/v2/url?u=https-3A__lore.kernel.org_all_1652983210-2D1194-2D1-2Dgit-2Dsend-2Demail-2Dmrodin-40de.adit-2Djv.com_&d=DwIDAw&c=euGZstcaTDllvimEN8b7jXrwqOf-v5A_CdpgnVfiiMM&r=sWsgk3pKkv5GeIDM2RZlPY8TjNFU2D0oBeOj6QNBadE&m=ecX7IwfatUO7SNPiyQ6x_8K9t2eWJf3y8GNuNHJ_0W0&s=6CysfSY0OoAenEwCzigeyPOb8vyaa4GgzkJSR-ny83U&e=
> > [3] https://urldefense.proofpoint.com/v2/url?u=https-3A__lore.kernel.org_all_YqEO3-252FKekkZhVjW-2B-40oden.dyn.berto.se_&d=DwIDAw&c=euGZstcaTDllvimEN8b7jXrwqOf-v5A_CdpgnVfiiMM&r=sWsgk3pKkv5GeIDM2RZlPY8TjNFU2D0oBeOj6QNBadE&m=ecX7IwfatUO7SNPiyQ6x_8K9t2eWJf3y8GNuNHJ_0W0&s=67JE_QR4x7omrtC7wzbpn2OgW75TAR80-R8WQyE-bVo&e=
> 
> 4. https://urldefense.proofpoint.com/v2/url?u=https-3A__lore.kernel.org_all_1fddc966-2D5a23-2D63b4-2D185e-2Dc17aa6d65b54-40xs4all.nl_&d=DwIDAw&c=euGZstcaTDllvimEN8b7jXrwqOf-v5A_CdpgnVfiiMM&r=sWsgk3pKkv5GeIDM2RZlPY8TjNFU2D0oBeOj6QNBadE&m=ecX7IwfatUO7SNPiyQ6x_8K9t2eWJf3y8GNuNHJ_0W0&s=I18yWgde2UKZY4AiwB5s-Lf12eebHOcHFZFOlTcO2oQ&e=
> 
> -- 
> Kind Regards,
> Niklas Söderlund