Message ID | 20190624123540.20629-1-geert+renesas@glider.be (mailing list archive) |
---|---|
Headers | show |
Series | serial: sh-sci: Fix .flush_buffer() issues | expand |
Hi Geert, CC: George On Mon, Jun 24, 2019 at 02:35:38PM +0200, Geert Uytterhoeven wrote: > Hi Greg, Jiri, > > This patch series attempts to fix the issues Eugeniu Rosca reported > seeing, where .flush_buffer() interfered with transmit DMA operation[*]. > > There's a third patch "dmaengine: rcar-dmac: Reject zero-length slave > DMA requests", which is related to the issue, but further independent, > hence submitted separately. > > Eugeniu: does this fix the issues you were seeing? Many thanks for both sh-sci and the rcar-dmac patches. The fixes are very much appreciated. > Geert Uytterhoeven (2): > serial: sh-sci: Fix TX DMA buffer flushing and workqueue races > serial: sh-sci: Terminate TX DMA during buffer flushing > > drivers/tty/serial/sh-sci.c | 33 ++++++++++++++++++++++++--------- > 1 file changed, 24 insertions(+), 9 deletions(-) I reserved some time to get a feeling about how the patches behave on a real system (H3-ES2.0-ULCB-KF-M06), so here come my observations. First of all, the issue I have originally reported in [0] is only reproducible in absence of [4]. So, one of my questions would be how do you yourself see the relationship between [1-3] and [4]? That said, all my testing assumes: - Vanilla tip v5.2-rc6-15-g249155c20f9b with [4] reverted. - DEBUG is undefined in {sh-sci.c,rcar-dmac.c}, since I've noticed new issues arising in the debug build, which are unrelated to [0]. Below is the summary of my findings: Version IS [0] Is console Error message when (vanilla+X) reproduced? usable after [0] [0] is reproduced is reproduced? ------------------------------------------------------------ -[4] Yes No [5] -[4]+[1] Yes No - -[4]+[2] Yes Yes [5] -[4]+[3] Yes Yes [6] -[4]+[1]+[2] No - - -[4]+[1]+[2]+[3] No - - pure vanilla No - - This looks a little too verbose, but I thought it might be interesting. The story which I see is that [1] does not fix [0] alone, but it seems to depend on [2]. Furthermore, if cherry picked alone, [1] makes the matters somewhat worse in the sense that it hides the error [5]. My only question is whether [1-3] are supposed to replace [4] or they are supposed to happily coexist. Since I don't see [0] being reproduced with [1-3], I personally prefer to re-enable DMA on SCIF (when the latter is used as console) so that more features and code paths are exercised to increase test coverage. [0] https://lore.kernel.org/lkml/20190504004258.23574-3-erosca@de.adit-jv.com/ [1] https://patchwork.kernel.org/patch/11012983/ ("serial: sh-sci: Fix TX DMA buffer flushing and workqueue races") [2] https://patchwork.kernel.org/patch/11012987/ ("serial: sh-sci: Terminate TX DMA during buffer flushing") [3] https://patchwork.kernel.org/patch/11012991/ ("dmaengine: rcar-dmac: Reject zero-length slave DMA requests") [4] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=099506cbbc79c0 ("serial: sh-sci: disable DMA for uart_console") [5] rcar-dmac e7300000.dma-controller: Channel Address Error [6] rcar-dmac e7300000.dma-controller: rcar_dmac_prep_slave_sg: bad parameter: len=1, id=19 sh-sci e6e88000.serial: Failed preparing Tx DMA descriptor Thanks!
Hi Eugeniu, On Wed, Jun 26, 2019 at 7:34 PM Eugeniu Rosca <roscaeugeniu@gmail.com> wrote: > On Mon, Jun 24, 2019 at 02:35:38PM +0200, Geert Uytterhoeven wrote: > > This patch series attempts to fix the issues Eugeniu Rosca reported > > seeing, where .flush_buffer() interfered with transmit DMA operation[*]. > > > > There's a third patch "dmaengine: rcar-dmac: Reject zero-length slave > > DMA requests", which is related to the issue, but further independent, > > hence submitted separately. > > > > Eugeniu: does this fix the issues you were seeing? > > Many thanks for both sh-sci and the rcar-dmac patches. > The fixes are very much appreciated. > > > Geert Uytterhoeven (2): > > serial: sh-sci: Fix TX DMA buffer flushing and workqueue races > > serial: sh-sci: Terminate TX DMA during buffer flushing > > > > drivers/tty/serial/sh-sci.c | 33 ++++++++++++++++++++++++--------- > > 1 file changed, 24 insertions(+), 9 deletions(-) > > I reserved some time to get a feeling about how the patches behave on > a real system (H3-ES2.0-ULCB-KF-M06), so here come my observations. Thanks for your extensive testing! > First of all, the issue I have originally reported in [0] is only > reproducible in absence of [4]. So, one of my questions would be how > do you yourself see the relationship between [1-3] and [4]? I consider them independent. Just applying [4] would fix the issue for the console only, while the race condition can still be triggered on other serial ports. > That said, all my testing assumes: > - Vanilla tip v5.2-rc6-15-g249155c20f9b with [4] reverted. > - DEBUG is undefined in {sh-sci.c,rcar-dmac.c}, since I've noticed > new issues arising in the debug build, which are unrelated to [0]. > > Below is the summary of my findings: > > Version IS [0] Is console Error message when > (vanilla+X) reproduced? usable after [0] [0] is reproduced > is reproduced? > ------------------------------------------------------------ > -[4] Yes No [5] > -[4]+[1] Yes No - > -[4]+[2] Yes Yes [5] > -[4]+[3] Yes Yes [6] > -[4]+[1]+[2] No - - > -[4]+[1]+[2]+[3] No - - > pure vanilla No - - > > This looks a little too verbose, but I thought it might be interesting. Thanks, it's very helpful to provide these results. > The story which I see is that [1] does not fix [0] alone, but it seems > to depend on [2]. Furthermore, if cherry picked alone, [1] makes the > matters somewhat worse in the sense that it hides the error [5]. OK. > My only question is whether [1-3] are supposed to replace [4] or they > are supposed to happily coexist. Since I don't see [0] being reproduced They are meant to coexist. > with [1-3], I personally prefer to re-enable DMA on SCIF (when the > latter is used as console) so that more features and code paths are > exercised to increase test coverage. If a serial port is used as a console, the port is used for both DMA (normal use) and PIO (serial console output). The latter can have a negative impact on the former, aggravating existing bugs, or triggering more races, even in the hardware. So I think it's better to be more cautious and keep DMA disabled for the console. > [0] https://lore.kernel.org/lkml/20190504004258.23574-3-erosca@de.adit-jv.com/ > [1] https://patchwork.kernel.org/patch/11012983/ > ("serial: sh-sci: Fix TX DMA buffer flushing and workqueue races") > [2] https://patchwork.kernel.org/patch/11012987/ > ("serial: sh-sci: Terminate TX DMA during buffer flushing") > [3] https://patchwork.kernel.org/patch/11012991/ > ("dmaengine: rcar-dmac: Reject zero-length slave DMA requests") > [4] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=099506cbbc79c0 > ("serial: sh-sci: disable DMA for uart_console") > > [5] rcar-dmac e7300000.dma-controller: Channel Address Error > [6] rcar-dmac e7300000.dma-controller: rcar_dmac_prep_slave_sg: bad parameter: len=1, id=19 > sh-sci e6e88000.serial: Failed preparing Tx DMA descriptor Gr{oetje,eeting}s, Geert
Hi Geert, On Fri, Jun 28, 2019 at 01:51:25PM +0200, Geert Uytterhoeven wrote: [..] > If a serial port is used as a console, the port is used for both DMA > (normal use) and PIO (serial console output). The latter can have a > negative impact on the former, aggravating existing bugs, or triggering > more races, even in the hardware. So I think it's better to be more > cautious and keep DMA disabled for the console. Thanks for the extensive and comprehensible replies. No more questions from my end. Looking forward to picking the patches from vanilla/stable trees.
> > If a serial port is used as a console, the port is used for both DMA > > (normal use) and PIO (serial console output). The latter can have a > > negative impact on the former, aggravating existing bugs, or triggering > > more races, even in the hardware. So I think it's better to be more > > cautious and keep DMA disabled for the console. > > Thanks for the extensive and comprehensible replies. > No more questions from my end. > Looking forward to picking the patches from vanilla/stable trees. If you could formally add such a tag: Tested-by: <your email> (maybe also Acked-by: or Reviewed-by:, dunno if you think it is apropriate) to the patches, this would be much appreciated and will usually speed up the patches getting applied. Thanks for your help!
Hi Wolfram, On Fri, Jun 28, 2019 at 02:55:34PM +0200, Wolfram Sang wrote: [..] > If you could formally add such a tag: > > Tested-by: <your email> > > (maybe also Acked-by: or Reviewed-by:, dunno if you think it is > apropriate) > > to the patches, this would be much appreciated and will usually speed up > the patches getting applied. > > Thanks for your help! I am doing this per-patch to allow patchwork to reflect the status of each patch on the linux-renesas-soc front-page. AFAIK patchwork ignores series-wide '*-by: Name <email>' signatures/tags.
> I am doing this per-patch to allow patchwork to reflect the status of > each patch on the linux-renesas-soc front-page. AFAIK patchwork ignores > series-wide '*-by: Name <email>' signatures/tags. AFAIK, yes. Thanks!
Hello All, On Fri, Jun 28, 2019 at 01:51:25PM +0200, Geert Uytterhoeven wrote: > Hi Eugeniu, > > On Wed, Jun 26, 2019 at 7:34 PM Eugeniu Rosca <roscaeugeniu@gmail.com> wrote: > > On Mon, Jun 24, 2019 at 02:35:38PM +0200, Geert Uytterhoeven wrote: > > > This patch series attempts to fix the issues Eugeniu Rosca reported > > > seeing, where .flush_buffer() interfered with transmit DMA operation[*]. > > > > > > There's a third patch "dmaengine: rcar-dmac: Reject zero-length slave > > > DMA requests", which is related to the issue, but further independent, > > > hence submitted separately. > > > > > > Eugeniu: does this fix the issues you were seeing? > > > > Many thanks for both sh-sci and the rcar-dmac patches. > > The fixes are very much appreciated. > > > > > Geert Uytterhoeven (2): > > > serial: sh-sci: Fix TX DMA buffer flushing and workqueue races > > > serial: sh-sci: Terminate TX DMA during buffer flushing > > > > > > drivers/tty/serial/sh-sci.c | 33 ++++++++++++++++++++++++--------- > > > 1 file changed, 24 insertions(+), 9 deletions(-) > > > > I reserved some time to get a feeling about how the patches behave on > > a real system (H3-ES2.0-ULCB-KF-M06), so here come my observations. > > Thanks for your extensive testing! > > > First of all, the issue I have originally reported in [0] is only > > reproducible in absence of [4]. So, one of my questions would be how > > do you yourself see the relationship between [1-3] and [4]? > > I consider them independent. > Just applying [4] would fix the issue for the console only, while the > race condition can still be triggered on other serial ports. > > > That said, all my testing assumes: > > - Vanilla tip v5.2-rc6-15-g249155c20f9b with [4] reverted. > > - DEBUG is undefined in {sh-sci.c,rcar-dmac.c}, since I've noticed > > new issues arising in the debug build, which are unrelated to [0]. > > > > Below is the summary of my findings: > > > > Version IS [0] Is console Error message when > > (vanilla+X) reproduced? usable after [0] [0] is reproduced > > is reproduced? > > ------------------------------------------------------------ > > -[4] Yes No [5] > > -[4]+[1] Yes No - > > -[4]+[2] Yes Yes [5] > > -[4]+[3] Yes Yes [6] > > -[4]+[1]+[2] No - - > > -[4]+[1]+[2]+[3] No - - > > pure vanilla No - - > > > > This looks a little too verbose, but I thought it might be interesting. > > Thanks, it's very helpful to provide these results. > > > The story which I see is that [1] does not fix [0] alone, but it seems > > to depend on [2]. Furthermore, if cherry picked alone, [1] makes the > > matters somewhat worse in the sense that it hides the error [5]. > > OK. > > > My only question is whether [1-3] are supposed to replace [4] or they > > are supposed to happily coexist. Since I don't see [0] being reproduced > > They are meant to coexist. > > > with [1-3], I personally prefer to re-enable DMA on SCIF (when the > > latter is used as console) so that more features and code paths are > > exercised to increase test coverage. > > If a serial port is used as a console, the port is used for both DMA > (normal use) and PIO (serial console output). The latter can have a > negative impact on the former, aggravating existing bugs, or triggering > more races, even in the hardware. So I think it's better to be more > cautious and keep DMA disabled for the console. Agreed. Just a note for the record that [4] was the easiest way to resolve the reported problem [0] but an alternative solution would be to implement DMA support for ttySC console ports which will be non-trivial to implement and test due to the potential for deadlocks in console write critical paths where various locks are held with interrupts disabled. I see only one tty serial driver which implements console DMA support, drivers/tty/serial/mpsc.c, and perhaps there is a good reason why there are no other examples? > > [0] https://lore.kernel.org/lkml/20190504004258.23574-3-erosca@de.adit-jv.com/ > > [1] https://patchwork.kernel.org/patch/11012983/ > > ("serial: sh-sci: Fix TX DMA buffer flushing and workqueue races") > > [2] https://patchwork.kernel.org/patch/11012987/ > > ("serial: sh-sci: Terminate TX DMA during buffer flushing") > > [3] https://patchwork.kernel.org/patch/11012991/ > > ("dmaengine: rcar-dmac: Reject zero-length slave DMA requests") > > [4] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=099506cbbc79c0 > > ("serial: sh-sci: disable DMA for uart_console") > > > > [5] rcar-dmac e7300000.dma-controller: Channel Address Error > > [6] rcar-dmac e7300000.dma-controller: rcar_dmac_prep_slave_sg: bad parameter: len=1, id=19 > > sh-sci e6e88000.serial: Failed preparing Tx DMA descriptor > > Gr{oetje,eeting}s, > > Geert > > -- > Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org > > In personal conversations with technical people, I call myself a hacker. But > when I'm talking to journalists I just say "programmer" or something like that. > -- Linus Torvalds
On Fri, Jun 28, 2019 at 03:02:00PM +0200, Eugeniu Rosca wrote: > Hi Wolfram, > > On Fri, Jun 28, 2019 at 02:55:34PM +0200, Wolfram Sang wrote: > [..] > > If you could formally add such a tag: > > > > Tested-by: <your email> > > > > (maybe also Acked-by: or Reviewed-by:, dunno if you think it is > > apropriate) > > > > to the patches, this would be much appreciated and will usually speed up > > the patches getting applied. > > > > Thanks for your help! > > I am doing this per-patch to allow patchwork to reflect the status of > each patch on the linux-renesas-soc front-page. AFAIK patchwork ignores > series-wide '*-by: Name <email>' signatures/tags. I don't use patchwork :)
Hi Greg, On Wed, Jul 03, 2019 at 07:30:50PM +0200, Greg Kroah-Hartman wrote: > On Fri, Jun 28, 2019 at 03:02:00PM +0200, Eugeniu Rosca wrote: [..] > > I am doing this per-patch to allow patchwork to reflect the status of > > each patch on the linux-renesas-soc front-page. AFAIK patchwork ignores > > series-wide '*-by: Name <email>' signatures/tags. > > I don't use patchwork :) How do you then collect all the "{Reviewed,Tested,etc}-by:" signatures (each of which means sometimes hours of effort) in the hairy ML threads? Patchwork makes it a matter of one click.
On Wed, Jul 03, 2019 at 08:15:19PM +0200, Eugeniu Rosca wrote: > Hi Greg, > > On Wed, Jul 03, 2019 at 07:30:50PM +0200, Greg Kroah-Hartman wrote: > > On Fri, Jun 28, 2019 at 03:02:00PM +0200, Eugeniu Rosca wrote: > [..] > > > I am doing this per-patch to allow patchwork to reflect the status of > > > each patch on the linux-renesas-soc front-page. AFAIK patchwork ignores > > > series-wide '*-by: Name <email>' signatures/tags. > > > > I don't use patchwork :) > > How do you then collect all the "{Reviewed,Tested,etc}-by:" signatures > (each of which means sometimes hours of effort) in the hairy ML threads? > Patchwork makes it a matter of one click. I've been doing this for a very long time now, before patchwork was even around. It's pretty trivial to collect them on my own. thanks, greg k-h
> I've been doing this for a very long time now, before patchwork was even > around. It's pretty trivial to collect them on my own. It's a workflow thing. Mileages vary. I ask people to tag patches individually for reasonable small series.