Message ID | 20200817165911.32589-2-luca@lucaceresoli.net (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | [1/3] fpga manager: xilinx-spi: remove stray comment | expand |
The other two patches are fine. On 8/17/20 9:59 AM, Luca Ceresoli wrote: > When the DONE pin does not go high after programming to confirm programming > success, the INIT_B pin provides some info on the reason. Use it if > available to provide a more explanatory error message. > > Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net> > --- > drivers/fpga/xilinx-spi.c | 11 ++++++++++- > 1 file changed, 10 insertions(+), 1 deletion(-) > > diff --git a/drivers/fpga/xilinx-spi.c b/drivers/fpga/xilinx-spi.c > index 502fae0d1d85..2aa942bb1114 100644 > --- a/drivers/fpga/xilinx-spi.c > +++ b/drivers/fpga/xilinx-spi.c > @@ -169,7 +169,16 @@ static int xilinx_spi_write_complete(struct fpga_manager *mgr, > return xilinx_spi_apply_cclk_cycles(conf); > } > > - dev_err(&mgr->dev, "Timeout after config data transfer.\n"); > + if (conf->init_b) { > + int init_b_asserted = gpiod_get_value(conf->init_b); gpiod_get_value can fail. So maybe need split the first statement. init_b_asserted < 0 ? "invalid device" As the if-else statement is getting complicated, embedding the ? : makes this hard to read. 'if,else if, else' would be better. > + > + dev_err(&mgr->dev, > + init_b_asserted ? "CRC error or invalid device\n" > + : "Missing sync word or incomplete bitstream\n"); > + } else { > + dev_err(&mgr->dev, "Timeout after config data transfer.\n"); patch 3 removes '.' s , and you just added one back in ? > + } > + > return -ETIMEDOUT; > } > Reviewed-by: Tom Rix <trix@redhat.com>
[a question for GPIO maintainers below] Hi Tom, thanks for your review! On 17/08/20 20:15, Tom Rix wrote: > The other two patches are fine. > > On 8/17/20 9:59 AM, Luca Ceresoli wrote: >> When the DONE pin does not go high after programming to confirm programming >> success, the INIT_B pin provides some info on the reason. Use it if >> available to provide a more explanatory error message. >> >> Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net> >> --- >> drivers/fpga/xilinx-spi.c | 11 ++++++++++- >> 1 file changed, 10 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/fpga/xilinx-spi.c b/drivers/fpga/xilinx-spi.c >> index 502fae0d1d85..2aa942bb1114 100644 >> --- a/drivers/fpga/xilinx-spi.c >> +++ b/drivers/fpga/xilinx-spi.c >> @@ -169,7 +169,16 @@ static int xilinx_spi_write_complete(struct fpga_manager *mgr, >> return xilinx_spi_apply_cclk_cycles(conf); >> } >> >> - dev_err(&mgr->dev, "Timeout after config data transfer.\n"); >> + if (conf->init_b) { >> + int init_b_asserted = gpiod_get_value(conf->init_b); > > gpiod_get_value can fail. So maybe need split the first statement. > > init_b_asserted < 0 ? "invalid device" > > As the if-else statement is getting complicated, embedding the ? : makes this hard to read. 'if,else if, else' would be better. Thanks for the heads up. However I'm not sure which is the best thing to do here. First, I've been reading the libgpiod code after your email and yes, the libgpiod code _could_ return runtime errors received from the gpiochip driver, even though the docs state: > The get/set calls do not return errors because “invalid GPIO”> should have been reported earlier from gpiod_direction_*(). (https://www.kernel.org/doc/html/latest/driver-api/gpio/consumer.html) On the other hand there are plenty of calls to gpiod_get/set_value in the kernel that don't check for error values. I guess this is because failures getting/setting a GPIO are very uncommon (perhaps impossible with platform GPIO). When still a GPIO get/set operation fails I'm not sure adding thousands of error-checking code lines in hundreds of drivers is the best way to go. I feel like we should have a unique, noisy dev_err() in the error path in libgpio but I was surprised in not finding any [1]. Linus, Bartosz, what's your opinion? Should all drivers check for errors after every gpiod_[sg]et_value*() call? >> + dev_err(&mgr->dev, >> + init_b_asserted ? "CRC error or invalid device\n" >> + : "Missing sync word or incomplete bitstream\n"); >> + } else { >> + dev_err(&mgr->dev, "Timeout after config data transfer.\n"); > patch 3 removes '.' s , and you just added one back in ? Here I'm only changing indentation of this line. But OK, this is misleading, so I'll swap patches 2 and 3 in the next patch iteration to avoid confusion. [1] https://elixir.bootlin.com/linux/v5.8/source/drivers/gpio/gpiolib.c#L3646
On 8/18/20 3:20 AM, Luca Ceresoli wrote: > [a question for GPIO maintainers below] > > Hi Tom, > > thanks for your review! > > On 17/08/20 20:15, Tom Rix wrote: >> The other two patches are fine. >> >> On 8/17/20 9:59 AM, Luca Ceresoli wrote: >>> When the DONE pin does not go high after programming to confirm programming >>> success, the INIT_B pin provides some info on the reason. Use it if >>> available to provide a more explanatory error message. >>> >>> Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net> >>> --- >>> drivers/fpga/xilinx-spi.c | 11 ++++++++++- >>> 1 file changed, 10 insertions(+), 1 deletion(-) >>> >>> diff --git a/drivers/fpga/xilinx-spi.c b/drivers/fpga/xilinx-spi.c >>> index 502fae0d1d85..2aa942bb1114 100644 >>> --- a/drivers/fpga/xilinx-spi.c >>> +++ b/drivers/fpga/xilinx-spi.c >>> @@ -169,7 +169,16 @@ static int xilinx_spi_write_complete(struct fpga_manager *mgr, >>> return xilinx_spi_apply_cclk_cycles(conf); >>> } >>> >>> - dev_err(&mgr->dev, "Timeout after config data transfer.\n"); >>> + if (conf->init_b) { >>> + int init_b_asserted = gpiod_get_value(conf->init_b); >> gpiod_get_value can fail. So maybe need split the first statement. >> >> init_b_asserted < 0 ? "invalid device" >> >> As the if-else statement is getting complicated, embedding the ? : makes this hard to read. 'if,else if, else' would be better. > Thanks for the heads up. However I'm not sure which is the best thing to > do here. > > First, I've been reading the libgpiod code after your email and yes, the > libgpiod code _could_ return runtime errors received from the gpiochip > driver, even though the docs state: > >> The get/set calls do not return errors because “invalid GPIO”> should have been reported earlier from gpiod_direction_*(). > (https://www.kernel.org/doc/html/latest/driver-api/gpio/consumer.html) > > On the other hand there are plenty of calls to gpiod_get/set_value in > the kernel that don't check for error values. I guess this is because > failures getting/setting a GPIO are very uncommon (perhaps impossible > with platform GPIO). > > When still a GPIO get/set operation fails I'm not sure adding thousands > of error-checking code lines in hundreds of drivers is the best way to > go. I feel like we should have a unique, noisy dev_err() in the error > path in libgpio but I was surprised in not finding any [1]. > > Linus, Bartosz, what's your opinion? Should all drivers check for errors > after every gpiod_[sg]et_value*() call? My opinion is that you know the driver / hw is in a bad state and you are trying to convey useful information. So you should be as careful as possible and not assume gpio did not fail. > >>> + dev_err(&mgr->dev, >>> + init_b_asserted ? "CRC error or invalid device\n" >>> + : "Missing sync word or incomplete bitstream\n"); >>> + } else { >>> + dev_err(&mgr->dev, "Timeout after config data transfer.\n"); >> patch 3 removes '.' s , and you just added one back in ? > Here I'm only changing indentation of this line. But OK, this is > misleading, so I'll swap patches 2 and 3 in the next patch iteration to > avoid confusion. Maybe just remove the '.' at the same time and/or collapse 2&3 into a single patch. > > [1] > https://elixir.bootlin.com/linux/v5.8/source/drivers/gpio/gpiolib.c#L3646 >
On 18/08/20 16:21, Tom Rix wrote: > > On 8/18/20 3:20 AM, Luca Ceresoli wrote: >> [a question for GPIO maintainers below] >> >> Hi Tom, >> >> thanks for your review! >> >> On 17/08/20 20:15, Tom Rix wrote: >>> The other two patches are fine. >>> >>> On 8/17/20 9:59 AM, Luca Ceresoli wrote: >>>> When the DONE pin does not go high after programming to confirm programming >>>> success, the INIT_B pin provides some info on the reason. Use it if >>>> available to provide a more explanatory error message. >>>> >>>> Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net> >>>> --- >>>> drivers/fpga/xilinx-spi.c | 11 ++++++++++- >>>> 1 file changed, 10 insertions(+), 1 deletion(-) >>>> >>>> diff --git a/drivers/fpga/xilinx-spi.c b/drivers/fpga/xilinx-spi.c >>>> index 502fae0d1d85..2aa942bb1114 100644 >>>> --- a/drivers/fpga/xilinx-spi.c >>>> +++ b/drivers/fpga/xilinx-spi.c >>>> @@ -169,7 +169,16 @@ static int xilinx_spi_write_complete(struct fpga_manager *mgr, >>>> return xilinx_spi_apply_cclk_cycles(conf); >>>> } >>>> >>>> - dev_err(&mgr->dev, "Timeout after config data transfer.\n"); >>>> + if (conf->init_b) { >>>> + int init_b_asserted = gpiod_get_value(conf->init_b); >>> gpiod_get_value can fail. So maybe need split the first statement. >>> >>> init_b_asserted < 0 ? "invalid device" >>> >>> As the if-else statement is getting complicated, embedding the ? : makes this hard to read. 'if,else if, else' would be better. >> Thanks for the heads up. However I'm not sure which is the best thing to >> do here. >> >> First, I've been reading the libgpiod code after your email and yes, the >> libgpiod code _could_ return runtime errors received from the gpiochip >> driver, even though the docs state: >> >>> The get/set calls do not return errors because “invalid GPIO”> should have been reported earlier from gpiod_direction_*(). >> (https://www.kernel.org/doc/html/latest/driver-api/gpio/consumer.html) >> >> On the other hand there are plenty of calls to gpiod_get/set_value in >> the kernel that don't check for error values. I guess this is because >> failures getting/setting a GPIO are very uncommon (perhaps impossible >> with platform GPIO). >> >> When still a GPIO get/set operation fails I'm not sure adding thousands >> of error-checking code lines in hundreds of drivers is the best way to >> go. I feel like we should have a unique, noisy dev_err() in the error >> path in libgpio but I was surprised in not finding any [1]. >> >> Linus, Bartosz, what's your opinion? Should all drivers check for errors >> after every gpiod_[sg]et_value*() call? > > My opinion is that you know the driver / hw is in a bad state and you > > are trying to convey useful information. So you should > > be as careful as possible and not assume gpio did not fail. This patch aims at providing better diagnostics after programming has already gone bad. Neglecting an error might lead to a misleading error message, but this doesn't lead programming to fail -- it has failed already. On the other hand a gpiod_get/set_value() call might fail earlier, along the normal execution path, and lead to real failures without an error message emitted after the gpiod call that failed. Which doesn't mean I'm against your proposal of adding error checking code. Rather, if we want error checking, we want it mainly in other places: at the very least at the first usage of each of the GPIOs, maybe at each usage. Have a look at the beginning of xilinx_spi_write_complete() [0] for example: if gpiod_get_value() fails there the driver would think programming has been successfully completed (DONE asserted). To me this is worse than just printing the wrong error message. [0] https://elixir.bootlin.com/linux/v5.8.2/source/drivers/fpga/xilinx-spi.c#L114
Hi Tom, On 19/08/20 18:32, Luca Ceresoli wrote: > On 18/08/20 16:21, Tom Rix wrote: >> >> On 8/18/20 3:20 AM, Luca Ceresoli wrote: >>> [a question for GPIO maintainers below] >>> >>> Hi Tom, >>> >>> thanks for your review! >>> >>> On 17/08/20 20:15, Tom Rix wrote: >>>> The other two patches are fine. >>>> >>>> On 8/17/20 9:59 AM, Luca Ceresoli wrote: >>>>> When the DONE pin does not go high after programming to confirm programming >>>>> success, the INIT_B pin provides some info on the reason. Use it if >>>>> available to provide a more explanatory error message. >>>>> >>>>> Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net> >>>>> --- >>>>> drivers/fpga/xilinx-spi.c | 11 ++++++++++- >>>>> 1 file changed, 10 insertions(+), 1 deletion(-) >>>>> >>>>> diff --git a/drivers/fpga/xilinx-spi.c b/drivers/fpga/xilinx-spi.c >>>>> index 502fae0d1d85..2aa942bb1114 100644 >>>>> --- a/drivers/fpga/xilinx-spi.c >>>>> +++ b/drivers/fpga/xilinx-spi.c >>>>> @@ -169,7 +169,16 @@ static int xilinx_spi_write_complete(struct fpga_manager *mgr, >>>>> return xilinx_spi_apply_cclk_cycles(conf); >>>>> } >>>>> >>>>> - dev_err(&mgr->dev, "Timeout after config data transfer.\n"); >>>>> + if (conf->init_b) { >>>>> + int init_b_asserted = gpiod_get_value(conf->init_b); >>>> gpiod_get_value can fail. So maybe need split the first statement. >>>> >>>> init_b_asserted < 0 ? "invalid device" >>>> >>>> As the if-else statement is getting complicated, embedding the ? : makes this hard to read. 'if,else if, else' would be better. >>> Thanks for the heads up. However I'm not sure which is the best thing to >>> do here. >>> >>> First, I've been reading the libgpiod code after your email and yes, the >>> libgpiod code _could_ return runtime errors received from the gpiochip >>> driver, even though the docs state: >>> >>>> The get/set calls do not return errors because “invalid GPIO”> should have been reported earlier from gpiod_direction_*(). >>> (https://www.kernel.org/doc/html/latest/driver-api/gpio/consumer.html) >>> >>> On the other hand there are plenty of calls to gpiod_get/set_value in >>> the kernel that don't check for error values. I guess this is because >>> failures getting/setting a GPIO are very uncommon (perhaps impossible >>> with platform GPIO). >>> >>> When still a GPIO get/set operation fails I'm not sure adding thousands >>> of error-checking code lines in hundreds of drivers is the best way to >>> go. I feel like we should have a unique, noisy dev_err() in the error >>> path in libgpio but I was surprised in not finding any [1]. >>> >>> Linus, Bartosz, what's your opinion? Should all drivers check for errors >>> after every gpiod_[sg]et_value*() call? >> >> My opinion is that you know the driver / hw is in a bad state and you >> >> are trying to convey useful information. So you should >> >> be as careful as possible and not assume gpio did not fail. > > This patch aims at providing better diagnostics after programming has > already gone bad. Neglecting an error might lead to a misleading error > message, but this doesn't lead programming to fail -- it has failed already. > > On the other hand a gpiod_get/set_value() call might fail earlier, along > the normal execution path, and lead to real failures without an error > message emitted after the gpiod call that failed. > > Which doesn't mean I'm against your proposal of adding error checking > code. Rather, if we want error checking, we want it mainly in other > places: at the very least at the first usage of each of the GPIOs, maybe > at each usage. Have a look at the beginning of > xilinx_spi_write_complete() [0] for example: if gpiod_get_value() fails > there the driver would think programming has been successfully completed > (DONE asserted). To me this is worse than just printing the wrong error > message. > > [0] > https://elixir.bootlin.com/linux/v5.8.2/source/drivers/fpga/xilinx-spi.c#L114 I added error checking wherever gpiod_get_value() is called to see what happens, and I'm sending a v2 series with this change. The code got longer, but I've kept it still pretty readable. It still feels like a half solution as gpiod_set_value() is void and thus no error checking can be done on it, but let's see yours and other's opinion.
diff --git a/drivers/fpga/xilinx-spi.c b/drivers/fpga/xilinx-spi.c index 502fae0d1d85..2aa942bb1114 100644 --- a/drivers/fpga/xilinx-spi.c +++ b/drivers/fpga/xilinx-spi.c @@ -169,7 +169,16 @@ static int xilinx_spi_write_complete(struct fpga_manager *mgr, return xilinx_spi_apply_cclk_cycles(conf); } - dev_err(&mgr->dev, "Timeout after config data transfer.\n"); + if (conf->init_b) { + int init_b_asserted = gpiod_get_value(conf->init_b); + + dev_err(&mgr->dev, + init_b_asserted ? "CRC error or invalid device\n" + : "Missing sync word or incomplete bitstream\n"); + } else { + dev_err(&mgr->dev, "Timeout after config data transfer.\n"); + } + return -ETIMEDOUT; }
When the DONE pin does not go high after programming to confirm programming success, the INIT_B pin provides some info on the reason. Use it if available to provide a more explanatory error message. Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net> --- drivers/fpga/xilinx-spi.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-)