[v1,0/2] fpga: dfl: Log and clear errors on driver init

Message ID 20211019231545.47118-1-russell.h.weight@intel.com (mailing list archive)

Message

Russ Weight Oct. 19, 2021, 11:15 p.m. UTC
These patches address a request to log and clear any preexisting errors on
FPGA cards when the drivers load. Any existing errors will result in print
statements to the kernel error log before the errors are cleared. These
changes specifically affect the fme and port error registers.

Russ Weight (2):
  fpga: dfl: afu: Clear port errors in afu init
  fpga: dfl: fme: Clear fme global errors at driver init

 drivers/fpga/dfl-afu-error.c |  26 ++++---
 drivers/fpga/dfl-fme-error.c | 128 +++++++++++++++++++++++------------
 2 files changed, 100 insertions(+), 54 deletions(-)

Comments

Wu, Hao Oct. 20, 2021, 4:44 a.m. UTC | #1
> Subject: [PATCH v1 0/2] fpga: dfl: Log and clear errors on driver init
> 
> These patches address a request to log and clear any preexisting errors on
> FPGA cards when the drivers load. Any existing errors will result in print
> statements to the kernel error log before the errors are cleared. These
> changes specifically affect the fme and port error registers.

Could you please explain more about why we need this change?
As we already have a user interface to log and clear errors, would it be a
better choice to let userspace log and clear them during AFU initialization?

Hao

> 
> Russ Weight (2):
>   fpga: dfl: afu: Clear port errors in afu init
>   fpga: dfl: fme: Clear fme global errors at driver init
> 
>  drivers/fpga/dfl-afu-error.c |  26 ++++---
>  drivers/fpga/dfl-fme-error.c | 128 +++++++++++++++++++++++------------
>  2 files changed, 100 insertions(+), 54 deletions(-)
> 
> --
> 2.25.1

Russ Weight Oct. 21, 2021, 12:05 a.m. UTC | #2
On 10/19/21 9:44 PM, Wu, Hao wrote:
>> Subject: [PATCH v1 0/2] fpga: dfl: Log and clear errors on driver init
>>
>> These patches address a request to log and clear any preexisting errors on
>> FPGA cards when the drivers load. Any existing errors will result in print
>> statements to the kernel error log before the errors are cleared. These
>> changes specifically affect the fme and port error registers.
> Could you please explain more about why we need this change? 
> As we have user interface to log and clear errors already, is it a better choice
> to let userspace log and clear them during AFU initialization?

In the new architecture we are offering customers more flexibility
for adding functions. With these designs it becomes nearly impossible
to design the AFU interface handler to recover from errors and resume
operation afterwards. The proposed solution is to flag the source of
the error and then capture it in sticky registers so that it can be
read out by software after a crash or warm boot. To ensure that we
capture these errors, the proposal is to log them in the kernel log and
clear them at driver initialization.

- Russ
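
The log-and-clear step described above can be sketched in plain C. This is a
minimal model, not the code from the patches: `struct sticky_reg`,
`sticky_reg_w1c()` and `log_and_clear()` are hypothetical names, the register
is an in-memory stand-in rather than real MMIO, and write-1-to-clear
behaviour is an assumption made for illustration.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical in-memory stand-in for a sticky error register. */
struct sticky_reg {
	const char *name;
	uint64_t value;		/* latched error bits */
};

/* Assumed write-1-to-clear: each 1 bit written clears that latched bit. */
static void sticky_reg_w1c(struct sticky_reg *reg, uint64_t bits)
{
	reg->value &= ~bits;
}

/*
 * Log any latched errors, then clear exactly the bits that were logged.
 * Returns the value that was seen before clearing.
 */
static uint64_t log_and_clear(struct sticky_reg *reg)
{
	uint64_t seen = reg->value;

	if (seen) {
		fprintf(stderr, "%s: preexisting errors 0x%" PRIx64 "\n",
			reg->name, seen);
		sticky_reg_w1c(reg, seen);
	}
	return seen;
}
```

Calling `log_and_clear()` once per error register at initialization would
record whatever survived the crash or warm boot. Writing back only the bits
that were read means an error latched between the read and the write is not
silently discarded.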

> Hao
>
>> Russ Weight (2):
>>   fpga: dfl: afu: Clear port errors in afu init
>>   fpga: dfl: fme: Clear fme global errors at driver init
>>
>>  drivers/fpga/dfl-afu-error.c |  26 ++++---
>>  drivers/fpga/dfl-fme-error.c | 128 +++++++++++++++++++++++------------
>>  2 files changed, 100 insertions(+), 54 deletions(-)
>>
>> --
>> 2.25.1

Wu, Hao Oct. 21, 2021, 12:43 a.m. UTC | #3
> On 10/19/21 9:44 PM, Wu, Hao wrote:
> >> Subject: [PATCH v1 0/2] fpga: dfl: Log and clear errors on driver init
> >>
> >> These patches address a request to log and clear any preexisting errors on
> >> FPGA cards when the drivers load. Any existing errors will result in print
> >> statements to the kernel error log before the errors are cleared. These
> >> changes specifically affect the fme and port error registers.
> > Could you please explain more about why we need this change?
> > As we have user interface to log and clear errors already, is it a better choice
> > to let userspace log and clear them during AFU initialization?
> In the new architecture we are offering more flexibility to customers
> for adding functions. With these designs it becomes nearly impossible
> to design the AFU interface handler to recover from errors and resume
> operation afterwards. The proposed solution is to flag the source of
> the error and then capture it in sticky registers so that they can be
> read out from SW in the event of a crash/warm boot. To ensure that we
> capture these errors, the proposal is to log them in the kernel log and
> clear them at driver initialization.

The errors can be logged and cleared at userspace driver initialization,
since the current usage model is to let userspace handle everything,
including errors.

Hao
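
The userspace alternative suggested above can be sketched as follows. It
assumes an error attribute that reports a numeric error value on read and
clears the errors when that same value is written back, which is how the
kernel's DFL sysfs ABI documentation describes the port `errors/errors`
file. The function name `log_and_clear_attr()` is hypothetical, and the
path is supplied by the caller rather than hard-coded.

```c
#include <errno.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Read an error value from a sysfs-style attribute, log it if non-zero,
 * and write the same value back to request a clear.
 * Returns 0 on success (value stored in *out), -1 on failure.
 */
static int log_and_clear_attr(const char *path, uint64_t *out)
{
	char buf[64];
	char *end;
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	/* Base 0 accepts both "0x..." and decimal input. */
	errno = 0;
	*out = strtoull(buf, &end, 0);
	if (errno || end == buf)
		return -1;
	if (*out == 0)
		return 0;	/* nothing latched, nothing to clear */

	fprintf(stderr, "%s: preexisting errors 0x%" PRIx64 "\n", path, *out);

	/* Assumption: writing the value just read clears those errors. */
	f = fopen(path, "w");
	if (!f)
		return -1;
	fprintf(f, "0x%" PRIx64 "\n", *out);
	return fclose(f) == 0 ? 0 : -1;
}
```

A userspace driver could call this during its own initialization, for
example with a path such as
`/sys/bus/platform/devices/dfl-port.0/errors/errors` (the device index
depends on the system). A failed write-back is reported to the caller
instead of being ignored.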

> 
> - Russ
> 
> > Hao
> >
> >> Russ Weight (2):
> >>   fpga: dfl: afu: Clear port errors in afu init
> >>   fpga: dfl: fme: Clear fme global errors at driver init
> >>
> >>  drivers/fpga/dfl-afu-error.c |  26 ++++---
> >>  drivers/fpga/dfl-fme-error.c | 128 +++++++++++++++++++++++------------
> >>  2 files changed, 100 insertions(+), 54 deletions(-)
> >>
> >> --
> >> 2.25.1