Message ID | a1a9e47a-069c-4285-9899-d5a58cf532c2@CO9EHSMHS003.ehs.local (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Hi Borislav, On 03/02/2014 03:32 PM, Punnaiah Choudary Kalluri wrote: > Add support for ARM Pl310 L2 cache controller parity error > > Signed-off-by: Punnaiah Choudary Kalluri <punnaia@xilinx.com> > --- > .../devicetree/bindings/edac/pl310_edac_l2.txt | 19 ++ > drivers/edac/Kconfig | 7 + > drivers/edac/Makefile | 1 + > drivers/edac/pl310_edac_l2.c | 236 ++++++++++++++++++++ > 4 files changed, 263 insertions(+), 0 deletions(-) > create mode 100644 Documentation/devicetree/bindings/edac/pl310_edac_l2.txt > create mode 100644 drivers/edac/pl310_edac_l2.c Any comment about this driver? Thanks, Michal
On Thu, Apr 03, 2014 at 05:02:30PM +0200, Michal Simek wrote:
> Any comment about this driver?
It is all on the TODO list. I'll take a look after the merge window
closes.
Thanks.
On 04/03/2014 05:24 PM, Borislav Petkov wrote: > On Thu, Apr 03, 2014 at 05:02:30PM +0200, Michal Simek wrote: >> Any comment about this driver? > > It is all on the TODO list. I'll take a look after the merge window > closes. Ok. Thank you, Michal
On Sun, Mar 02, 2014 at 08:02:40PM +0530, Punnaiah Choudary Kalluri wrote: > Add support for ARM Pl310 L2 cache controller parity error > > Signed-off-by: Punnaiah Choudary Kalluri <punnaia@xilinx.com> > --- > .../devicetree/bindings/edac/pl310_edac_l2.txt | 19 ++ > drivers/edac/Kconfig | 7 + > drivers/edac/Makefile | 1 + > drivers/edac/pl310_edac_l2.c | 236 ++++++++++++++++++++ > 4 files changed, 263 insertions(+), 0 deletions(-) > create mode 100644 Documentation/devicetree/bindings/edac/pl310_edac_l2.txt > create mode 100644 drivers/edac/pl310_edac_l2.c > > diff --git a/Documentation/devicetree/bindings/edac/pl310_edac_l2.txt b/Documentation/devicetree/bindings/edac/pl310_edac_l2.txt > new file mode 100644 > index 0000000..94fbb8d > --- /dev/null > +++ b/Documentation/devicetree/bindings/edac/pl310_edac_l2.txt > @@ -0,0 +1,19 @@ > +Pl310 L2 Cache EDAC driver, it does reports the data and tag ram parity errors. > + > +Required properties: > +- compatible: Should be "arm,pl310-cache". > +- intterupts: Interrupt number to the cpu. > +- reg: Physical base address and size of cache controller's memory mapped > + registers > + > +Example: > +++++++++ > + > + L2: cache-controller { > + compatible = "arm,pl310-cache"; > + interrupts = <0 2 4>; > + reg = <0xf8f02000 0x1000>; > + }; > + > +PL310 L2 Cache EDAC driver detects the Parity enable state by reading the > +appropriate control register. > diff --git a/drivers/edac/Kconfig b/drivers/edac/Kconfig > index 878f090..059ac31 100644 > --- a/drivers/edac/Kconfig > +++ b/drivers/edac/Kconfig > @@ -326,6 +326,13 @@ config EDAC_TILE > Support for error detection and correction on the > Tilera memory controller. > > +config EDAC_PL310_L2 > + tristate "Pl310 L2 Cache Controller" > + depends on EDAC_MM_EDAC && ARM > + help > + Support for parity error detection on L2 cache controller > + data and tag ram memory > + Ok, so I'm looking at this after having looked at the synopsys thing and it looks very similar in functionality - it does the basic dance of registering and setting up stuff, only using different devicetree nodes, regs, etc. However, it adds a new file under drivers/edac/ and I'm wondering if it wouldn't be better to simply create a xilinx_edac.c and put all your stuff in there, maybe even share code by abstracting it nicely. Having a separate driver only for a single L2 cache controller seems kinda too granulary for me. Thanks.
On Wed, Apr 9, 2014 at 6:32 AM, Borislav Petkov <bp@alien8.de> wrote: > On Sun, Mar 02, 2014 at 08:02:40PM +0530, Punnaiah Choudary Kalluri wrote: >> Add support for ARM Pl310 L2 cache controller parity error >> >> Signed-off-by: Punnaiah Choudary Kalluri <punnaia@xilinx.com> >> --- >> .../devicetree/bindings/edac/pl310_edac_l2.txt | 19 ++ >> drivers/edac/Kconfig | 7 + >> drivers/edac/Makefile | 1 + >> drivers/edac/pl310_edac_l2.c | 236 ++++++++++++++++++++ >> 4 files changed, 263 insertions(+), 0 deletions(-) >> create mode 100644 Documentation/devicetree/bindings/edac/pl310_edac_l2.txt >> create mode 100644 drivers/edac/pl310_edac_l2.c >> >> diff --git a/Documentation/devicetree/bindings/edac/pl310_edac_l2.txt b/Documentation/devicetree/bindings/edac/pl310_edac_l2.txt >> new file mode 100644 >> index 0000000..94fbb8d >> --- /dev/null >> +++ b/Documentation/devicetree/bindings/edac/pl310_edac_l2.txt >> @@ -0,0 +1,19 @@ >> +Pl310 L2 Cache EDAC driver, it does reports the data and tag ram parity errors. >> + >> +Required properties: >> +- compatible: Should be "arm,pl310-cache". >> +- intterupts: Interrupt number to the cpu. >> +- reg: Physical base address and size of cache controller's memory mapped >> + registers >> + >> +Example: >> +++++++++ >> + >> + L2: cache-controller { >> + compatible = "arm,pl310-cache"; >> + interrupts = <0 2 4>; >> + reg = <0xf8f02000 0x1000>; >> + }; >> + >> +PL310 L2 Cache EDAC driver detects the Parity enable state by reading the >> +appropriate control register. >> diff --git a/drivers/edac/Kconfig b/drivers/edac/Kconfig >> index 878f090..059ac31 100644 >> --- a/drivers/edac/Kconfig >> +++ b/drivers/edac/Kconfig >> @@ -326,6 +326,13 @@ config EDAC_TILE >> Support for error detection and correction on the >> Tilera memory controller. >> >> +config EDAC_PL310_L2 >> + tristate "Pl310 L2 Cache Controller" >> + depends on EDAC_MM_EDAC && ARM >> + help >> + Support for parity error detection on L2 cache controller >> + data and tag ram memory >> + > > > Ok, so I'm looking at this after having looked at the synopsys thing > and it looks very similar in functionality - it does the basic dance of > registering and setting up stuff, only using different devicetree nodes, > regs, etc. > > However, it adds a new file under drivers/edac/ and I'm wondering if it > wouldn't be better to simply create a xilinx_edac.c and put all your > stuff in there, maybe even share code by abstracting it nicely. Having > a separate driver only for a single L2 cache controller seems kinda too > granulary for me. I don't think so, the PL310 is present on lots of ARM chips besides Xilinx. I don't know how many support parity as that is optional. In fact the highbank_l2_edac.c is for the PL310 as well, but the registers it uses is all custom logic added for ECC and there is no part of the PL310 h/w used by the driver. If there is lots duplication, then that's a sign the framework needs to handle more of the boilerplate pieces. There could be a "simple" driver/library for devices which are no more than some registers, an interrupt handler and static information about the type of EDAC device. Rob
On Wed, Apr 09, 2014 at 08:18:28AM -0500, Rob Herring wrote: > I don't think so, the PL310 is present on lots of ARM chips besides > Xilinx. I don't know how many support parity as that is optional. In > fact the highbank_l2_edac.c is for the PL310 as well, but the > registers it uses is all custom logic added for ECC and there is no > part of the PL310 h/w used by the driver. Oh ok, so highbank_l2 and PL310 could theoretically be merged together in one compilation unit, even if they don't really share code at all... > If there is lots duplication, then that's a sign the framework needs > to handle more of the boilerplate pieces. There could be a "simple" > driver/library for devices which are no more than some registers, an > interrupt handler and static information about the type of EDAC > device. Yeah, it's not that - I'm just getting worried that I'm receiving an EDAC driver for each piece of silicon out there and would like to still keep drivers/edac/ sane and be able to control that wild growth. I'm just thinking out loud here, bear with me pls: Frankly, having a single compilation unit contain similar silicon functionality could be a good way to put a hold on the growth but the disadvantage of this is fatter drivers. Which wouldn't matter all too much but after a certain level of fat, they might need splitting. And the highbank version is nothing but the big probe routine and a small irq handler. And the PL310 is similar but also with a poller. I guess, if they don't share functionality at all, putting them together might not be worth it. Hohummm. Thanks.
There is a driver file cache-l2x0.c under arch/arm/mm for pl310 cache configuration and management. Russel king had suggested to use single driver file for both pl310 edac implementation and cache-l2x0.c Here is the thread http://www.spinics.net/lists/arm-kernel/msg320407.html please provide your inputs Thanks, Punnaiah On Wed, Apr 9, 2014 at 8:49 PM, Borislav Petkov <bp@alien8.de> wrote: > On Wed, Apr 09, 2014 at 08:18:28AM -0500, Rob Herring wrote: >> I don't think so, the PL310 is present on lots of ARM chips besides >> Xilinx. I don't know how many support parity as that is optional. In >> fact the highbank_l2_edac.c is for the PL310 as well, but the >> registers it uses is all custom logic added for ECC and there is no >> part of the PL310 h/w used by the driver. > > Oh ok, so highbank_l2 and PL310 could theoretically be merged together > in one compilation unit, even if they don't really share code at all... > >> If there is lots duplication, then that's a sign the framework needs >> to handle more of the boilerplate pieces. There could be a "simple" >> driver/library for devices which are no more than some registers, an >> interrupt handler and static information about the type of EDAC >> device. > > Yeah, it's not that - I'm just getting worried that I'm receiving an > EDAC driver for each piece of silicon out there and would like to still > keep drivers/edac/ sane and be able to control that wild growth. > > I'm just thinking out loud here, bear with me pls: > > Frankly, having a single compilation unit contain similar silicon > functionality could be a good way to put a hold on the growth but the > disadvantage of this is fatter drivers. Which wouldn't matter all too > much but after a certain level of fat, they might need splitting. > > And the highbank version is nothing but the big probe routine and a > small irq handler. > > And the PL310 is similar but also with a poller. > > I guess, if they don't share functionality at all, putting them together > might not be worth it. Hohummm. > > Thanks. > > -- > Regards/Gruss, > Boris. > > Sent from a fat crate under my desk. Formatting is fine. > --
On Wed, Apr 09, 2014 at 10:59:49PM +0530, Punnaiah Choudary wrote: > There is a driver file cache-l2x0.c under arch/arm/mm for pl310 cache > configuration and management. Russel king had suggested to use > single driver file for both pl310 edac implementation and cache-l2x0.c > Here is the thread > http://www.spinics.net/lists/arm-kernel/msg320407.html > > please provide your inputs Well, it is very simple - you don't really need an EDAC compilation unit just so that you can report errors - you can very well do that in your cache-l2x0.c and boom, problem solved. Simply move the irq handler and the parity error checker to that file. :-)
On 04/09/2014 07:47 PM, Borislav Petkov wrote: > On Wed, Apr 09, 2014 at 10:59:49PM +0530, Punnaiah Choudary wrote: >> There is a driver file cache-l2x0.c under arch/arm/mm for pl310 cache >> configuration and management. Russel king had suggested to use >> single driver file for both pl310 edac implementation and cache-l2x0.c >> Here is the thread >> http://www.spinics.net/lists/arm-kernel/msg320407.html >> >> please provide your inputs > > Well, it is very simple - you don't really need an EDAC compilation unit > just so that you can report errors - you can very well do that in your > cache-l2x0.c and boom, problem solved. > > Simply move the irq handler and the parity error checker to that file. I am just curious about this recommendation. Does it mean that we shouldn't use edac interface just for reporting problems? I didn't play with it but I guess that there is a record about edac driver via sysfs, etc. You could read some useful information that you know what it is protected and I hope that edac interface bring any value to reporting errors comparing to just printk message from IRQ handler. My question is if using printk in IRQ handler and report problem is equal to reporting this via edac interface. Or it is just easy way to do but using edac interface is right solution and how to do it properly is different question. Thanks, Michal
On Thu, Apr 10, 2014 at 08:12:17AM +0200, Michal Simek wrote: > I am just curious about this recommendation. Does it mean that we > shouldn't use edac interface just for reporting problems? Yes, we should. But, if you want to have two different drivers accessing the hardware and have to build synchronization around it, I'm simply putting the simplest solution on the table in case it is viable. If all you want to do is report errors to dmesg, then you don't need the edac stuff. > I didn't play with it but I guess that there is a record about edac > driver via sysfs, etc. You could read some useful information that you > know what it is protected and I hope that edac interface bring any > value to reporting errors comparing to just printk message from IRQ > handler. Yes, you can read error counts and DIMM layout. I'm adding a full dump of sysfs on my machine at the end. Take a look at it and think whether this could be something you could use/need/like. But, if you have only one DIMM slot available on the hw (I'm not even going to pretend to know your possible DIMM layouts) then maybe EDAC is not needed at all - people would know which DIMM is at fault :-) And so on and so on. > My question is if using printk in IRQ handler and report problem is > equal to reporting this via edac interface. Or it is just easy way > to do but using edac interface is right solution and how to do it > properly is different question. Yeah, it would probably be easier if you would point out first what you want to use the edac interface for and we can tell you whether it's already there/doable/makes sense, etc. Thanks. /sys/devices/system/edac/mc/mc0/dbam:0x0000000000000077 /sys/devices/system/edac/mc/mc0/dhar:0x00000000c0004003 /sys/devices/system/edac/mc/mc0/inject_ecc_vector:0x0 /sys/devices/system/edac/mc/mc0/seconds_since_reset:169051 /sys/devices/system/edac/mc/mc0/ce_noinfo_count:0 /sys/devices/system/edac/mc/mc0/power/async:disabled /sys/devices/system/edac/mc/mc0/rank0/size:2048 /sys/devices/system/edac/mc/mc0/rank0/power/async:disabled /sys/devices/system/edac/mc/mc0/rank0/dimm_location:csrow 0 channel 0 /sys/devices/system/edac/mc/mc0/rank0/dimm_dev_type:Unknown /sys/devices/system/edac/mc/mc0/rank0/dimm_mem_type:Unbuffered-DDR3 /sys/devices/system/edac/mc/mc0/rank0/dimm_label:mc#0csrow#0channel#0 /sys/devices/system/edac/mc/mc0/rank0/dimm_edac_mode:SECDED /sys/devices/system/edac/mc/mc0/rank1/size:2048 /sys/devices/system/edac/mc/mc0/rank1/power/async:disabled /sys/devices/system/edac/mc/mc0/rank1/dimm_location:csrow 0 channel 1 /sys/devices/system/edac/mc/mc0/rank1/dimm_dev_type:Unknown /sys/devices/system/edac/mc/mc0/rank1/dimm_mem_type:Unbuffered-DDR3 /sys/devices/system/edac/mc/mc0/rank1/dimm_label:mc#0csrow#0channel#1 /sys/devices/system/edac/mc/mc0/rank1/dimm_edac_mode:SECDED /sys/devices/system/edac/mc/mc0/rank2/size:2048 /sys/devices/system/edac/mc/mc0/rank2/power/async:disabled /sys/devices/system/edac/mc/mc0/rank2/dimm_location:csrow 1 channel 0 /sys/devices/system/edac/mc/mc0/rank2/dimm_dev_type:Unknown /sys/devices/system/edac/mc/mc0/rank2/dimm_mem_type:Unbuffered-DDR3 /sys/devices/system/edac/mc/mc0/rank2/dimm_label:mc#0csrow#1channel#0 /sys/devices/system/edac/mc/mc0/rank2/dimm_edac_mode:SECDED /sys/devices/system/edac/mc/mc0/rank3/size:2048 /sys/devices/system/edac/mc/mc0/rank3/power/async:disabled /sys/devices/system/edac/mc/mc0/rank3/dimm_location:csrow 1 channel 1 /sys/devices/system/edac/mc/mc0/rank3/dimm_dev_type:Unknown /sys/devices/system/edac/mc/mc0/rank3/dimm_mem_type:Unbuffered-DDR3 /sys/devices/system/edac/mc/mc0/rank3/dimm_label:mc#0csrow#1channel#1 /sys/devices/system/edac/mc/mc0/rank3/dimm_edac_mode:SECDED /sys/devices/system/edac/mc/mc0/rank4/size:2048 /sys/devices/system/edac/mc/mc0/rank4/power/async:disabled /sys/devices/system/edac/mc/mc0/rank4/dimm_location:csrow 2 channel 0 /sys/devices/system/edac/mc/mc0/rank4/dimm_dev_type:Unknown /sys/devices/system/edac/mc/mc0/rank4/dimm_mem_type:Unbuffered-DDR3 /sys/devices/system/edac/mc/mc0/rank4/dimm_label:mc#0csrow#2channel#0 /sys/devices/system/edac/mc/mc0/rank4/dimm_edac_mode:SECDED /sys/devices/system/edac/mc/mc0/rank5/size:2048 /sys/devices/system/edac/mc/mc0/rank5/power/async:disabled /sys/devices/system/edac/mc/mc0/rank5/dimm_location:csrow 2 channel 1 /sys/devices/system/edac/mc/mc0/rank5/dimm_dev_type:Unknown /sys/devices/system/edac/mc/mc0/rank5/dimm_mem_type:Unbuffered-DDR3 /sys/devices/system/edac/mc/mc0/rank5/dimm_label:mc#0csrow#2channel#1 /sys/devices/system/edac/mc/mc0/rank5/dimm_edac_mode:SECDED /sys/devices/system/edac/mc/mc0/rank6/size:2048 /sys/devices/system/edac/mc/mc0/rank6/power/async:disabled /sys/devices/system/edac/mc/mc0/rank6/dimm_location:csrow 3 channel 0 /sys/devices/system/edac/mc/mc0/rank6/dimm_dev_type:Unknown /sys/devices/system/edac/mc/mc0/rank6/dimm_mem_type:Unbuffered-DDR3 /sys/devices/system/edac/mc/mc0/rank6/dimm_label:mc#0csrow#3channel#0 /sys/devices/system/edac/mc/mc0/rank6/dimm_edac_mode:SECDED /sys/devices/system/edac/mc/mc0/rank7/size:2048 /sys/devices/system/edac/mc/mc0/rank7/power/async:disabled /sys/devices/system/edac/mc/mc0/rank7/dimm_location:csrow 3 channel 1 /sys/devices/system/edac/mc/mc0/rank7/dimm_dev_type:Unknown /sys/devices/system/edac/mc/mc0/rank7/dimm_mem_type:Unbuffered-DDR3 /sys/devices/system/edac/mc/mc0/rank7/dimm_label:mc#0csrow#3channel#1 /sys/devices/system/edac/mc/mc0/rank7/dimm_edac_mode:SECDED /sys/devices/system/edac/mc/mc0/sdram_scrub_rate:195300 /sys/devices/system/edac/mc/mc0/inject_word:0x0 /sys/devices/system/edac/mc/mc0/size_mb:16384 /sys/devices/system/edac/mc/mc0/max_location:csrow 7 channel 1 /sys/devices/system/edac/mc/mc0/mc_name:F15h /sys/devices/system/edac/mc/mc0/csrow0/power/async:disabled /sys/devices/system/edac/mc/mc0/csrow0/ch0_dimm_label:mc#0csrow#0channel#0 /sys/devices/system/edac/mc/mc0/csrow0/ch0_ce_count:0 /sys/devices/system/edac/mc/mc0/csrow0/ch1_dimm_label:mc#0csrow#0channel#1 /sys/devices/system/edac/mc/mc0/csrow0/size_mb:4096 /sys/devices/system/edac/mc/mc0/csrow0/ch1_ce_count:0 /sys/devices/system/edac/mc/mc0/csrow0/ue_count:0 /sys/devices/system/edac/mc/mc0/csrow0/dev_type:Unknown /sys/devices/system/edac/mc/mc0/csrow0/ce_count:0 /sys/devices/system/edac/mc/mc0/csrow0/edac_mode:SECDED /sys/devices/system/edac/mc/mc0/csrow0/mem_type:Unbuffered-DDR3 /sys/devices/system/edac/mc/mc0/csrow1/power/async:disabled /sys/devices/system/edac/mc/mc0/csrow1/ch0_dimm_label:mc#0csrow#1channel#0 /sys/devices/system/edac/mc/mc0/csrow1/ch0_ce_count:0 /sys/devices/system/edac/mc/mc0/csrow1/ch1_dimm_label:mc#0csrow#1channel#1 /sys/devices/system/edac/mc/mc0/csrow1/size_mb:4096 /sys/devices/system/edac/mc/mc0/csrow1/ch1_ce_count:0 /sys/devices/system/edac/mc/mc0/csrow1/ue_count:0 /sys/devices/system/edac/mc/mc0/csrow1/dev_type:Unknown /sys/devices/system/edac/mc/mc0/csrow1/ce_count:0 /sys/devices/system/edac/mc/mc0/csrow1/edac_mode:SECDED /sys/devices/system/edac/mc/mc0/csrow1/mem_type:Unbuffered-DDR3 /sys/devices/system/edac/mc/mc0/csrow2/power/async:disabled /sys/devices/system/edac/mc/mc0/csrow2/ch0_dimm_label:mc#0csrow#2channel#0 /sys/devices/system/edac/mc/mc0/csrow2/ch0_ce_count:0 /sys/devices/system/edac/mc/mc0/csrow2/ch1_dimm_label:mc#0csrow#2channel#1 /sys/devices/system/edac/mc/mc0/csrow2/size_mb:4096 /sys/devices/system/edac/mc/mc0/csrow2/ch1_ce_count:0 /sys/devices/system/edac/mc/mc0/csrow2/ue_count:0 /sys/devices/system/edac/mc/mc0/csrow2/dev_type:Unknown /sys/devices/system/edac/mc/mc0/csrow2/ce_count:0 /sys/devices/system/edac/mc/mc0/csrow2/edac_mode:SECDED /sys/devices/system/edac/mc/mc0/csrow2/mem_type:Unbuffered-DDR3 /sys/devices/system/edac/mc/mc0/csrow3/power/async:disabled /sys/devices/system/edac/mc/mc0/csrow3/ch0_dimm_label:mc#0csrow#3channel#0 /sys/devices/system/edac/mc/mc0/csrow3/ch0_ce_count:0 /sys/devices/system/edac/mc/mc0/csrow3/ch1_dimm_label:mc#0csrow#3channel#1 /sys/devices/system/edac/mc/mc0/csrow3/size_mb:4096 /sys/devices/system/edac/mc/mc0/csrow3/ch1_ce_count:0 /sys/devices/system/edac/mc/mc0/csrow3/ue_count:0 /sys/devices/system/edac/mc/mc0/csrow3/dev_type:Unknown /sys/devices/system/edac/mc/mc0/csrow3/ce_count:0 /sys/devices/system/edac/mc/mc0/csrow3/edac_mode:SECDED /sys/devices/system/edac/mc/mc0/csrow3/mem_type:Unbuffered-DDR3 /sys/devices/system/edac/mc/mc0/topmem2:0x000000043f000000 /sys/devices/system/edac/mc/mc0/ue_count:0 /sys/devices/system/edac/mc/mc0/topmem:0x00000000c0000000 /sys/devices/system/edac/mc/mc0/ue_noinfo_count:0 /sys/devices/system/edac/mc/mc0/ce_count:0 /sys/devices/system/edac/mc/mc0/dram_hole:c0000000 40000000 40000000 /sys/devices/system/edac/mc/mc0/inject_section:0x0 /sys/devices/system/edac/mc/power/async:disabled /sys/devices/system/edac/pci/pci0/pe_count:0 /sys/devices/system/edac/pci/pci0/npe_count:0 /sys/devices/system/edac/pci/edac_pci_log_npe:1 /sys/devices/system/edac/pci/edac_pci_panic_on_pe:0 /sys/devices/system/edac/pci/edac_pci_log_pe:1 /sys/devices/system/edac/pci/pci_nonparity_count:0 /sys/devices/system/edac/pci/check_pci_errors:0 /sys/devices/system/edac/pci/pci_parity_count:0 /sys/devices/system/edac/power/async:disabled
Hi Borislav, >> My question is if using printk in IRQ handler and report problem is >> equal to reporting this via edac interface. Or it is just easy way >> to do but using edac interface is right solution and how to do it >> properly is different question. > > Yeah, it would probably be easier if you would point out first what you > want to use the edac interface for and we can tell you whether it's > already there/doable/makes sense, etc. OK. This is the v1 that's why we are having this discussion now and I really appreciate your recommendation. It wasn't too complicated to create this driver that's why not big deal and it is always better to talk about the code and how to handle it properly. I had a call with Punnaiah regarding this L2 edac driver and please correct me if my understanding of pl310 parity error is wrong (I didn't read pl310 specification). There are 2 memories in pl310 - data and tag, Through controller we are able to detect 2 errors: data parity error and tag parity error. Both of them are uncorrectable errors and could be just reported. L2 contains data and instructions together that's why error can happen in data or instruction part. The question here is. This driver is just reporting problem through edac interface which is counting that errors and provide an unified way how to report problems. Maybe as you said we don't need to use edac interface at all but by design because every error means that there is the problem and error should be reported and system should be reset because we just don't know where the problem is. We know that we have a problem. The question also is if we should execute any code because the problem can be with instructions and system should just reset. Isn't there any security issue that even executing any code is a problem? From the code I see that IRQ can be raised also for different things because only errors are handled here (BTW: IRQ_HANDLED should be return when IRQ is actually handled). I just want to make sure that edac is right interface, we have right reactions and this is how to handle this problem. Thanks, Michal
On Thu, Apr 10, 2014 at 12:09:03PM +0200, Michal Simek wrote: > The question here is. This driver is just reporting problem through > edac interface which is counting that errors and provide an unified > way how to report problems. Yes, normally you can use edac for reporting and error counting. But, if, as I said earlier, it is easier to solve your issue of having two entities touch one hardware and synchronizing around it is too much, just for this one case, you can simply report the errors with simple printk, without the edac interface. This is why I was asking the practical question of why do you even need the edac interface? If it is only for reporting, use printk and solve the problem of having two drivers. > Maybe as you said we don't need to use edac interface at all but by > design because every error means that there is the problem and error > should be reported and system should be reset because we just don't > know where the problem is. We know that we have a problem. > > The question also is if we should execute any code because the problem > can be with instructions and system should just reset. > > Isn't there any security issue that even executing any code is a > problem? Well, this is up to you to answer. If an UE (Uncorrectable Error) causes data to get corrupted on your system, which, as a result, corrupts visible state which lands on storage, you definitely want to stop executing any code. x86 deals very rigorously with errors like those by running an exception handler, on AMD there's also this thing called syncflood which stops any execution and a warm reset happens. So you have to think hard what those UEs cause on your systems and only then act accordingly. If something bad like the above happens, the last thing you want to do is report them to dmesg. HTH.
diff --git a/Documentation/devicetree/bindings/edac/pl310_edac_l2.txt b/Documentation/devicetree/bindings/edac/pl310_edac_l2.txt new file mode 100644 index 0000000..94fbb8d --- /dev/null +++ b/Documentation/devicetree/bindings/edac/pl310_edac_l2.txt @@ -0,0 +1,19 @@ +Pl310 L2 Cache EDAC driver, it does reports the data and tag ram parity errors. + +Required properties: +- compatible: Should be "arm,pl310-cache". +- intterupts: Interrupt number to the cpu. +- reg: Physical base address and size of cache controller's memory mapped + registers + +Example: +++++++++ + + L2: cache-controller { + compatible = "arm,pl310-cache"; + interrupts = <0 2 4>; + reg = <0xf8f02000 0x1000>; + }; + +PL310 L2 Cache EDAC driver detects the Parity enable state by reading the +appropriate control register. diff --git a/drivers/edac/Kconfig b/drivers/edac/Kconfig index 878f090..059ac31 100644 --- a/drivers/edac/Kconfig +++ b/drivers/edac/Kconfig @@ -326,6 +326,13 @@ config EDAC_TILE Support for error detection and correction on the Tilera memory controller. +config EDAC_PL310_L2 + tristate "Pl310 L2 Cache Controller" + depends on EDAC_MM_EDAC && ARM + help + Support for parity error detection on L2 cache controller + data and tag ram memory + config EDAC_HIGHBANK_MC tristate "Highbank Memory Controller" depends on EDAC_MM_EDAC && ARCH_HIGHBANK diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile index 4154ed6..179356f 100644 --- a/drivers/edac/Makefile +++ b/drivers/edac/Makefile @@ -64,3 +64,4 @@ obj-$(CONFIG_EDAC_OCTEON_PC) += octeon_edac-pc.o obj-$(CONFIG_EDAC_OCTEON_L2C) += octeon_edac-l2c.o obj-$(CONFIG_EDAC_OCTEON_LMC) += octeon_edac-lmc.o obj-$(CONFIG_EDAC_OCTEON_PCI) += octeon_edac-pci.o +obj-$(CONFIG_EDAC_PL310_L2) += pl310_edac_l2.o diff --git a/drivers/edac/pl310_edac_l2.c b/drivers/edac/pl310_edac_l2.c new file mode 100644 index 0000000..649371a --- /dev/null +++ b/drivers/edac/pl310_edac_l2.c @@ -0,0 +1,236 @@ +/* + * Pl310 L2 Cache EDAC Driver + * + * Copyright (C) 2013-2014 Xilinx, Inc. + * + * This program is free software: you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation, either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program. If not, see <http://www.gnu.org/licenses/>. + */ +#include <linux/edac.h> +#include <linux/interrupt.h> +#include <linux/module.h> +#include <linux/platform_device.h> +#include <asm/hardware/cache-l2x0.h> +#include "edac_core.h" + +/* Auxilary control register definitions */ +#define L2X0_AUX_CTRL_PARITY_MASK BIT(21) + +/* Interrupt imask/status/clear register definitions */ +#define L2X0_INTR_PARRD_MASK 0x4 +#define L2X0_INTR_PARRT_MASK 0x2 + +/** + * struct pl310_edac_l2_priv - Zynq L2 cache controller private instance data + * @base: Base address of the controller + * @irq: Interrupt number + */ +struct pl310_edac_l2_priv { + void __iomem *base; + int irq; +}; + +/** + * pl310_edac_l2_parityerr_check - Check controller staus for parity errors + * @dci: Pointer to the edac device controller instance + * + * This routine is used to check and post parity errors + */ +static void pl310_edac_l2_parityerr_check(struct edac_device_ctl_info *dci) +{ + struct pl310_edac_l2_priv *priv = dci->pvt_info; + u32 regval; + + regval = readl(priv->base + L2X0_RAW_INTR_STAT); + if (regval & L2X0_INTR_PARRD_MASK) { + /* Data parity error will be reported as correctable error */ + writel(L2X0_INTR_PARRD_MASK, priv->base + L2X0_INTR_CLEAR); + edac_device_handle_ce(dci, 0, 0, dci->ctl_name); + } + if (regval & L2X0_INTR_PARRT_MASK) { + /* tag parity error will be reported as uncorrectable error */ + writel(L2X0_INTR_PARRT_MASK, priv->base + L2X0_INTR_CLEAR); + edac_device_handle_ue(dci, 0, 0, dci->ctl_name); + } +} + +/** + * pl310_edac_l2_int_handler - ISR fucntion for l2cahe controller + * @irq: Irq Number + * @device: Pointer to the edac device controller instance + * + * This routine is triggered whenever there is parity error detected + * + * Return: Always returns IRQ_HANDLED + */ +static irqreturn_t pl310_edac_l2_int_handler(int irq, void *device) +{ + pl310_edac_l2_parityerr_check((struct edac_device_ctl_info *)device); + return IRQ_HANDLED; +} + +/** + * pl310_edac_l2_poll_handler - Poll the status reg for parity errors + * @dci: Pointer to the edac device controller instance + * + * This routine is used to check and post parity errors and is called by + * the EDAC polling thread + */ +static void pl310_edac_l2_poll_handler(struct edac_device_ctl_info *dci) +{ + pl310_edac_l2_parityerr_check(dci); +} + +/** + * pl310_edac_l2_get_paritystate - check the parity enable/disable status + * @base: Pointer to the contoller base address + * + * This routine returns the parity enable/diable status for the controller + * + * Return: true/false - parity enabled/disabled. + */ +static bool pl310_edac_l2_get_paritystate(void __iomem *base) +{ + u32 regval; + + regval = readl(base + L2X0_AUX_CTRL); + if (regval & L2X0_AUX_CTRL_PARITY_MASK) + return true; + + return false; +} + +/** + * pl310_edac_l2_probe - Check controller and bind driver + * @pdev: Pointer to the platform_device struct + * + * This routine probes a specific arm,pl310-cache instance for binding + * with the driver. + * + * Return: 0 if the controller instance was successfully bound to the + * driver; otherwise, < 0 on error. + */ +static int pl310_edac_l2_probe(struct platform_device *pdev) +{ + struct edac_device_ctl_info *dci; + struct pl310_edac_l2_priv *priv; + int rc; + struct resource *res; + void __iomem *baseaddr; + u32 regval; + + /* Get the data from the platform device */ + res = platform_get_resource(pdev, IORESOURCE_MEM, 0); + baseaddr = devm_ioremap_resource(&pdev->dev, res); + if (IS_ERR(baseaddr)) + return PTR_ERR(baseaddr); + + /* Check for the ecc enable status */ + if (pl310_edac_l2_get_paritystate(baseaddr) == false) { + dev_err(&pdev->dev, "parity check not enabled\n"); + return -ENXIO; + } + + dci = edac_device_alloc_ctl_info(sizeof(*priv), "l2cache", + 1, "L", 1, 1, NULL, 0, 0); + if (IS_ERR(dci)) + return PTR_ERR(dci); + + priv = dci->pvt_info; + priv->base = baseaddr; + dci->dev = &pdev->dev; + dci->mod_name = "pl310_edac_l2"; + dci->ctl_name = "pl310_l2_controller"; + dci->dev_name = dev_name(&pdev->dev); + + priv->irq = platform_get_irq(pdev, 0); + rc = devm_request_irq(&pdev->dev, priv->irq, + pl310_edac_l2_int_handler, + 0, dev_name(&pdev->dev), (void *)dci); + if (rc < 0) { + dci->edac_check = pl310_edac_l2_poll_handler; + edac_op_state = EDAC_OPSTATE_POLL; + } + + rc = edac_device_add_device(dci); + if (rc) { + dev_err(&pdev->dev, "failed to register with EDAC core\n"); + goto del_edac_device; + } + + if (edac_op_state != EDAC_OPSTATE_POLL) { + regval = readl(priv->base+L2X0_INTR_MASK); + regval |= (L2X0_INTR_PARRD_MASK | L2X0_INTR_PARRT_MASK); + writel(regval, priv->base+L2X0_INTR_MASK); + } + + return rc; + +del_edac_device: + edac_device_del_device(&pdev->dev); + edac_device_free_ctl_info(dci); + + return rc; +} + +/** + * pl310_edac_l2_remove - Unbind driver from controller + * @pdev: Pointer to the platform_device struct + * + * This routine unbinds the EDAC device controller instance associated + * with the specified arm,pl310-cache controller described by the + * OpenFirmware device tree node passed as a parameter. + * + * Return: Always returns 0 + */ +static int pl310_edac_l2_remove(struct platform_device *pdev) +{ + struct edac_device_ctl_info *dci = platform_get_drvdata(pdev); + struct pl310_edac_l2_priv *priv = dci->pvt_info; + u32 regval; + + if (edac_op_state != EDAC_OPSTATE_POLL) { + regval = readl(priv->base+L2X0_INTR_MASK); + regval &= ~(L2X0_INTR_PARRD_MASK | L2X0_INTR_PARRT_MASK); + writel(regval, priv->base+L2X0_INTR_MASK); + } + + edac_device_del_device(&pdev->dev); + edac_device_free_ctl_info(dci); + + return 0; +} + +/* Device tree node type and compatible tuples this driver can match on */ +static struct of_device_id pl310_edac_l2_match[] = { + { .compatible = "arm,pl310-cache", }, + { /* end of table */ } +}; + +MODULE_DEVICE_TABLE(of, pl310_edac_l2_match); + +static struct platform_driver pl310_edac_l2_driver = { + .driver = { + .name = "pl310-edac-l2", + .owner = THIS_MODULE, + .of_match_table = pl310_edac_l2_match, + }, + .probe = pl310_edac_l2_probe, + .remove = pl310_edac_l2_remove, +}; + +module_platform_driver(pl310_edac_l2_driver); + +MODULE_AUTHOR("Xilinx Inc."); +MODULE_DESCRIPTION("pl310 L2 EDAC driver"); +MODULE_LICENSE("GPL v2");
Add support for ARM Pl310 L2 cache controller parity error Signed-off-by: Punnaiah Choudary Kalluri <punnaia@xilinx.com> --- .../devicetree/bindings/edac/pl310_edac_l2.txt | 19 ++ drivers/edac/Kconfig | 7 + drivers/edac/Makefile | 1 + drivers/edac/pl310_edac_l2.c | 236 ++++++++++++++++++++ 4 files changed, 263 insertions(+), 0 deletions(-) create mode 100644 Documentation/devicetree/bindings/edac/pl310_edac_l2.txt create mode 100644 drivers/edac/pl310_edac_l2.c