ARM: mm: Fix ECC mem policy printk

Message ID 8986e8f1a3761e45a7927bdb0e54393c9155e6bf.1383137171.git.michal.simek@xilinx.com (mailing list archive)
State New, archived

Commit Message

Michal Simek Oct. 30, 2013, 12:46 p.m. UTC
The ECC policy can be applied to the whole system
when the SoC vendor implements this bit
(IMP - bit 9 - in the L1 page table entry format).
When the SoC vendor does not implement this bit,
it does not mean that the system has no other way
of doing ECC.
This patch ensures that the message is shown only when ECC
is requested via ecc=on on the command line and the kernel
runs on an appropriate ARM core.

Signed-off-by: Michal Simek <michal.simek@xilinx.com>
---
Russell, Will: We discussed at KS that it would be good
to rephrase this or have different logic around it.
I am not sure whether we can also test if this bit is
implemented by a particular SoC.

Maybe the logic should be that if the SoC uses this bit,
the message is shown in its original format to declare
that ECC is enabled or disabled.
When the SoC doesn't implement it, do not show this message.

---
 arch/arm/mm/mmu.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

--
1.8.2.3

Comments

Russell King - ARM Linux Oct. 30, 2013, 1:07 p.m. UTC | #1
On Wed, Oct 30, 2013 at 01:46:18PM +0100, Michal Simek wrote:
> Russell, Will: We discussed at KS that it would be good
> to rephrase this or have different logic around it.
> I am not sure whether we can also test if this bit is
> implemented by a particular SoC.
> 
> Maybe the logic should be that if the SoC uses this bit,
> the message is shown in its original format to declare
> that ECC is enabled or disabled.
> When the SoC doesn't implement it, do not show this message.

This is not quite what I meant - by making the change you have, you also
omit to print the data cache policy.

> @@ -556,8 +556,9 @@ static void __init build_mem_type_table(void)
>  		mem_types[MT_CACHECLEAN].prot_sect |= PMD_SECT_WB;
>  		break;
>  	}
> -	printk("Memory policy: ECC %sabled, Data cache %s\n",
> -		ecc_mask ? "en" : "dis", cp->policy);
> +	if (ecc_mask)
> +		pr_info("Memory policy: ECC enabled, Data cache %s\n",
> +			cp->policy);

	pr_info("Memory policy: %sData cache %s\n",
		ecc_mask ? "ECC enabled, " : "", cp->policy);

is more what I was suggesting.
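
To make the difference between the three variants concrete, here is a minimal userspace sketch, not kernel code. The `fmt_*` helper names are hypothetical and `snprintf()` stands in for `printk()`/`pr_info()`; only the format strings are taken from this thread.

```c
#include <stdio.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-ins for the three message variants. */

/* Original code: always prints, ECC reported as "enabled" or "disabled". */
static int fmt_original(char *buf, size_t len, unsigned int ecc_mask,
			const char *policy)
{
	return snprintf(buf, len, "Memory policy: ECC %sabled, Data cache %s\n",
			ecc_mask ? "en" : "dis", policy);
}

/* Posted patch: nothing is printed at all when ecc_mask is zero. */
static int fmt_posted(char *buf, size_t len, unsigned int ecc_mask,
		      const char *policy)
{
	if (!ecc_mask) {
		if (len)
			buf[0] = '\0';	/* the cache policy is lost as well */
		return 0;
	}
	return snprintf(buf, len, "Memory policy: ECC enabled, Data cache %s\n",
			policy);
}

/* Suggested form: cache policy always printed, ECC only when enabled. */
static int fmt_suggested(char *buf, size_t len, unsigned int ecc_mask,
			 const char *policy)
{
	return snprintf(buf, len, "Memory policy: %sData cache %s\n",
			ecc_mask ? "ECC enabled, " : "", policy);
}
```

With a zero `ecc_mask`, the posted variant leaves the buffer empty, while the suggested variant still reports the data cache policy.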
Michal Simek Oct. 30, 2013, 2:23 p.m. UTC | #2
On 10/30/2013 02:07 PM, Russell King - ARM Linux wrote:
> On Wed, Oct 30, 2013 at 01:46:18PM +0100, Michal Simek wrote:
>> Russell, Will: We discussed at KS that it would be good
>> to rephrase this or have different logic around it.
>> I am not sure whether we can also test if this bit is
>> implemented by a particular SoC.
>>
>> Maybe the logic should be that if the SoC uses this bit,
>> the message is shown in its original format to declare
>> that ECC is enabled or disabled.
>> When the SoC doesn't implement it, do not show this message.
> 
> This is not quite what I meant - by making the change you have, you also
> omit to print the data cache policy.
> 
>> @@ -556,8 +556,9 @@ static void __init build_mem_type_table(void)
>>  		mem_types[MT_CACHECLEAN].prot_sect |= PMD_SECT_WB;
>>  		break;
>>  	}
>> -	printk("Memory policy: ECC %sabled, Data cache %s\n",
>> -		ecc_mask ? "en" : "dis", cp->policy);
>> +	if (ecc_mask)
>> +		pr_info("Memory policy: ECC enabled, Data cache %s\n",
>> +			cp->policy);
> 
> 	pr_info("Memory policy: %sData cache %s\n",
> 		ecc_mask ? "ECC enabled, " : "", cp->policy);
> 
> is more what I was suggesting.

If this is what you would like to see there, I am fine with that too.

Thanks,
Michal
Michal Simek Oct. 30, 2013, 2:32 p.m. UTC | #3
On 10/30/2013 03:23 PM, Michal Simek wrote:
> On 10/30/2013 02:07 PM, Russell King - ARM Linux wrote:
>> On Wed, Oct 30, 2013 at 01:46:18PM +0100, Michal Simek wrote:
>>> Russell, Will: We discussed at KS that it would be good
>>> to rephrase this or have different logic around it.
>>> I am not sure whether we can also test if this bit is
>>> implemented by a particular SoC.
>>>
>>> Maybe the logic should be that if the SoC uses this bit,
>>> the message is shown in its original format to declare
>>> that ECC is enabled or disabled.
>>> When the SoC doesn't implement it, do not show this message.
>>
>> This is not quite what I meant - by making the change you have, you also
>> omit to print the data cache policy.
>>
>>> @@ -556,8 +556,9 @@ static void __init build_mem_type_table(void)
>>>  		mem_types[MT_CACHECLEAN].prot_sect |= PMD_SECT_WB;
>>>  		break;
>>>  	}
>>> -	printk("Memory policy: ECC %sabled, Data cache %s\n",
>>> -		ecc_mask ? "en" : "dis", cp->policy);
>>> +	if (ecc_mask)
>>> +		pr_info("Memory policy: ECC enabled, Data cache %s\n",
>>> +			cp->policy);
>>
>> 	pr_info("Memory policy: %sData cache %s\n",
>> 		ecc_mask ? "ECC enabled, " : "", cp->policy);
>>
>> is more what I was suggesting.
> 
> If this is what you would like to see there, I am fine with that too.

btw: passing ecc=on on the command line will cause the "ECC enabled" message
to be printed even on systems which don't implement this bit.
It is just a side effect of both these solutions.
Isn't there an easy way to test whether this bit is implemented, just by setting
it and clearing it?

Thanks,
Michal
Russell King - ARM Linux Oct. 30, 2013, 3:01 p.m. UTC | #4
On Wed, Oct 30, 2013 at 03:32:09PM +0100, Michal Simek wrote:
> btw: passing ecc=on on the command line will cause the "ECC enabled"
> message to be printed even on systems which don't implement this bit.
> It is just a side effect of both these solutions.

It is a hint, nothing more.  There is no way to detect whether it's
implemented or even how it has been implemented.
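
Why the printed state can only be a hint may be easier to see in a minimal userspace sketch of how such an option could be handled. `parse_ecc_option` and `ECC_BIT9` are hypothetical names, not the kernel's; the only detail taken from this thread is that the bit in question is bit 9 of the L1 page table entry.

```c
#include <string.h>

/* Assumption: bit 9 of a first-level descriptor, as in the commit message. */
#define ECC_BIT9	(1u << 9)

/*
 * Hypothetical parser for the value of an "ecc=" option. It only records
 * what was requested; nothing here can tell whether the SoC implements
 * bit 9, so the resulting mask reflects the request, not the hardware.
 */
static unsigned int parse_ecc_option(const char *arg, unsigned int current_mask)
{
	if (strcmp(arg, "on") == 0)
		return ECC_BIT9;
	if (strcmp(arg, "off") == 0)
		return 0;
	return current_mask;	/* unrecognised value: leave the mask unchanged */
}
```

A message derived from this mask therefore says "ECC enabled" whenever "on" was passed, regardless of what the SoC does with the bit.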

> Isn't there an easy way to test whether this bit is implemented, just
> by setting it and clearing it?

So... let's summarise the message that you're giving.

"My SoC doesn't implement this bit other than to provide ECC at the L1
cache, instead implementing a separate ECC scheme for system memory.
Therefore, I want to change it to describe my implementation, because
my customers are complaining that it says ECC is disabled when that
is not the case.  If it can't describe my setup, I want to remove the
whole facility."

That's a very selfish attitude.  Sorry, but it would be wrong of me
to allow your situation to change what we have beyond the proposed
patch.

I've shown you the ARM architecture reference manual where this bit in
the page tables is described, both older and newer versions.  What we're
doing is in the spirit of the descriptions of bit 9 in the L1 page tables.

I don't think there's any sensible short description which would
adequately describe this setting which would satisfy both your situation
and situations on other SoCs.  We could make the kernel print an entire
paragraph on it, something like:

"ECC might be %sabled.  The exact ECC setting depends on how your SoC
is implemented.  Please refer to your SoC's technical reference manual
for a description of bit 9 in the level one page tables for further
information on how to interpret this statement."

but that would be idiotic.

Of course, we could just print nothing, but the purpose of printing this
is so that _we_ as developers looking at the kernel messages know the
status of this bit, particularly when interpreting oops dumps.  Hiding
this information would make some oops dumps harder to diagnose.  So...
this is a matter for user education if your users are complaining about
it.
Michal Simek Oct. 30, 2013, 3:14 p.m. UTC | #5
Hi Russell,

On 10/30/2013 04:01 PM, Russell King - ARM Linux wrote:
> On Wed, Oct 30, 2013 at 03:32:09PM +0100, Michal Simek wrote:
>> btw: passing ecc=on on the command line will cause the "ECC enabled"
>> message to be printed even on systems which don't implement this bit.
>> It is just a side effect of both these solutions.
> 
> It is a hint, nothing more.  There is no way to detect whether it's
> implemented or even how it has been implemented.

ok. That's what I wanted to know.


>> Isn't there an easy way to test whether this bit is implemented, just
>> by setting it and clearing it?
> 
> So... let's summarise the message that you're giving.
> 
> "My SoC doesn't implement this bit other than to provide ECC at the L1
> cache, instead implementing a separate ECC scheme for system memory.
> Therefore, I want to change it to describe my implementation, because
> my customers are complaining that it says ECC is disabled when that
> is not the case.  If it can't describe my setup, I want to remove the
> whole facility."
> 
> That's a very selfish attitude.  Sorry, but it would be wrong of me
> to allow your situation to change what we have beyond the proposed
> patch.

I thought the situation was quite clear here. I am just saying
that there is a way to get the message back, and it is a task for us to
educate our users/customers on how to get ECC to work on Zynq.

> 
> I've shown you the ARM architecture reference manual where this bit in
> the page tables is described, both older and newer versions.  What we're
> doing is in the spirit of the descriptions of bit 9 in the L1 page tables.
> 
> I don't think there's any sensible short description which would
> adequately describe this setting which would satisfy both your situation
> and situations on other SoCs.  We could make the kernel print an entire
> paragraph on it, something like:

It is not my situation, and not even my two use cases.
I just want to point out that if any "user" just uses this without knowing
what it means, we will get that message back.
I am not saying it is good or bad, just that there is a way
to get it back. The purpose of this second email was just to check
that we can't detect that. That's it - nothing more, nothing less.

> 
> "ECC might be %sabled.  The exact ECC setting depends on how your SoC
> is implemented.  Please refer to your SoC's technical reference manual
> for a description of bit 9 in the level one page tables for further
> information on how to interpret this statement."
> 
> but that would be idiotic.

I agree with you, and no one is asking for this.


> Of course, we could just print nothing, but the purpose of printing this
> is so that _we_ as developers looking at the kernel messages know the
> status of this bit, particularly when interpreting oops dumps.  Hiding
> this information would make some oops dumps harder to diagnose.  So...
> this is a matter for user education if your users are complaining about
> it.

I have no problem with that. I just wanted to check that there is no way
we can detect that. Given that, your proposed fix is completely fine with me.

Thanks,
Michal

Patch

diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index b1d17ee..1b88ce3 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -556,8 +556,9 @@  static void __init build_mem_type_table(void)
 		mem_types[MT_CACHECLEAN].prot_sect |= PMD_SECT_WB;
 		break;
 	}
-	printk("Memory policy: ECC %sabled, Data cache %s\n",
-		ecc_mask ? "en" : "dis", cp->policy);
+	if (ecc_mask)
+		pr_info("Memory policy: ECC enabled, Data cache %s\n",
+			cp->policy);

 	for (i = 0; i < ARRAY_SIZE(mem_types); i++) {
 		struct mem_type *t = &mem_types[i];