diff mbox

[GIT,PULL] parisc huge page support for v4.4

Message ID 20151124170020.GA13198@p100.box (mailing list archive)
State Superseded
Headers show

Commit Message

Helge Deller Nov. 24, 2015, 5 p.m. UTC
* Mikulas Patocka <mpatocka@redhat.com>:
> 
> 
> On Tue, 24 Nov 2015, Helge Deller wrote:
> 
> > > Hi
> > > 
> > > Since the kernel 4.4-rc2 I'm getting frequent boot failures on PA-RISC. 
> > > When I revert this patchset, the crashes are gone.
> > 
> > > [    3.296666] CPU(s): 4 out of 4 PA8900 (Shortfin) at 1000.000000 MHz online
> > 
> > Hi Mikulas,
> > 
> > Yes, I've seen this as well.
> > It affects only the PA8900 CPUs, while all PA8500-PA8700 machines seem to work fine.
> > I do have a temporary 3-line patch to avoid the crashes which I'll push to my tree shortly.
> > I'm still investigating why it only affects the PA8900 CPUs, but I assume
> > it's related to the cache aliasing of those CPUs.
> > I'll keep you updated.
> > 
> > Helge
> 
> The PA-RISC specification doesn't allow aliasing on non-equaivalent 
> addresses. Can the kernel map a piece of kernel data to other virtual 
> address? If yes, we can't use big pages to map kernel data.

Can you please try the two patches below?
The first one disables mapping kernel text/data on huge pages on
PA8800/PA8900 CPUs. Patch works for me on my Mako PA8800.

Independend of my huge page patch the second patch disables the tlb
flush optimization we added earlier. It seems calling flush_tlb_all()
doesn't reliably flushes tlbs on all CPUs so it's better to fall back to
the loop implementation.

Helge


--
To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Mikulas Patocka Nov. 24, 2015, 6:43 p.m. UTC | #1
On Tue, 24 Nov 2015, Helge Deller wrote:

> * Mikulas Patocka <mpatocka@redhat.com>:
> > 
> > 
> > On Tue, 24 Nov 2015, Helge Deller wrote:
> > 
> > > > Hi
> > > > 
> > > > Since the kernel 4.4-rc2 I'm getting frequent boot failures on PA-RISC. 
> > > > When I revert this patchset, the crashes are gone.
> > > 
> > > > [    3.296666] CPU(s): 4 out of 4 PA8900 (Shortfin) at 1000.000000 MHz online
> > > 
> > > Hi Mikulas,
> > > 
> > > Yes, I've seen this as well.
> > > It affects only the PA8900 CPUs, while all PA8500-PA8700 machines seem to work fine.
> > > I do have a temporary 3-line patch to avoid the crashes which I'll push to my tree shortly.
> > > I'm still investigating why it only affects the PA8900 CPUs, but I assume
> > > it's related to the cache aliasing of those CPUs.
> > > I'll keep you updated.
> > > 
> > > Helge
> > 
> > The PA-RISC specification doesn't allow aliasing on non-equaivalent 
> > addresses. Can the kernel map a piece of kernel data to other virtual 
> > address? If yes, we can't use big pages to map kernel data.
> 
> Can you please try the two patches below?
> The first one disables mapping kernel text/data on huge pages on
> PA8800/PA8900 CPUs. Patch works for me on my Mako PA8800.
> 
> Independend of my huge page patch the second patch disables the tlb
> flush optimization we added earlier. It seems calling flush_tlb_all()
> doesn't reliably flushes tlbs on all CPUs so it's better to fall back to
> the loop implementation.
> 
> Helge

The kernel with these patches works fine so far.

Mikulas
--
To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mikulas Patocka Dec. 26, 2015, 12:09 p.m. UTC | #2
On Tue, 24 Nov 2015, Mikulas Patocka wrote:

> 
> 
> On Tue, 24 Nov 2015, Helge Deller wrote:
> 
> > * Mikulas Patocka <mpatocka@redhat.com>:
> > > 
> > > 
> > > On Tue, 24 Nov 2015, Helge Deller wrote:
> > > 
> > > > > Hi
> > > > > 
> > > > > Since the kernel 4.4-rc2 I'm getting frequent boot failures on PA-RISC. 
> > > > > When I revert this patchset, the crashes are gone.
> > > > 
> > > > > [    3.296666] CPU(s): 4 out of 4 PA8900 (Shortfin) at 1000.000000 MHz online
> > > > 
> > > > Hi Mikulas,
> > > > 
> > > > Yes, I've seen this as well.
> > > > It affects only the PA8900 CPUs, while all PA8500-PA8700 machines seem to work fine.
> > > > I do have a temporary 3-line patch to avoid the crashes which I'll push to my tree shortly.
> > > > I'm still investigating why it only affects the PA8900 CPUs, but I assume
> > > > it's related to the cache aliasing of those CPUs.
> > > > I'll keep you updated.
> > > > 
> > > > Helge
> > > 
> > > The PA-RISC specification doesn't allow aliasing on non-equaivalent 
> > > addresses. Can the kernel map a piece of kernel data to other virtual 
> > > address? If yes, we can't use big pages to map kernel data.
> > 
> > Can you please try the two patches below?
> > The first one disables mapping kernel text/data on huge pages on
> > PA8800/PA8900 CPUs. Patch works for me on my Mako PA8800.
> > 
> > Independend of my huge page patch the second patch disables the tlb
> > flush optimization we added earlier. It seems calling flush_tlb_all()
> > doesn't reliably flushes tlbs on all CPUs so it's better to fall back to
> > the loop implementation.
> > 
> > Helge
> 
> The kernel with these patches works fine so far.
> 
> Mikulas

BTW. I looked at this in arch/parisc/mm/hugetlbpage.c:set_huge_pte_at 
"*ptep = entry;" and it seems like a bad bug. PA-RISC doesn't have atomic 
instructions to modify page table entries, so it takes spinlock in the TLB 
handler and modifies the page table entry non-atomically. If you modify 
the page table entry without the spinlock, you may race with TLB handler 
on another CPU and your modification may be lost.

The comment says something about double locking on pa_tlb_lock, but 
pa_tlb_lock isn't held when that function is called.

Mikulas
--
To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Helge Deller Dec. 26, 2015, 12:31 p.m. UTC | #3
On 26.12.2015 13:09, Mikulas Patocka wrote:
>> On Tue, 24 Nov 2015, Helge Deller wrote:
>>> * Mikulas Patocka <mpatocka@redhat.com>:
>>>> On Tue, 24 Nov 2015, Helge Deller wrote:
>>>>>> Hi
>>>>>>
>>>>>> Since the kernel 4.4-rc2 I'm getting frequent boot failures on PA-RISC. 
>>>>>> When I revert this patchset, the crashes are gone.
>>>>>
>>>>>> [    3.296666] CPU(s): 4 out of 4 PA8900 (Shortfin) at 1000.000000 MHz online
>>>>>
>>>>> Hi Mikulas,
>>>>>
>>>>> Yes, I've seen this as well.
>>>>> It affects only the PA8900 CPUs, while all PA8500-PA8700 machines seem to work fine.
>>>>> I do have a temporary 3-line patch to avoid the crashes which I'll push to my tree shortly.
>>>>> I'm still investigating why it only affects the PA8900 CPUs, but I assume
>>>>> it's related to the cache aliasing of those CPUs.
>>>>> I'll keep you updated.
>>>>>
>>>>> Helge
>>>>
>>>> The PA-RISC specification doesn't allow aliasing on non-equaivalent 
>>>> addresses. Can the kernel map a piece of kernel data to other virtual 
>>>> address? If yes, we can't use big pages to map kernel data.
>>>
>>> Can you please try the two patches below?
>>> The first one disables mapping kernel text/data on huge pages on
>>> PA8800/PA8900 CPUs. Patch works for me on my Mako PA8800.
>>>
>>> Independend of my huge page patch the second patch disables the tlb
>>> flush optimization we added earlier. It seems calling flush_tlb_all()
>>> doesn't reliably flushes tlbs on all CPUs so it's better to fall back to
>>> the loop implementation.
>>>
>>> Helge
>>
>> The kernel with these patches works fine so far.
>>
>> Mikulas
> 
> BTW. I looked at this in arch/parisc/mm/hugetlbpage.c:set_huge_pte_at 
> "*ptep = entry;" and it seems like a bad bug. PA-RISC doesn't have atomic 
> instructions to modify page table entries, so it takes spinlock in the TLB 
> handler and modifies the page table entry non-atomically. If you modify 
> the page table entry without the spinlock, you may race with TLB handler 
> on another CPU and your modification may be lost.

Right.

> The comment says something about double locking on pa_tlb_lock, but 
> pa_tlb_lock isn't held when that function is called.

I have a work-in-progress patch for that in one of my trees, e.g.:
http://git.kernel.org/cgit/linux/kernel/git/deller/parisc-linux.git/commit/?h=parisc-next&id=5c76b525cbdb097401f46522b27b1eb6244f34f9
It's lightly tested though.

Helge

--
To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mikulas Patocka Jan. 4, 2016, 9:24 p.m. UTC | #4
On Sat, 26 Dec 2015, Helge Deller wrote:

> On 26.12.2015 13:09, Mikulas Patocka wrote:
> 
> > BTW. I looked at this in arch/parisc/mm/hugetlbpage.c:set_huge_pte_at 
> > "*ptep = entry;" and it seems like a bad bug. PA-RISC doesn't have atomic 
> > instructions to modify page table entries, so it takes spinlock in the TLB 
> > handler and modifies the page table entry non-atomically. If you modify 
> > the page table entry without the spinlock, you may race with TLB handler 
> > on another CPU and your modification may be lost.
> 
> Right.
> 
> > The comment says something about double locking on pa_tlb_lock, but 
> > pa_tlb_lock isn't held when that function is called.
> 
> I have a work-in-progress patch for that in one of my trees, e.g.:
> http://git.kernel.org/cgit/linux/kernel/git/deller/parisc-linux.git/commit/?h=parisc-next&id=5c76b525cbdb097401f46522b27b1eb6244f34f9
> It's lightly tested though.
> 
> Helge

I tested the patch and it works OK for me so far.

BTW. what happens if some kernel code takes the TLB spinlock and then TLB 
miss in kernel space happens? (it would attempt to lock the spinlock 
recursively) Is it assumed that the TLB is big enough that this can't 
happen?

Mikulas
--
To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
John David Anglin Jan. 4, 2016, 9:48 p.m. UTC | #5
On 2016-01-04 4:24 PM, Mikulas Patocka wrote:
>
> On Sat, 26 Dec 2015, Helge Deller wrote:
>
>> On 26.12.2015 13:09, Mikulas Patocka wrote:
>>
>>> BTW. I looked at this in arch/parisc/mm/hugetlbpage.c:set_huge_pte_at
>>> "*ptep = entry;" and it seems like a bad bug. PA-RISC doesn't have atomic
>>> instructions to modify page table entries, so it takes spinlock in the TLB
>>> handler and modifies the page table entry non-atomically. If you modify
>>> the page table entry without the spinlock, you may race with TLB handler
>>> on another CPU and your modification may be lost.
>> Right.
>>
>>> The comment says something about double locking on pa_tlb_lock, but
>>> pa_tlb_lock isn't held when that function is called.
>> I have a work-in-progress patch for that in one of my trees, e.g.:
>> http://git.kernel.org/cgit/linux/kernel/git/deller/parisc-linux.git/commit/?h=parisc-next&id=5c76b525cbdb097401f46522b27b1eb6244f34f9
>> It's lightly tested though.
>>
>> Helge
> I tested the patch and it works OK for me so far.
>
> BTW. what happens if some kernel code takes the TLB spinlock and then TLB
> miss in kernel space happens? (it would attempt to lock the spinlock
> recursively) Is it assumed that the TLB is big enough that this can't
> happen?
No.  If you look at the TLB handler, you will see that locking is not 
done for misses in
kernel space.  So, this deadlock doesn't occur.

Dave
diff mbox

Patch

diff --git a/arch/parisc/mm/init.c b/arch/parisc/mm/init.c
index 1b366c4..958b7f36 100644
--- a/arch/parisc/mm/init.c
+++ b/arch/parisc/mm/init.c
@@ -475,19 +475,22 @@  static void __init map_pages(unsigned long start_vaddr,
 					pte =  __mk_pte(address, pgprot);
 				else if (parisc_text_address(vaddr)) {
 					pte = __mk_pte(address, PAGE_KERNEL_EXEC);
-					if (address >= ro_start && address < kernel_end)
+					if (address >= ro_start && address < kernel_end
+					    && !parisc_requires_coherency())
 						pte = pte_mkhuge(pte);
 				}
 				else
 #if defined(CONFIG_PARISC_PAGE_SIZE_4KB)
 				if (address >= ro_start && address < ro_end) {
 					pte = __mk_pte(address, PAGE_KERNEL_EXEC);
-					pte = pte_mkhuge(pte);
+					if (!parisc_requires_coherency())
+						pte = pte_mkhuge(pte);
 				} else
 #endif
 				{
 					pte = __mk_pte(address, pgprot);
-					if (address >= ro_start && address < kernel_end)
+					if (address >= ro_start && address < kernel_end
+					    && !parisc_requires_coherency())
 						pte = pte_mkhuge(pte);
 				}
 
diff --git a/arch/parisc/kernel/cache.c b/arch/parisc/kernel/cache.c
index cda6dbb..80ced95 100644
--- a/arch/parisc/kernel/cache.c
+++ b/arch/parisc/kernel/cache.c
@@ -445,7 +445,7 @@  int __flush_tlb_range(unsigned long sid, unsigned long start,
 	unsigned long flags, size;
 
 	size = (end - start);
-	if (size >= parisc_tlb_flush_threshold) {
+	if (0 && size >= parisc_tlb_flush_threshold) {
 		flush_tlb_all();
 		return 1;
 	}