diff mbox

[2/3] ARM: kexec: copying code to ioremapped area

Message ID 1390389916-8711-3-git-send-email-wangnan0@huawei.com (mailing list archive)
State New, archived
Headers show

Commit Message

Wang Nan Jan. 22, 2014, 11:25 a.m. UTC
ARM's kdump is actually corrupted (at least for omap4460), mainly because of
cache problem: flush_icache_range can't reliably ensure the copied data
correctly goes into RAM. After mmu turned off and jump to the trampoline, kexec
always failed due to random undef instructions.

This patch use ioremap to make sure the destnation of all memcpy() is
uncachable memory, including copying of target kernel and trampoline.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: <stable@vger.kernel.org> # 3.4+
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Geng Hui <hui.geng@huawei.com>
---
 arch/arm/kernel/machine_kexec.c | 18 ++++++++++++++++--
 kernel/kexec.c                  | 40 +++++++++++++++++++++++++++++++++++-----
 2 files changed, 51 insertions(+), 7 deletions(-)

Comments

Wang Nan Jan. 22, 2014, 1:03 p.m. UTC | #1
On 2014/1/22 20:56, Vaibhav Bedia wrote:
> On Wed, Jan 22, 2014 at 6:25 AM, Wang Nan <wangnan0@huawei.com <mailto:wangnan0@huawei.com>> wrote:
> 
>     ARM's kdump is actually corrupted (at least for omap4460), mainly because of
>     cache problem: flush_icache_range can't reliably ensure the copied data
>     correctly goes into RAM. After mmu turned off and jump to the trampoline, kexec
>     always failed due to random undef instructions.
> 
>     This patch use ioremap to make sure the destnation of all memcpy() is
>     uncachable memory, including copying of target kernel and trampoline.
> 
> 
> AFAIK ioremap on RAM in forbidden in ARM and device memory that ioremap()
> ends up creating is not meant for executable code.
> 
> Doesn't this trigger the WARN_ON() in _arm_ioremap_pfn_caller)?

This patch is depend on the previous one:

ARM: Premit ioremap() to map reserved pages

However, Russell is opposed to it.
Russell King - ARM Linux Jan. 22, 2014, 1:27 p.m. UTC | #2
On Wed, Jan 22, 2014 at 07:25:15PM +0800, Wang Nan wrote:
> ARM's kdump is actually corrupted (at least for omap4460), mainly because of
> cache problem: flush_icache_range can't reliably ensure the copied data
> correctly goes into RAM.

Quite right too.  You're mistake here is thinking that flush_icache_range()
should push it to RAM.  That's incorrect.

flush_icache_range() is there to deal with such things as loadable modules
and self modifying code, where the MMU is not being turned off.  Hence, it
only flushes to the point of coherency between the I and D caches, and
any further levels of cache between that point and memory are not touched.
Why should it touch any more levels - it's not the function's purpose.

> After mmu turned off and jump to the trampoline, kexec always failed due
> to random undef instructions.

We already have code in the kernel which deals with shutting the MMU off.
An instance of how this can be done is illustrated in the soft_restart()
code path, and kexec already uses this.

One of the first things soft_restart() does is turn off the outer cache -
which OMAP4 does have, but this can only be done if there is a single CPU
running.  If there's multiple CPUs running, then the outer cache can't be
disabled, and that's the most likely cause of the problem you're seeing.
Wang Nan Jan. 23, 2014, 2:16 a.m. UTC | #3
On 2014/1/22 21:27, Russell King - ARM Linux wrote:
> On Wed, Jan 22, 2014 at 07:25:15PM +0800, Wang Nan wrote:
>> ARM's kdump is actually corrupted (at least for omap4460), mainly because of
>> cache problem: flush_icache_range can't reliably ensure the copied data
>> correctly goes into RAM.
> 
> Quite right too.  You're mistake here is thinking that flush_icache_range()
> should push it to RAM.  That's incorrect.
> 
> flush_icache_range() is there to deal with such things as loadable modules
> and self modifying code, where the MMU is not being turned off.  Hence, it
> only flushes to the point of coherency between the I and D caches, and
> any further levels of cache between that point and memory are not touched.
> Why should it touch any more levels - it's not the function's purpose.
> 
>> After mmu turned off and jump to the trampoline, kexec always failed due
>> to random undef instructions.
> 
> We already have code in the kernel which deals with shutting the MMU off.
> An instance of how this can be done is illustrated in the soft_restart()
> code path, and kexec already uses this.
> 
> One of the first things soft_restart() does is turn off the outer cache -
> which OMAP4 does have, but this can only be done if there is a single CPU
> running.  If there's multiple CPUs running, then the outer cache can't be
> disabled, and that's the most likely cause of the problem you're seeing.
> 

You are right, commit b25f3e1c (OMAP4/highbank: Flush L2 cache before disabling)
solves my problem, it flushes outer cache before disabling. I have tested it in
UP and SMP situations and it works (actually, omap4 has not ready to support kexec
in SMP case, I insert an empty cpu_kill() to make it work), so the first 2
patches are unneeded.

What about the 3rd one (ARM: allow kernel to be loaded in middle of phymem)?
diff mbox

Patch

diff --git a/arch/arm/kernel/machine_kexec.c b/arch/arm/kernel/machine_kexec.c
index f0d180d..ba0a5a8 100644
--- a/arch/arm/kernel/machine_kexec.c
+++ b/arch/arm/kernel/machine_kexec.c
@@ -144,6 +144,7 @@  void machine_kexec(struct kimage *image)
 	unsigned long page_list;
 	unsigned long reboot_code_buffer_phys;
 	unsigned long reboot_entry = (unsigned long)relocate_new_kernel;
+	void __iomem *reboot_entry_remap;
 	unsigned long reboot_entry_phys;
 	void *reboot_code_buffer;
 
@@ -171,9 +172,22 @@  void machine_kexec(struct kimage *image)
 
 
 	/* copy our kernel relocation code to the control code page */
-	reboot_entry = fncpy(reboot_code_buffer,
-			     reboot_entry,
+	reboot_entry_remap = ioremap_nocache(reboot_code_buffer_phys,
+					     relocate_new_kernel_size);
+	if (reboot_entry_remap == NULL) {
+		pr_warn("startup code may not be reliably flushed\n");
+		reboot_entry_remap = (void __iomem *)reboot_code_buffer;
+	}
+
+	reboot_entry = fncpy(reboot_entry_remap, reboot_entry,
 			     relocate_new_kernel_size);
+	reboot_entry = (unsigned long)reboot_code_buffer +
+			(reboot_entry -
+			 (unsigned long)reboot_entry_remap);
+
+	if (reboot_entry_remap != reboot_code_buffer)
+		iounmap(reboot_entry_remap);
+
 	reboot_entry_phys = (unsigned long)reboot_entry +
 		(reboot_code_buffer_phys - (unsigned long)reboot_code_buffer);
 
diff --git a/kernel/kexec.c b/kernel/kexec.c
index 9c97016..3e92999 100644
--- a/kernel/kexec.c
+++ b/kernel/kexec.c
@@ -806,6 +806,7 @@  static int kimage_load_normal_segment(struct kimage *image,
 	while (mbytes) {
 		struct page *page;
 		char *ptr;
+		void __iomem *ioptr;
 		size_t uchunk, mchunk;
 
 		page = kimage_alloc_page(image, GFP_HIGHUSER, maddr);
@@ -818,7 +819,17 @@  static int kimage_load_normal_segment(struct kimage *image,
 		if (result < 0)
 			goto out;
 
-		ptr = kmap(page);
+		/*
+		 * Try ioremap to make sure the copied data goes into RAM
+		 * reliably. If failed (some archs don't allow ioremap RAM),
+		 * use kmap instead.
+		 */
+		ioptr = ioremap(page_to_pfn(page) << PAGE_SHIFT,
+				PAGE_SIZE);
+		if (ioptr != NULL)
+			ptr = ioptr;
+		else
+			ptr = kmap(page);
 		/* Start with a clear page */
 		clear_page(ptr);
 		ptr += maddr & ~PAGE_MASK;
@@ -827,7 +838,10 @@  static int kimage_load_normal_segment(struct kimage *image,
 		uchunk = min(ubytes, mchunk);
 
 		result = copy_from_user(ptr, buf, uchunk);
-		kunmap(page);
+		if (ioptr != NULL)
+			iounmap(ioptr);
+		else
+			kunmap(page);
 		if (result) {
 			result = -EFAULT;
 			goto out;
@@ -846,7 +860,7 @@  static int kimage_load_crash_segment(struct kimage *image,
 {
 	/* For crash dumps kernels we simply copy the data from
 	 * user space to it's destination.
-	 * We do things a page at a time for the sake of kmap.
+	 * We do things a page at a time for the sake of ioremap/kmap.
 	 */
 	unsigned long maddr;
 	size_t ubytes, mbytes;
@@ -861,6 +875,7 @@  static int kimage_load_crash_segment(struct kimage *image,
 	while (mbytes) {
 		struct page *page;
 		char *ptr;
+		void __iomem *ioptr;
 		size_t uchunk, mchunk;
 
 		page = pfn_to_page(maddr >> PAGE_SHIFT);
@@ -868,7 +883,18 @@  static int kimage_load_crash_segment(struct kimage *image,
 			result  = -ENOMEM;
 			goto out;
 		}
-		ptr = kmap(page);
+		/*
+		 * Try ioremap to make sure the copied data goes into RAM
+		 * reliably. If failed (some archs don't allow ioremap RAM),
+		 * use kmap instead.
+		 */
+		ioptr = ioremap_nocache(page_to_pfn(page) << PAGE_SHIFT,
+				        PAGE_SIZE);
+		if (ioptr != NULL)
+			ptr = ioptr;
+		else
+			ptr = kmap(page);
+
 		ptr += maddr & ~PAGE_MASK;
 		mchunk = min_t(size_t, mbytes,
 				PAGE_SIZE - (maddr & ~PAGE_MASK));
@@ -879,7 +905,11 @@  static int kimage_load_crash_segment(struct kimage *image,
 		}
 		result = copy_from_user(ptr, buf, uchunk);
 		kexec_flush_icache_page(page);
-		kunmap(page);
+		if (ioptr != NULL)
+			iounmap(ioptr);
+		else
+			kunmap(page);
+
 		if (result) {
 			result = -EFAULT;
 			goto out;