Message ID | 1360745925-20952-1-git-send-email-michal.simek@xilinx.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Wed, Feb 13, 2013 at 09:58:44AM +0100, Michal Simek wrote: > Arm kernel with HIGHMEM ON will fail when dts memory node is described > by memory banks and the starting address is not inside the first 16MB > of the first memory bank. If HIGHMEM is OFF and the configuration is the same > then the first memory bank is ignored and the kernel boots. > > Here is the example of behavior: > dts memory reg = <0x0 0x10000000 0x10000000 0x30000000> > kernel load address = 0x10000000 (respectively 0x1000800) > > Current: > "Machine: Xilinx Zynq Platform, model: Zynq ZC702 Development Board > bootconsole [earlycon0] enabled > Memory policy: ECC disabled, Data cache writeback > Kernel panic - not syncing: ERROR: Failed to allocate 0x1000 bytes below 0x0." > > After: > The kernel ignore ram 0x0-0x0fffffff because is lower than the kernel starting > address and the kernel bootlog contains. > "Ignoring RAM at 00000000-0fffffff (CONFIG_HIGHMEM)." > > Also using mem=768M on the command line will overwrite dts memory > map and kernel will boot with HIGHMEM ON too. > > Signed-off-by: Michal Simek <michal.simek@xilinx.com> > --- > arch/arm/mm/mmu.c | 9 +++++++++ > 1 files changed, 9 insertions(+), 0 deletions(-) > > diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c > index ce328c7..e60f370 100644 > --- a/arch/arm/mm/mmu.c > +++ b/arch/arm/mm/mmu.c > @@ -937,6 +937,15 @@ void __init sanity_check_meminfo(void) > highmem = 1; > > #ifdef CONFIG_HIGHMEM > + if (__va(bank->start + bank->size - 1) < (void *)PAGE_OFFSET) { > + pr_notice("Ignoring RAM at %.8llx-%.8llx ", > + (unsigned long long)bank->start, > + (unsigned long long)bank->start > + + bank->size - 1); > + pr_cont("(CONFIG_HIGHMEM).\n"); > + continue; > + } NAK. __va() is only valid for memory within the kernel's direct mapping (iow, lowmem). Physical addresses outside of that range are free to return any other value. So, it's entirely possible that a valid highmem setup will have memory in highmem where __va(start + size - 1) _is_ below PAGE_OFFSET, and this memory should _not_ be discarded.
2013/2/13 Russell King - ARM Linux <linux@arm.linux.org.uk>: > On Wed, Feb 13, 2013 at 09:58:44AM +0100, Michal Simek wrote: >> Arm kernel with HIGHMEM ON will fail when dts memory node is described >> by memory banks and the starting address is not inside the first 16MB >> of the first memory bank. If HIGHMEM is OFF and the configuration is the same >> then the first memory bank is ignored and the kernel boots. >> >> Here is the example of behavior: >> dts memory reg = <0x0 0x10000000 0x10000000 0x30000000> >> kernel load address = 0x10000000 (respectively 0x1000800) >> >> Current: >> "Machine: Xilinx Zynq Platform, model: Zynq ZC702 Development Board >> bootconsole [earlycon0] enabled >> Memory policy: ECC disabled, Data cache writeback >> Kernel panic - not syncing: ERROR: Failed to allocate 0x1000 bytes below 0x0." >> >> After: >> The kernel ignore ram 0x0-0x0fffffff because is lower than the kernel starting >> address and the kernel bootlog contains. >> "Ignoring RAM at 00000000-0fffffff (CONFIG_HIGHMEM)." >> >> Also using mem=768M on the command line will overwrite dts memory >> map and kernel will boot with HIGHMEM ON too. >> >> Signed-off-by: Michal Simek <michal.simek@xilinx.com> >> --- >> arch/arm/mm/mmu.c | 9 +++++++++ >> 1 files changed, 9 insertions(+), 0 deletions(-) >> >> diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c >> index ce328c7..e60f370 100644 >> --- a/arch/arm/mm/mmu.c >> +++ b/arch/arm/mm/mmu.c >> @@ -937,6 +937,15 @@ void __init sanity_check_meminfo(void) >> highmem = 1; >> >> #ifdef CONFIG_HIGHMEM >> + if (__va(bank->start + bank->size - 1) < (void *)PAGE_OFFSET) { >> + pr_notice("Ignoring RAM at %.8llx-%.8llx ", >> + (unsigned long long)bank->start, >> + (unsigned long long)bank->start >> + + bank->size - 1); >> + pr_cont("(CONFIG_HIGHMEM).\n"); >> + continue; >> + } > > NAK. __va() is only valid for memory within the kernel's direct mapping > (iow, lowmem). Physical addresses outside of that range are free to > return any other value. > > So, it's entirely possible that a valid highmem setup will have memory > in highmem where __va(start + size - 1) _is_ below PAGE_OFFSET, and this > memory should _not_ be discarded. ok. Can you describe me this configuration? Enough to tell me dts memory fragment and kernel load addr which match this case. My configuration is above and it doesn't work that's why I believe you have tested this case and it will be easy for me to retest it. Maybe we have any problem with zynq that's why I would like to be aware of it. What about "arm: mm: Add support for booting the kernel out of the first 16MB of memory"? Thanks, Michal
On Wed, Feb 13, 2013 at 03:39:21PM +0100, Michal Simek wrote: > ok. Can you describe me this configuration? Enough to tell me dts memory > fragment and kernel load addr which match this case. Anything which uses more than 1GB of memory and has PAGE_OFFSET set at 0xc0000000 (3GB). Simple maths will tell you that, and why it fails. Look: - if __va(bank->start) = PAGE_OFFSET, and PAGE_OFFSET is at 3GB. - if bank->size = 1GB (which we _do_ have), then __va(bank->start + bank->size) = 4GB. 4GB represented as a 32-bit pointer is NULL. NULL < (void *)PAGE_OFFSET. Therefore, your patch will cause systems with 1GB or more of memory in one bank to ignore _all_ the memory passed in. And if you look at the code: if (__va(bank->start) >= vmalloc_min || __va(bank->start) < (void *)PAGE_OFFSET) highmem = 1; notice the facy that we're marking all memory starting with an apparant virtual address outside of PAGE_OFFSET...vmalloc_min as highmem. This is to catch cases exactly like this. Memory for which __va(bank->start) < (void *)PAGE_OFFSET will also have __va(bank->start + bank->size - 1) (in your patch) also below PAGE_OFFSET, and your modification will cause the kernel to ignore this memory - which is not acceptable. I don't think there's much option for solutions to this; not with a common kernel designed to run on multiple platforms. If a platform doesn't conform to the Linux requirements for a common kernel, then it doesn't conform and it can't use it. In much the same way that we ended up saying "no" to people who wanted to place two physical banks of memory in reverse order in the virtual mapping, I think this is another case of "no, we can't permit this with common cross-subarch kernels".
2013/2/13 Russell King - ARM Linux <linux@arm.linux.org.uk>: > On Wed, Feb 13, 2013 at 03:39:21PM +0100, Michal Simek wrote: >> ok. Can you describe me this configuration? Enough to tell me dts memory >> fragment and kernel load addr which match this case. > > Anything which uses more than 1GB of memory and has PAGE_OFFSET set at > 0xc0000000 (3GB). Simple maths will tell you that, and why it fails. > > Look: > - if __va(bank->start) = PAGE_OFFSET, and PAGE_OFFSET is at 3GB. > - if bank->size = 1GB (which we _do_ have), then > __va(bank->start + bank->size) = 4GB. 4GB represented as a 32-bit > pointer is NULL. NULL < (void *)PAGE_OFFSET. > > Therefore, your patch will cause systems with 1GB or more of memory in > one bank to ignore _all_ the memory passed in. > > And if you look at the code: > > if (__va(bank->start) >= vmalloc_min || > __va(bank->start) < (void *)PAGE_OFFSET) > highmem = 1; > > notice the facy that we're marking all memory starting with an apparant > virtual address outside of PAGE_OFFSET...vmalloc_min as highmem. This is > to catch cases exactly like this. > > Memory for which __va(bank->start) < (void *)PAGE_OFFSET will also have > __va(bank->start + bank->size - 1) (in your patch) also below PAGE_OFFSET, > and your modification will cause the kernel to ignore this memory - > which is not acceptable. ok - that's fair. > I don't think there's much option for solutions to this; not with a common > kernel designed to run on multiple platforms. If a platform doesn't > conform to the Linux requirements for a common kernel, then it doesn't > conform and it can't use it. I have no problem to admit that this patch is wrong in this implementation but I tend to disagree with this part. The Linux kernel has some requirements, limitations, etc, that's all truth. Placing the kernel to the main memory to any location you want which end up with kernel panic is from my point of view fault. Saying that the kernel is always loaded withing the 16MB of memory is limitation but it doesn't mean that we shouldn't remove it. Ignoring the memory before the kernel can be one solution which I would like to discuss. What about the second patch used for !HIGHMEM case? > In much the same way that we ended up saying "no" to people who wanted > to place two physical banks of memory in reverse order in the virtual > mapping, I think this is another case of "no, we can't permit this with > common cross-subarch kernels". This case should probably end up on device-tree rule (+dtc fault). Not sure if memory banks should be ordered or not. Rob and Grant any thought? Thanks, Michal
On Wed, Feb 13, 2013 at 04:52:50PM +0100, Michal Simek wrote: > This case should probably end up on device-tree rule (+dtc fault). > Not sure if memory banks should be ordered or not. Rob and Grant any thought? They _are_ ordered by memblock, because memblock always arranges the banks in ascending physical address order - and it will merge contiguous banks together irrespective of what order they appear as. So you can't do something like bank 0: 0x60000000-0x70000000, bank 1: 0x50000000-0x60000000. You'll end up with a single memblock entry for 0x50000000-0x70000000.
On Wed, 13 Feb 2013, Michal Simek wrote: > The Linux kernel has some requirements, limitations, etc, that's all truth. > > Placing the kernel to the main memory to any location you want which end > up with kernel panic is from my point of view fault. I may agree with that, up to a point. > Saying that the kernel is always loaded withing the 16MB of memory is limitation > but it doesn't mean that we shouldn't remove it. The actual limitation should be 128MB for zImage. It will relocate the final kernel binary within the first 16MB by itself. > Ignoring the memory before the kernel can be one solution which I > would like to discuss. > What about the second patch used for !HIGHMEM case? Highmem is irrelevant. If you want to ignore memory that way, you must look at the _physical_ addresses for each bank, and simply truncate anything that is below PHYS_OFFSET. The comparison should be trivial with physical addresses as they're not limited to 32 bits. Nicolas
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c index ce328c7..e60f370 100644 --- a/arch/arm/mm/mmu.c +++ b/arch/arm/mm/mmu.c @@ -937,6 +937,15 @@ void __init sanity_check_meminfo(void) highmem = 1; #ifdef CONFIG_HIGHMEM + if (__va(bank->start + bank->size - 1) < (void *)PAGE_OFFSET) { + pr_notice("Ignoring RAM at %.8llx-%.8llx ", + (unsigned long long)bank->start, + (unsigned long long)bank->start + + bank->size - 1); + pr_cont("(CONFIG_HIGHMEM).\n"); + continue; + } + if (__va(bank->start) >= vmalloc_min || __va(bank->start) < (void *)PAGE_OFFSET) highmem = 1;
Arm kernel with HIGHMEM ON will fail when dts memory node is described by memory banks and the starting address is not inside the first 16MB of the first memory bank. If HIGHMEM is OFF and the configuration is the same then the first memory bank is ignored and the kernel boots. Here is the example of behavior: dts memory reg = <0x0 0x10000000 0x10000000 0x30000000> kernel load address = 0x10000000 (respectively 0x1000800) Current: "Machine: Xilinx Zynq Platform, model: Zynq ZC702 Development Board bootconsole [earlycon0] enabled Memory policy: ECC disabled, Data cache writeback Kernel panic - not syncing: ERROR: Failed to allocate 0x1000 bytes below 0x0." After: The kernel ignore ram 0x0-0x0fffffff because is lower than the kernel starting address and the kernel bootlog contains. "Ignoring RAM at 00000000-0fffffff (CONFIG_HIGHMEM)." Also using mem=768M on the command line will overwrite dts memory map and kernel will boot with HIGHMEM ON too. Signed-off-by: Michal Simek <michal.simek@xilinx.com> --- arch/arm/mm/mmu.c | 9 +++++++++ 1 files changed, 9 insertions(+), 0 deletions(-)