Message ID | 1564591443.3319.30.camel@HansenPartnership.com (mailing list archive) |
---|---|
State | Superseded |
Series | Compressed kernels currently won't boot |
Hi,

On Wed, Jul 31, 2019 at 09:44:03AM -0700, James Bottomley wrote:
> I noticed this trying to test out compressed kernel booting. The
> problem is that a compressed kernel is divided into two pieces, one of
> which starts at 0x000e0000 and is the bootstrap code which is
> uncompressed into 0x00100000 and the rest of which is the real
> compressed kernel which is loaded above the end of the current
> decompressed size of the entire kernel. palo decompresses the head and
> jumps to it and it then decompresses the rest of the kernel into place.
> This means that the first part of the compressed image can't be larger
> than 0x20000 == 131072 because otherwise it will be loaded into an area
> that decompression will alter.
>
> The problem is that a change was introduced by
>
> commit 34c201ae49fe9e0bf3b389da5869d810f201c740
> Author: Helge Deller <deller@gmx.de>
> Date: Mon Oct 15 22:14:01 2018 +0200

Hmm. This is what I've been facing as well. After reading this commit I'm not sure that the patch I've just sent ("parisc: strip debug information when building compressed images") is really wanted. However, it is really a pain to always copy huge lifimages around when booting parisc machines via LAN. Does anyone really extract the vmlinux file from a compressed kernel image? Should we keep that?

Regards
Sven
On Wed, 2019-07-31 at 19:30 +0200, Sven Schnelle wrote:
> Hi,
>
> On Wed, Jul 31, 2019 at 09:44:03AM -0700, James Bottomley wrote:
> > I noticed this trying to test out compressed kernel booting. The
> > problem is that a compressed kernel is divided into two pieces, one
> > of which starts at 0x000e0000 and is the bootstrap code which is
> > uncompressed into 0x00100000 and the rest of which is the real
> > compressed kernel which is loaded above the end of the current
> > decompressed size of the entire kernel. palo decompresses the head
> > and jumps to it and it then decompresses the rest of the kernel
> > into place. This means that the first part of the compressed image
> > can't be larger than 0x20000 == 131072 because otherwise it will be
> > loaded into an area that decompression will alter.
> >
> > The problem is that a change was introduced by
> >
> > commit 34c201ae49fe9e0bf3b389da5869d810f201c740
> > Author: Helge Deller <deller@gmx.de>
> > Date: Mon Oct 15 22:14:01 2018 +0200
>
> Hmm. This is what i've been facing as well.

Yes, except you're a more extreme case than me ... you actually have the compressed segment overlapping the end of the decompressed text. That does seem to mean we have a lot of no-load debug information which isn't useful to the compressed image.

> After reading this commit i'm not sure that the patch i've just sent
> ("parisc: strip debug information when building compressed images")
> is really wanted. However, it is really a pain to always copy huge
> lifimages around when booting parisc machines via LAN. Does someone
> really extract the vmlinux file from a compressed kernel images?
> Should we keep that?

Well, it's a thing. There's a script in the kernel source tree

scripts/extract-vmlinux

that does it. It doesn't seem to be packaged by Debian, though.

James
On Wed, 2019-07-31 at 10:50 -0700, James Bottomley wrote: > On Wed, 2019-07-31 at 19:30 +0200, Sven Schnelle wrote: > > Hi, > > > > On Wed, Jul 31, 2019 at 09:44:03AM -0700, James Bottomley wrote: > > > I noticed this trying to test out compressed kernel booting. The > > > problem is that a compressed kernel is divided into two pieces, > > > one > > > of which starts at 0x000e0000 and is the bootstrap code which is > > > uncompressed into 0x00100000 and the rest of which is the real > > > compressed kernel which is loaded above the end of the current > > > decompressed size of the entire kernel. palo decompresses the > > > head > > > and jumps to it and it then decompresses the rest of the kernel > > > into place. This means that the first part of the compressed > > > image > > > can't be larger than 0x20000 == 131072 because otherwise it will > > > be > > > loaded into an area that decompression will alter. > > > > > > The problem is that a change was introduced by > > > > > > commit 34c201ae49fe9e0bf3b389da5869d810f201c740 > > > Author: Helge Deller <deller@gmx.de> > > > Date: Mon Oct 15 22:14:01 2018 +0200 > > > > Hmm. This is what i've been facing as well. > > Yes, except you're a more extreme case than me ... you actually have > the compressed segment overlapping the end of the decompressed text. > that does seem to mean we have a lot of no-load debug information > which > isn't useful to the compressed image. > > > After reading this commit i'm not sure that the patch i've just > > sent ("parisc: strip debug information when building compressed > > images") is really wanted. However, it is really a pain to always > > copy huge lifimages around when booting parisc machines via LAN. > > Does someone really extract the vmlinux file from a compressed > > kernel images? Should we keep that? > > Well, it's a thing. There's a script in the kernel source tree > > scripts/extract-vmlinux > > that does it. It doesn't seem to be packaged by debian, though. 
What about making the compressed build produce both a stripped and a non-stripped bzImage (say sbzImage and bzImage)? That way you always have the stripped one available for size-constrained uses like booting from tape or DVD, but in the usual case we use the bzImage with full contents.

James
Hi James,

On Wed, Jul 31, 2019 at 12:40:12PM -0700, James Bottomley wrote:
> What about causing the compressed make to build both a stripped and a
> non-stripped bzImage (say sbzImage and bzImage). That way you always
> have the stripped one available for small size things like boot from
> tape or DVD? but in the usual case we use the bzImage with full
> contents.

In that case we would also need to build two lifimages - how about adding a config option? Something like "Strip debug information from compressed kernel images"?

Regards
Sven
On 31.07.19 21:44, Sven Schnelle wrote:
> Hi James,
>
> On Wed, Jul 31, 2019 at 12:40:12PM -0700, James Bottomley wrote:
>
>> What about causing the compressed make to build both a stripped and a
>> non-stripped bzImage (say sbzImage and bzImage). That way you always
>> have the stripped one available for small size things like boot from
>> tape or DVD? but in the usual case we use the bzImage with full
>> contents.
>
> In that case we would also need to build two lifimages - how about adding
> a config option option? Something like "Strip debug information from compressed
> kernel images"?

I agree, two lifimages don't make sense. Only one vmlinuz gets installed. Instead of the config option, I think my latest patch is better.

Helge
On Wed, 2019-07-31 at 21:46 +0200, Helge Deller wrote: > On 31.07.19 21:44, Sven Schnelle wrote: > > Hi James, > > > > On Wed, Jul 31, 2019 at 12:40:12PM -0700, James Bottomley wrote: > > > > > What about causing the compressed make to build both a stripped > > > and a non-stripped bzImage (say sbzImage and bzImage). That way > > > you always have the stripped one available for small size things > > > like boot from tape or DVD? but in the usual case we use the > > > bzImage with full contents. > > > > In that case we would also need to build two lifimages - how about > > adding a config option option? Something like "Strip debug > > information from compressed kernel images"? > > I agree, two lifimages don't make sense. Only one vmlinuz gets > installed. Instead of the config option, I tink my latest patch is > better. It doesn't solve the problem that if a stripped compressed image is > 128kb then it overwrites the decompress area starting at 0x00100000 so we can't decompress the end because we've already overwritten it before the decompressor gets to it. What we could possibly do is be clever and align the .rodata.compressed so its last text byte ends where the uncompressed kernel text would end. We could be even more clever and split .rodata.compressed into a load and a noload part so we would only load the part of the compressed kernel we need. Then the lifimage creation scripts could discard the noload part containing the debug symbols. James
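The 128 KiB figure James uses follows from the two addresses quoted in the thread: the first segment is loaded at 0x000e0000 and the kernel is decompressed to 0x00100000. A minimal Python sketch of that arithmetic is below. The constant and function names are made up for illustration and are not kernel or palo identifiers, and the sketch only restates James's description of the layout, which is contested later in the thread.

```python
# Illustrative only: names are hypothetical; addresses are the ones quoted in the thread.
BOOTSTRAP_LOAD_ADDR = 0x000E0000  # where the first LOAD segment is placed
DECOMPRESS_TARGET = 0x00100000    # start of the decompress area per the thread


def first_segment_headroom() -> int:
    """Bytes available to the first segment before it reaches the decompress area."""
    return DECOMPRESS_TARGET - BOOTSTRAP_LOAD_ADDR


def first_segment_fits(size: int) -> bool:
    """True if a first segment of `size` bytes ends at or below the decompress area."""
    return BOOTSTRAP_LOAD_ADDR + size <= DECOMPRESS_TARGET


if __name__ == "__main__":
    print(hex(first_segment_headroom()))
    print(first_segment_fits(128 * 1024))
```

Under this reading, any first segment larger than the headroom would extend into the region that decompression later writes to.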
On 31.07.19 18:44, James Bottomley wrote:
> I noticed this trying to test out compressed kernel booting. The
> problem is that a compressed kernel is divided into two pieces, one of
> which starts at 0x000e0000 and is the bootstrap code which is
> uncompressed into 0x00100000 and the rest of which is the real
> compressed kernel which is loaded above the end of the current
> decompressed size of the entire kernel. palo decompresses the head and
> jumps to it and it then decompresses the rest of the kernel into place.
> This means that the first part of the compressed image can't be larger
> than 0x20000 == 131072 because otherwise it will be loaded into an area
> that decompression will alter.
>
> The problem is that a change was introduced by
>
> commit 34c201ae49fe9e0bf3b389da5869d810f201c740
> Author: Helge Deller <deller@gmx.de>
> Date: Mon Oct 15 22:14:01 2018 +0200
>
> parisc: Include compressed vmlinux file in vmlinuz boot kernel
>
>
> Which moved the compressed vmlinux from the second segment to the
> first, which is what makes it too big for me. This patch reverting
> that piece allows me to boot again.

There are at least two requirements:

1. Make sure not to use too much memory on "old" machines. Otherwise you won't be able to boot a compressed kernel on e.g. a 16MB machine. If you move the compressed data behind where the kernel would self-extract itself, you double the amount of memory required. I think with the patch below I won't be able to boot my 715/64 any longer.
2. Old palo versions had a bug which prevented the ELF loader from loading sections above 16MB. So, one needs to keep everything thin in the low memory without extracting over oneself.
3. There might have been other reasons too, but currently I don't remember :-)

I believe the patch I sent for arch/parisc/boot/compressed/vmlinux.lds.S:

+       /* bootloader code and data starts at least behind area of extracted kernel */
+       . = MAX(ABSOLUTE(.), (SZ_end - SZparisc_kernel_start + KERNEL_BINARY_TEXT_START));

keeps everything bootable (on low-memory machines and with the palo ELF bootloader bug).

Helge

>
> diff --git a/arch/parisc/boot/compressed/vmlinux.lds.S b/arch/parisc/boot/compressed/vmlinux.lds.S
> index bfd7872739a3..5841aa373c03 100644
> --- a/arch/parisc/boot/compressed/vmlinux.lds.S
> +++ b/arch/parisc/boot/compressed/vmlinux.lds.S
> @@ -42,12 +42,6 @@ SECTIONS
>  #endif
>         _startcode_end = .;
>
> -       /* vmlinux.bin.gz is here */
> -       . = ALIGN(8);
> -       .rodata.compressed : {
> -               *(.rodata.compressed)
> -       }
> -
>         /* bootloader code and data starts behind area of extracted kernel */
>         . = (SZ_end - SZparisc_kernel_start + KERNEL_BINARY_TEXT_START);
>
> @@ -73,6 +67,12 @@ SECTIONS
>                 *(.rodata.*)
>                 _erodata = . ;
>         }
> +       /* vmlinux.bin.gz is here */
> +       . = ALIGN(8);
> +       .rodata.compressed : {
> +               *(.rodata.compressed)
> +       }
> +
>         . = ALIGN(8);
>         .bss : {
>                 _bss = . ;
>
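For readers less familiar with linker scripts, the MAX() expression in Helge's linker-script change can be mirrored in a few lines of Python. This is only a sketch of the arithmetic as I read it: the function name and the sample values are hypothetical, and only the symbol names in the docstring come from the email.

```python
def bootloader_start(current: int, sz_end: int, sz_kernel_start: int,
                     text_start: int) -> int:
    """Mirror of:
    . = MAX(ABSOLUTE(.), SZ_end - SZparisc_kernel_start + KERNEL_BINARY_TEXT_START)

    `current` stands for the location counter after .rodata.compressed.
    """
    # Address just past the area the extracted kernel will occupy.
    end_of_extracted_kernel = sz_end - sz_kernel_start + text_start
    # Never move the location counter backwards: take whichever is higher.
    return max(current, end_of_extracted_kernel)


if __name__ == "__main__":
    # Hypothetical sizes, chosen only to exercise both branches of MAX().
    print(hex(bootloader_start(0x0058E760, 0x01100000, 0x00100000, 0x00100000)))
    print(hex(bootloader_start(0x02000000, 0x01100000, 0x00100000, 0x00100000)))
```

The first call models a compressed payload that ends below the extracted kernel, so the bootloader code starts right behind the extracted kernel. The second models a payload that is larger than the extracted kernel, where the plain assignment from the older script would have tried to move the location counter backwards.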
On 31.07.19 21:56, James Bottomley wrote: > On Wed, 2019-07-31 at 21:46 +0200, Helge Deller wrote: >> On 31.07.19 21:44, Sven Schnelle wrote: >>> Hi James, >>> >>> On Wed, Jul 31, 2019 at 12:40:12PM -0700, James Bottomley wrote: >>> >>>> What about causing the compressed make to build both a stripped >>>> and a non-stripped bzImage (say sbzImage and bzImage). That way >>>> you always have the stripped one available for small size things >>>> like boot from tape or DVD? but in the usual case we use the >>>> bzImage with full contents. >>> >>> In that case we would also need to build two lifimages - how about >>> adding a config option option? Something like "Strip debug >>> information from compressed kernel images"? >> >> I agree, two lifimages don't make sense. Only one vmlinuz gets >> installed. Instead of the config option, I tink my latest patch is >> better. > > It doesn't solve the problem that if a stripped compressed image is > > 128kb then it overwrites the decompress area starting at 0x00100000 so > we can't decompress the end because we've already overwritten it before > the decompressor gets to it. I don't get this point. hppa64-linux-gnu-objdump -h vmlinuz shows: Sections: Idx Name Size VMA LMA File off Algn 0 .head.text 00000084 00000000000e0000 00000000000e0000 00001000 2**2 CONTENTS, ALLOC, LOAD, READONLY, CODE 1 .opd 00000340 00000000000e0090 00000000000e0090 00001090 2**3 CONTENTS, ALLOC, LOAD, DATA 2 .dlt 00000160 00000000000e03d0 00000000000e03d0 000013d0 2**3 CONTENTS, ALLOC, LOAD, DATA 3 .rodata.compressed 01f3c2b0 00000000000e0530 00000000000e0530 00001530 2**0 CONTENTS, ALLOC, LOAD, DATA 4 .text 00005cc0 000000000201d000 000000000201d000 01f3e000 2**7 CONTENTS, ALLOC, LOAD, READONLY, CODE 5 .data 00000060 0000000002022cc0 0000000002022cc0 01f43cc0 2**3 CONTENTS, ALLOC, LOAD, DATA Only .head.text gets loaded at e0000, and it is basically just a few bytes which sets-up registers and jump to .text segment (at 0201d000 in this case). 
See: arch/parisc/boot/compressed/head.S
How should that get bigger than 128KB?

Then the code in .text decompresses the whole kernel image behind itself (behind "data"). Then the ELF loader moves the parts from the high-memory to the final destination (e.g. 1000000).

The steps are:
1. palo loads vmlinuz into memory.
2. vmlinuz' head starts, and decompress_kernel() in arch/parisc/boot/compressed/misc.c decompresses the vmlinuz file to a vmlinux file and stores it to vmlinux_addr (which is behind the bss section of the boot decompressor).
3. Then the original kernel entry is started (arch/parisc/kernel/entry.S) which moves the code to where it belongs and starts the kernel.

Helge

> What we could possibly do is be clever and align the .rodata.compressed
> so its last text byte ends where the uncompressed kernel text would
> end. We could be even more clever and split .rodata.compressed into a
> load and a noload part so we would only load the part of the compressed
> kernel we need. Then the lifimage creation scripts could discard the
> noload part containing the debug symbols.
>
> James
>
On Wed, 2019-07-31 at 22:19 +0200, Helge Deller wrote: > On 31.07.19 21:56, James Bottomley wrote: > > On Wed, 2019-07-31 at 21:46 +0200, Helge Deller wrote: > > > On 31.07.19 21:44, Sven Schnelle wrote: > > > > Hi James, > > > > > > > > On Wed, Jul 31, 2019 at 12:40:12PM -0700, James Bottomley > > > > wrote: > > > > > > > > > What about causing the compressed make to build both a > > > > > stripped and a non-stripped bzImage (say sbzImage and > > > > > bzImage). That way you always have the stripped one > > > > > available for small size things like boot from tape or > > > > > DVD? but in the usual case we use the bzImage with full > > > > > contents. > > > > > > > > In that case we would also need to build two lifimages - how > > > > about adding a config option option? Something like "Strip > > > > debug information from compressed kernel images"? > > > > > > I agree, two lifimages don't make sense. Only one vmlinuz gets > > > installed. Instead of the config option, I tink my latest patch > > > is better. > > > > It doesn't solve the problem that if a stripped compressed image is > > > > > 128kb then it overwrites the decompress area starting at 0x00100000 > > so we can't decompress the end because we've already overwritten it > > before the decompressor gets to it. > > I don't get this point. 
> hppa64-linux-gnu-objdump -h vmlinuz > shows: > Sections: > Idx Name Size VMA LMA File > off Algn > 0 > .head.text 00000084 00000000000e0000 00000000000e0000 00001000 > 2**2 > CONTENTS, ALLOC, LOAD, READONLY, CODE > 1 > .opd 00000340 00000000000e0090 00000000000e0090 00001090 > 2**3 > CONTENTS, ALLOC, LOAD, DATA > 2 > .dlt 00000160 00000000000e03d0 00000000000e03d0 000013d0 > 2**3 > CONTENTS, ALLOC, LOAD, DATA > 3 .rodata.compressed > 01f3c2b0 00000000000e0530 00000000000e0530 00001530 2**0 > CONTENTS, ALLOC, LOAD, DATA > 4 > .text 00005cc0 000000000201d000 000000000201d000 01f3e000 > 2**7 > CONTENTS, ALLOC, LOAD, READONLY, CODE > 5 > .data 00000060 0000000002022cc0 0000000002022cc0 01f43cc0 > 2**3 > CONTENTS, ALLOC, LOAD, DATA > > Only .head.text gets loaded at e0000, and it is basically just a few > bytes which sets-up registers and jump to .text segment (at 0201d000 > in this case). Actually, you're looking at the wrong thing, you want to look at the program header (the segments) not the section header. It's the program header we load. If I extract this from the current debian kernel we get jejb@ion:~/git/linux-build/arch/parisc/boot/compressed> readelf -l /boot/vmlinuz-4.19.0-5-parisc64-smp Elf file type is EXEC (Executable file) Entry point 0xe0000 There are 4 program headers, starting at offset 64 Program Headers: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flags Align PHDR 0x0000000000000040 0x0000000000001040 0x0000000000000000 0x00000000000000e0 0x00000000000000e0 R E 0x8 LOAD 0x0000000000001000 0x00000000000e0000 0x00000000000e0000 0x00000000000004d8 0x00000000000004d8 RWE 0x1000 LOAD 0x0000000000002000 0x0000000001400000 0x0000000001400000 0x00000000003dd46c 0x00000000003e1000 RWE 0x1000 GNU_STACK 0x0000000000000000 0x0000000000000000 0x0000000000000000 0x0000000000000000 0x0000000000000000 RWE 0x10 Section to Segment mapping: Segment Sections... 
00 01 .head.text .opd .dlt 02 .text .data .rodata .eh_frame .bss 03 The two LOAD sections corresponding to what PALO actually loads. The problem happens if the length of the first load section is bigger than 0x20000. Now if you look what happens after your change: jejb@ion:~/git/linux-build/build/parisc64/arch/parisc/boot> readelf -l bzImage Elf file type is EXEC (Executable file) Entry point 0xe0000 There are 4 program headers, starting at offset 64 Program Headers: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flags Align PHDR 0x0000000000000040 0x0000000000001040 0x0000000000000000 0x00000000000000e0 0x00000000000000e0 R E 0x8 LOAD 0x0000000000001000 0x00000000000e0000 0x00000000000e0000 0x00000000004ae760 0x00000000004ae760 RWE 0x1000 LOAD 0x00000000004b0000 0x000000000118a000 0x000000000118a000 0x0000000000006044 0x000000000000a000 RWE 0x1000 GNU_STACK 0 0x0000000000000000 0x0000000000000000 0x0000000000000000 0x0000000000000000 0x0000000000000000 RWE 0x10 Section to Segment mapping: Segment Sections... 00 01 .head.text .opd .dlt .rodata.compressed 02 .text .data .rodata .eh_frame .bss 03 So the first section tries to load between 0x000e0000-0x0058e760 and that's overwritten at 0x00100000 when the decompression starts because 0x00100000 is our KERNEL_BINARY_TEXT_START. The result for me is that I get the Decompressing linux ... message followed by a HPMC. James
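The address range James quotes can be checked directly from the two readelf listings above. The sketch below takes the PhysAddr and FileSiz values of the first LOAD segment from each listing and tests whether the segment extends past 0x00100000. The class and function names are hypothetical, and the check only reproduces James's reading of the layout, not a statement about what palo actually does.

```python
from typing import NamedTuple

KERNEL_BINARY_TEXT_START = 0x00100000  # value stated in the thread


class LoadSegment(NamedTuple):
    """One PT_LOAD entry, reduced to the two fields needed here."""
    phys_addr: int
    file_size: int

    @property
    def end(self) -> int:
        # First address past the loaded bytes.
        return self.phys_addr + self.file_size


def straddles_text_start(seg: LoadSegment) -> bool:
    """True if the segment starts below and ends above the decompress target."""
    return seg.phys_addr < KERNEL_BINARY_TEXT_START < seg.end


# First LOAD segment of the Debian 4.19 vmlinuz, from the first readelf listing.
DEBIAN_FIRST_LOAD = LoadSegment(0x000E0000, 0x4D8)
# First LOAD segment of the bzImage built after the change, from the second listing.
PATCHED_FIRST_LOAD = LoadSegment(0x000E0000, 0x4AE760)

if __name__ == "__main__":
    for name, seg in (("debian", DEBIAN_FIRST_LOAD), ("patched", PATCHED_FIRST_LOAD)):
        print(name, hex(seg.end), straddles_text_start(seg))
```

The end address computed for the second segment matches the 0x0058e760 upper bound James gives, and it lies above 0x00100000, whereas the Debian kernel's first segment ends well below it.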
On Wed, 2019-07-31 at 21:44 +0200, Sven Schnelle wrote: > Hi James, > > On Wed, Jul 31, 2019 at 12:40:12PM -0700, James Bottomley wrote: > > > What about causing the compressed make to build both a stripped and > > a non-stripped bzImage (say sbzImage and bzImage). That way you > > always have the stripped one available for small size things like > > boot from tape or DVD? but in the usual case we use the bzImage > > with full contents. > > In that case we would also need to build two lifimages - how about > adding a config option option? Something like "Strip debug > information from compressed kernel images"? Actually, I just looked at what x86 does. It has this in the arch/x86/boot/compressed/Makefile: OBJCOPYFLAGS_vmlinux.bin := -R .comment -S $(obj)/vmlinux.bin: vmlinux FORCE $(call if_changed,objcopy) So it basically strips all the debug information from the kernel before compressing, which argues there's no need to retain the information because x86 doesn't bother. James
Hi, On Wed, Jul 31, 2019 at 02:01:34PM -0700, James Bottomley wrote: > On Wed, 2019-07-31 at 21:44 +0200, Sven Schnelle wrote: > > Hi James, > > > > On Wed, Jul 31, 2019 at 12:40:12PM -0700, James Bottomley wrote: > > > > > What about causing the compressed make to build both a stripped and > > > a non-stripped bzImage (say sbzImage and bzImage). That way you > > > always have the stripped one available for small size things like > > > boot from tape or DVD? but in the usual case we use the bzImage > > > with full contents. > > > > In that case we would also need to build two lifimages - how about > > adding a config option option? Something like "Strip debug > > information from compressed kernel images"? > > Actually, I just looked at what x86 does. It has this in the > arch/x86/boot/compressed/Makefile: > > OBJCOPYFLAGS_vmlinux.bin := -R .comment -S > $(obj)/vmlinux.bin: vmlinux FORCE > $(call if_changed,objcopy) > > So it basically strips all the debug information from the kernel before > compressing, which argues there's no need to retain the information > because x86 doesn't bother. Nice. So we could convince Helge by saying "Look, x86 is also stripping it"! :-) Regards Sven
On 31.07.19 23:08, Sven Schnelle wrote:
> Hi,
>
> On Wed, Jul 31, 2019 at 02:01:34PM -0700, James Bottomley wrote:
>> On Wed, 2019-07-31 at 21:44 +0200, Sven Schnelle wrote:
>>> Hi James,
>>>
>>> On Wed, Jul 31, 2019 at 12:40:12PM -0700, James Bottomley wrote:
>>>
>>>> What about causing the compressed make to build both a stripped and
>>>> a non-stripped bzImage (say sbzImage and bzImage). That way you
>>>> always have the stripped one available for small size things like
>>>> boot from tape or DVD? but in the usual case we use the bzImage
>>>> with full contents.
>>>
>>> In that case we would also need to build two lifimages - how about
>>> adding a config option option? Something like "Strip debug
>>> information from compressed kernel images"?
>>
>> Actually, I just looked at what x86 does. It has this in the
>> arch/x86/boot/compressed/Makefile:
>>
>> OBJCOPYFLAGS_vmlinux.bin := -R .comment -S
>> $(obj)/vmlinux.bin: vmlinux FORCE
>>         $(call if_changed,objcopy)
>>
>> So it basically strips all the debug information from the kernel before
>> compressing, which argues there's no need to retain the information
>> because x86 doesn't bother.
>
> Nice. So we could convince Helge by saying "Look, x86 is also stripping it"! :-)

I'm fine with doing exactly what x86 does :-)

Helge
On 31.07.19 22:49, James Bottomley wrote: > On Wed, 2019-07-31 at 22:19 +0200, Helge Deller wrote: >> On 31.07.19 21:56, James Bottomley wrote: >>> On Wed, 2019-07-31 at 21:46 +0200, Helge Deller wrote: >>>> On 31.07.19 21:44, Sven Schnelle wrote: >>>>> Hi James, >>>>> >>>>> On Wed, Jul 31, 2019 at 12:40:12PM -0700, James Bottomley >>>>> wrote: >>>>> >>>>>> What about causing the compressed make to build both a >>>>>> stripped and a non-stripped bzImage (say sbzImage and >>>>>> bzImage). That way you always have the stripped one >>>>>> available for small size things like boot from tape or >>>>>> DVD? but in the usual case we use the bzImage with full >>>>>> contents. >>>>> >>>>> In that case we would also need to build two lifimages - how >>>>> about adding a config option option? Something like "Strip >>>>> debug information from compressed kernel images"? >>>> >>>> I agree, two lifimages don't make sense. Only one vmlinuz gets >>>> installed. Instead of the config option, I tink my latest patch >>>> is better. >>> >>> It doesn't solve the problem that if a stripped compressed image is >>>> >>> 128kb then it overwrites the decompress area starting at 0x00100000 >>> so we can't decompress the end because we've already overwritten it >>> before the decompressor gets to it. >> >> I don't get this point. 
>> hppa64-linux-gnu-objdump -h vmlinuz >> shows: >> Sections: >> Idx Name Size VMA LMA File >> off Algn >> 0 >> .head.text 00000084 00000000000e0000 00000000000e0000 00001000 >> 2**2 >> CONTENTS, ALLOC, LOAD, READONLY, CODE >> 1 >> .opd 00000340 00000000000e0090 00000000000e0090 00001090 >> 2**3 >> CONTENTS, ALLOC, LOAD, DATA >> 2 >> .dlt 00000160 00000000000e03d0 00000000000e03d0 000013d0 >> 2**3 >> CONTENTS, ALLOC, LOAD, DATA >> 3 .rodata.compressed >> 01f3c2b0 00000000000e0530 00000000000e0530 00001530 2**0 >> CONTENTS, ALLOC, LOAD, DATA >> 4 >> .text 00005cc0 000000000201d000 000000000201d000 01f3e000 >> 2**7 >> CONTENTS, ALLOC, LOAD, READONLY, CODE >> 5 >> .data 00000060 0000000002022cc0 0000000002022cc0 01f43cc0 >> 2**3 >> CONTENTS, ALLOC, LOAD, DATA >> >> Only .head.text gets loaded at e0000, and it is basically just a few >> bytes which sets-up registers and jump to .text segment (at 0201d000 >> in this case). > > Actually, you're looking at the wrong thing, you want to look at the > program header (the segments) not the section header. It's the program > header we load. If I extract this from the current debian kernel we > get > > jejb@ion:~/git/linux-build/arch/parisc/boot/compressed> readelf -l /boot/vmlinuz-4.19.0-5-parisc64-smp > > Elf file type is EXEC (Executable file) > Entry point 0xe0000 > There are 4 program headers, starting at offset 64 > > Program Headers: > Type Offset VirtAddr PhysAddr > FileSiz MemSiz Flags Align > PHDR 0x0000000000000040 0x0000000000001040 0x0000000000000000 > 0x00000000000000e0 0x00000000000000e0 R E 0x8 > LOAD 0x0000000000001000 0x00000000000e0000 0x00000000000e0000 > 0x00000000000004d8 0x00000000000004d8 RWE 0x1000 > LOAD 0x0000000000002000 0x0000000001400000 0x0000000001400000 > 0x00000000003dd46c 0x00000000003e1000 RWE 0x1000 > GNU_STACK 0x0000000000000000 0x0000000000000000 0x0000000000000000 > 0x0000000000000000 0x0000000000000000 RWE 0x10 > > Section to Segment mapping: > Segment Sections... 
> 00 > 01 .head.text .opd .dlt > 02 .text .data .rodata .eh_frame .bss > 03 > > The two LOAD sections corresponding to what PALO actually loads. The > problem happens if the length of the first load section is bigger than > 0x20000. What exactly is the problem if the first section is bigger than 0x20000? > Now if you look what happens after your change: > jejb@ion:~/git/linux-build/build/parisc64/arch/parisc/boot> readelf -l bzImage Ok - bzImage is the same as ./vmlinuz. > Elf file type is EXEC (Executable file) > Entry point 0xe0000 > There are 4 program headers, starting at offset 64 > > Program Headers: > Type Offset VirtAddr PhysAddr > FileSiz MemSiz Flags Align > PHDR 0x0000000000000040 0x0000000000001040 0x0000000000000000 > 0x00000000000000e0 0x00000000000000e0 R E 0x8 > LOAD 0x0000000000001000 0x00000000000e0000 0x00000000000e0000 > 0x00000000004ae760 0x00000000004ae760 RWE 0x1000 > LOAD 0x00000000004b0000 0x000000000118a000 0x000000000118a000 > 0x0000000000006044 0x000000000000a000 RWE 0x1000 > GNU_STACK 0 0x0000000000000000 0x0000000000000000 0x0000000000000000 > 0x0000000000000000 0x0000000000000000 RWE 0x10 > > Section to Segment mapping: > Segment Sections... > 00 > 01 .head.text .opd .dlt .rodata.compressed > 02 .text .data .rodata .eh_frame .bss > 03 > > So the first section tries to load between 0x000e0000-0x0058e760 and > that's overwritten at 0x00100000 when the decompression starts because > 0x00100000 is our KERNEL_BINARY_TEXT_START. The decompression decompresses the image from .rodata.compressed to an area behind .bss. So, "vmlinux" ends up behind .bss for further processing. This "vmlinux" (which can have multiple ELF sections) is then started at the high address. That address is way above the 0x00100000 or KERNEL_BINARY_TEXT_START. It then finally moves itself (the ELF sections) to 0x00100000. > The result for me is that > I get the Decompressing linux ... message followed by a HPMC. It actually does boot for me and Sven without a HPMC. 
The decompression is slow (~40 seconds on my c3000 for 160MB). I still *believe* you are facing a HPMC for other reasons. Which machine are you booting on? How much memory does it have?

Helge
On 31.07.19 23:13, Helge Deller wrote: > On 31.07.19 23:08, Sven Schnelle wrote: >> Hi, >> >> On Wed, Jul 31, 2019 at 02:01:34PM -0700, James Bottomley wrote: >>> On Wed, 2019-07-31 at 21:44 +0200, Sven Schnelle wrote: >>>> Hi James, >>>> >>>> On Wed, Jul 31, 2019 at 12:40:12PM -0700, James Bottomley wrote: >>>> >>>>> What about causing the compressed make to build both a stripped and >>>>> a non-stripped bzImage (say sbzImage and bzImage). That way you >>>>> always have the stripped one available for small size things like >>>>> boot from tape or DVD? but in the usual case we use the bzImage >>>>> with full contents. >>>> >>>> In that case we would also need to build two lifimages - how about >>>> adding a config option option? Something like "Strip debug >>>> information from compressed kernel images"? >>> >>> Actually, I just looked at what x86 does. It has this in the >>> arch/x86/boot/compressed/Makefile: >>> >>> OBJCOPYFLAGS_vmlinux.bin := -R .comment -S >>> $(obj)/vmlinux.bin: vmlinux FORCE >>> $(call if_changed,objcopy) >>> >>> So it basically strips all the debug information from the kernel before >>> compressing, which argues there's no need to retain the information >>> because x86 doesn't bother. >> >> Nice. So we could convince Helge by saying "Look, x86 is also stripping it"! :-) > > I'm fine with doing exactly why x86 does :-) Attached is the revised patch, and it gets the compressed kernel down from 32MB to 3.8MB. Helge
On Wed, 2019-07-31 at 23:44 +0200, Helge Deller wrote: > On 31.07.19 22:49, James Bottomley wrote: > > On Wed, 2019-07-31 at 22:19 +0200, Helge Deller wrote: > > > On 31.07.19 21:56, James Bottomley wrote: > > > > On Wed, 2019-07-31 at 21:46 +0200, Helge Deller wrote: > > > > > On 31.07.19 21:44, Sven Schnelle wrote: > > > > > > Hi James, > > > > > > > > > > > > On Wed, Jul 31, 2019 at 12:40:12PM -0700, James Bottomley > > > > > > wrote: > > > > > > > > > > > > > What about causing the compressed make to build both a > > > > > > > stripped and a non-stripped bzImage (say sbzImage and > > > > > > > bzImage). That way you always have the stripped one > > > > > > > available for small size things like boot from tape or > > > > > > > DVD? but in the usual case we use the bzImage with full > > > > > > > contents. > > > > > > > > > > > > In that case we would also need to build two lifimages - > > > > > > how > > > > > > about adding a config option option? Something like "Strip > > > > > > debug information from compressed kernel images"? > > > > > > > > > > I agree, two lifimages don't make sense. Only one vmlinuz > > > > > gets > > > > > installed. Instead of the config option, I tink my latest > > > > > patch > > > > > is better. > > > > > > > > It doesn't solve the problem that if a stripped compressed > > > > image is > > > > > > > > > > > > > 128kb then it overwrites the decompress area starting at > > > > 0x00100000 > > > > so we can't decompress the end because we've already > > > > overwritten it > > > > before the decompressor gets to it. > > > > > > I don't get this point. 
> > > hppa64-linux-gnu-objdump -h vmlinuz > > > shows: > > > Sections: > > > Idx > > > Name Size VMA LMA File > > > off Algn > > > 0 > > > .head.text 00000084 00000000000e0000 00000000000e0000 00001 > > > 000 > > > 2**2 > > > CONTENTS, ALLOC, LOAD, READONLY, CODE > > > 1 > > > .opd 00000340 00000000000e0090 00000000000e0090 00001 > > > 090 > > > 2**3 > > > CONTENTS, ALLOC, LOAD, DATA > > > 2 > > > .dlt 00000160 00000000000e03d0 00000000000e03d0 00001 > > > 3d0 > > > 2**3 > > > CONTENTS, ALLOC, LOAD, DATA > > > 3 .rodata.compressed > > > 01f3c2b0 00000000000e0530 00000000000e0530 00001530 2**0 > > > CONTENTS, ALLOC, LOAD, DATA > > > 4 > > > .text 00005cc0 000000000201d000 000000000201d000 01f3e > > > 000 > > > 2**7 > > > CONTENTS, ALLOC, LOAD, READONLY, CODE > > > 5 > > > .data 00000060 0000000002022cc0 0000000002022cc0 01f43 > > > cc0 > > > 2**3 > > > CONTENTS, ALLOC, LOAD, DATA > > > > > > Only .head.text gets loaded at e0000, and it is basically just a > > > few > > > bytes which sets-up registers and jump to .text segment (at > > > 0201d000 > > > in this case). > > > > Actually, you're looking at the wrong thing, you want to look at > > the > > program header (the segments) not the section header. It's the > > program > > header we load. 
> > If I extract this from the current debian kernel we get
> >
> > jejb@ion:~/git/linux-build/arch/parisc/boot/compressed> readelf -l /boot/vmlinuz-4.19.0-5-parisc64-smp
> >
> > Elf file type is EXEC (Executable file)
> > Entry point 0xe0000
> > There are 4 program headers, starting at offset 64
> >
> > Program Headers:
> >   Type           Offset             VirtAddr           PhysAddr
> >                  FileSiz            MemSiz              Flags  Align
> >   PHDR           0x0000000000000040 0x0000000000001040 0x0000000000000000
> >                  0x00000000000000e0 0x00000000000000e0  R E    0x8
> >   LOAD           0x0000000000001000 0x00000000000e0000 0x00000000000e0000
> >                  0x00000000000004d8 0x00000000000004d8  RWE    0x1000
> >   LOAD           0x0000000000002000 0x0000000001400000 0x0000000001400000
> >                  0x00000000003dd46c 0x00000000003e1000  RWE    0x1000
> >   GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
> >                  0x0000000000000000 0x0000000000000000  RWE    0x10
> >
> >  Section to Segment mapping:
> >   Segment Sections...
> >    00
> >    01     .head.text .opd .dlt
> >    02     .text .data .rodata .eh_frame .bss
> >    03
> >
> > The two LOAD sections corresponding to what PALO actually loads. The
> > problem happens if the length of the first load section is bigger than
> > 0x20000.
>
> What exactly is the problem if the first section is bigger than
> 0x20000?
>
> > Now if you look what happens after your change:
> > jejb@ion:~/git/linux-build/build/parisc64/arch/parisc/boot> readelf -l bzImage
>
> Ok - bzImage is the same as ./vmlinuz.
>
> > Elf file type is EXEC (Executable file)
> > Entry point 0xe0000
> > There are 4 program headers, starting at offset 64
> >
> > Program Headers:
> >   Type           Offset             VirtAddr           PhysAddr
> >                  FileSiz            MemSiz              Flags  Align
> >   PHDR           0x0000000000000040 0x0000000000001040 0x0000000000000000
> >                  0x00000000000000e0 0x00000000000000e0  R E    0x8
> >   LOAD           0x0000000000001000 0x00000000000e0000 0x00000000000e0000
> >                  0x00000000004ae760 0x00000000004ae760  RWE    0x1000
> >   LOAD           0x00000000004b0000 0x000000000118a000 0x000000000118a000
> >                  0x0000000000006044 0x000000000000a000  RWE    0x1000
> >   GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
> >                  0x0000000000000000 0x0000000000000000  RWE    0x10
> >
> >  Section to Segment mapping:
> >   Segment Sections...
> >    00
> >    01     .head.text .opd .dlt .rodata.compressed
> >    02     .text .data .rodata .eh_frame .bss
> >    03
> >
> > So the first section tries to load between 0x000e0000-0x0058e760 and
> > that's overwritten at 0x00100000 when the decompression starts because
> > 0x00100000 is our KERNEL_BINARY_TEXT_START.
>
> The decompression decompresses the image from .rodata.compressed
> to an area behind .bss.
> So, "vmlinux" ends up behind .bss for further processing.
> This "vmlinux" (which can have multiple ELF sections) is then started
> at the high address.
> That address is way above the 0x00100000 or KERNEL_BINARY_TEXT_START.
> It then finally moves itself (the ELF sections) to 0x00100000.
>
> > The result for me is that
> > I get the Decompressing linux ... message followed by a HPMC.
>
> It actually does boot for me and Sven without a HPMC.
> The decompression is slow (~40 seconds on my c3000 for 160MB).
> I still *believe* you are facing a HPMC because of other reasons.
> On which machine do you start.
> How much memory?

This turned out to be a very eccentric bug.
Apparently we don't have an archclean target in our
arch/parisc/Makefile, so files in there never get cleaned out by make
mrproper. This, in turn, means that the sizes.h file in
arch/parisc/boot/compressed never gets removed and, worse, when you
transition to an O=build/parisc[64] build model it overrides the
generated file. The upshot was that my bzImage was being built with an
SZ_end that was too small.

I fixed it by making mrproper clean everything.

James

---

diff --git a/arch/parisc/Makefile b/arch/parisc/Makefile
index 8acb8fa1f8d6..945952166468 100644
--- a/arch/parisc/Makefile
+++ b/arch/parisc/Makefile
@@ -182,5 +182,8 @@ define archhelp
 	@echo ' zinstall - Install compressed vmlinuz kernel'
 endef
 
+archclean:
+	$(Q)$(MAKE) $(clean)=$(boot)
+
 archheaders:
 	$(Q)$(MAKE) $(build)=arch/parisc/kernel/syscalls all
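
For reference, the address ranges argued over in the quoted text can be rechecked with a short Python sketch. It only redoes the arithmetic on the PhysAddr and MemSiz values copied from the two `readelf -l` listings in this thread. The helper names `load_range` and `crosses_text_start` are made up for illustration and are not part of palo or the kernel, and whether crossing 0x00100000 is actually harmful is exactly the point Helge disputes.

```python
# Sketch only: redo the segment arithmetic from the readelf output quoted above.
# KERNEL_BINARY_TEXT_START is the value stated in the thread; the helper names
# are hypothetical and not part of palo or the kernel.
KERNEL_BINARY_TEXT_START = 0x00100000


def load_range(phys_addr, mem_size):
    """Return the half-open physical range [start, end) of a LOAD segment."""
    return phys_addr, phys_addr + mem_size


def crosses_text_start(segment):
    """True if a segment starting below the kernel text start extends past it."""
    start, end = segment
    return start < KERNEL_BINARY_TEXT_START < end


# First LOAD segment of the Debian vmlinuz and of the newly built bzImage.
debian_first = load_range(0x000E0000, 0x4D8)
bzimage_first = load_range(0x000E0000, 0x4AE760)

for name, seg in (("debian vmlinuz", debian_first), ("bzImage", bzimage_first)):
    print(f"{name}: {seg[0]:#010x}-{seg[1]:#010x} "
          f"crosses {KERNEL_BINARY_TEXT_START:#010x}: {crosses_text_start(seg)}")
```

The bzImage range ends at 0x0058e760, which matches the figure James quotes, while the Debian image's first segment stays well below 0x00100000.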
Hi Helge,

On Wed, Jul 31, 2019 at 11:51:16PM +0200, Helge Deller wrote:
>
> Attached is the revised patch, and it gets the compressed kernel down
> from 32MB to 3.8MB.
>

Works for me, thanks!

Regards
Sven
diff --git a/arch/parisc/boot/compressed/vmlinux.lds.S b/arch/parisc/boot/compressed/vmlinux.lds.S
index bfd7872739a3..5841aa373c03 100644
--- a/arch/parisc/boot/compressed/vmlinux.lds.S
+++ b/arch/parisc/boot/compressed/vmlinux.lds.S
@@ -42,12 +42,6 @@ SECTIONS
 #endif
 	_startcode_end = .;
 
-	/* vmlinux.bin.gz is here */
-	. = ALIGN(8);
-	.rodata.compressed : {
-		*(.rodata.compressed)
-	}
-
 	/* bootloader code and data starts behind area of extracted kernel */
 	. = (SZ_end - SZparisc_kernel_start + KERNEL_BINARY_TEXT_START);
 
@@ -73,6 +67,12 @@ SECTIONS
 		*(.rodata.*)
 		_erodata = . ;
 	}
+	/* vmlinux.bin.gz is here */
+	. = ALIGN(8);
+	.rodata.compressed : {
+		*(.rodata.compressed)
+	}
+
 	. = ALIGN(8);
 	.bss : {
 		_bss = . ;
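
The location-counter assignment that this patch leaves in place, `. = (SZ_end - SZparisc_kernel_start + KERNEL_BINARY_TEXT_START);`, is also where a stale sizes.h would bite. The Python sketch below just evaluates that expression. Only KERNEL_BINARY_TEXT_START comes from the thread; the two SZ_end values and SZparisc_kernel_start are hypothetical numbers picked to make the effect visible, not values from any real build.

```python
# Sketch only: evaluate the linker-script expression
#   . = (SZ_end - SZparisc_kernel_start + KERNEL_BINARY_TEXT_START);
# Every symbol value except KERNEL_BINARY_TEXT_START is HYPOTHETICAL.
KERNEL_BINARY_TEXT_START = 0x00100000  # value stated in the thread


def bootloader_start(sz_end, sz_kernel_start):
    """Address at which the bootloader code and data are placed."""
    return sz_end - sz_kernel_start + KERNEL_BINARY_TEXT_START


SZ_KERNEL_START = 0x40100000  # hypothetical start symbol of the real kernel
SZ_END_FRESH = 0x41400000     # hypothetical end symbol from a freshly generated sizes.h
SZ_END_STALE = 0x41100000     # hypothetical smaller value from a stale sizes.h

# End of the area the extracted kernel really occupies (uses the fresh size).
extracted_end = bootloader_start(SZ_END_FRESH, SZ_KERNEL_START)

fresh = bootloader_start(SZ_END_FRESH, SZ_KERNEL_START)
stale = bootloader_start(SZ_END_STALE, SZ_KERNEL_START)

print(f"fresh sizes.h: bootloader at {fresh:#x}, behind the kernel: {fresh >= extracted_end}")
print(f"stale sizes.h: bootloader at {stale:#x}, behind the kernel: {stale >= extracted_end}")
```

With a too-small SZ_end the computed address falls inside the region the extracted kernel needs, which would be consistent with the failure James describes.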