Message ID | 20200430161438.17640-1-alpernebiyasak@gmail.com (mailing list archive) |
---|---|
Series | Prefer working VT console over SPCR and device-tree chosen stdout-path |
On Thu, Apr 30, 2020 at 07:14:34PM +0300, Alper Nebi Yasak wrote:

First of all I see only cover letter and one out of 3 patches.

> I recently experienced some trouble with setting up an encrypted-root
> system, my Chromebook Plus (rk3399-gru-kevin, ARM64) would appear to
> hang where it should have asked for an encryption passphrase; and I
> eventually figured out that the kernel preferred the serial port
> (inaccessible to me) over the built-in working display/keyboard and was
> probably asking there.

"probably". Please, confirm that first.
Also, without command line it's hard to say what you have asked kernel to do.

> Running plymouth in the initramfs solves that specific problem, but
> both the documentation and tty-related kconfig descriptions imply that
> /dev/console should be tty0 if graphics are working, CONFIG_VT_CONSOLE
> is enabled and no explicit console argument is given in the kernel
> commandline.

What is plymouth?

> However, I'm seeing different behaviour on systems with SPCR (as in QEMU
> aarch64 virtual machines) and/or a device-tree chosen stdout-path node
> (as in most arm/arm64 devices). On these machines, depending on the
> console argument, the contents of the /proc/consoles file are:
>
>                   |    "console=tty0"     |    (no console arg)   |
> ------------------+-----------------------+-----------------------+
> QEMU VM           | tty0 -WU (EC p )      | ttyAMA0 -W- (EC a)    |
> (w/ SPCR)         | ttyAMA0 -W- (E a)     |                       |
> ------------------+-----------------------+-----------------------+
> Chromebook Plus   | tty0 -WU (EC p )      | ttyS2 -W- (EC p a)    |
> (w/ stdout-path)  |                       | tty0 -WU (E )         |
> ------------------+-----------------------+-----------------------+
> Chromebook Plus   | tty0 -WU (EC p )      | tty0 -WU (EC p )      |
> (w/o either)      |                       |                       |
> ------------------+-----------------------+-----------------------+

either == SPCR or stdout-path?

> This patchset tries to ensure that VT is preferred in those conditions
> even in the presence of firmware-mandated serial consoles.

This sounds completely wrong. Serial should be preferred over VT because
of debugging at the very early stages, and SPCR is exactly for that.

> These should
> cleanly apply onto next-20200430.
>
> More discussion due to or about the console confusion on ARM64:
> - My Debian bug report about the initramfs prompts [0]
> - Fedora test issue arising from ARM64 QEMU machines having SPCR [1]
> - Debian-installer discussion on what to do with multiple consoles [2]

Maybe you should figure out the real root cause?

> [0] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=952452
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1661288
> [2] https://lists.debian.org/debian-boot/2019/01/msg00184.html
On 30/04/2020 19:44, Andy Shevchenko wrote:
> First of all I see only cover letter and one out of 3 patches.

Apologies, the tool I've used to send the patches (U-Boot's patman)
Cc-ed the scripts/get_maintainer.pl output per-patch, instead of
per-series as I had assumed it would. This was the first time I tried
it, I'll keep that in mind.

Here are links to all four emails:

https://lore.kernel.org/linux-serial/20200430161438.17640-1-alpernebiyasak@gmail.com/
https://lore.kernel.org/linux-serial/20200430161438.17640-2-alpernebiyasak@gmail.com/
https://lore.kernel.org/linux-serial/20200430161438.17640-3-alpernebiyasak@gmail.com/
https://lore.kernel.org/linux-serial/20200430161438.17640-4-alpernebiyasak@gmail.com/

Or I can resend the last two patches to you, or resend all the parts to
everyone again.

>> eventually figured out that the kernel preferred the serial port
>> (inaccessible to me) over the built-in working display/keyboard and was
>> probably asking there.
>
> "probably". Please, confirm that first.
> Also, without command line it's hard to say what you have asked kernel to do.

I was trying to boot a Debian userspace with cryptsetup, with the kernel
command line:

    root=/dev/mapper/sda3_crypt quiet splash

The Debian initramfs handles most of the work (the password prompt,
device mounts, etc.). When I used the same kernel/initramfs/rootfs on a
QEMU aarch64 VM, it only prompted on the serial console instead of the
framebuffer. I'm assuming the same thing happens on my hardware as well.

I can also ask the Debian initramfs to launch a shell by adding "break"
to the command line, which won't be printed on my device's screen unless
I also add "console=tty0". That shell also only appears on the serial
console on the QEMU aarch64 VM, unless I again add "console=tty0".

This is my primary computer and I'd prefer not dismantling it, so my
findings above are the best I believe I can do to confirm it now.
I'm hoping other people would be interested in this, and would test more than I can. >> Running plymouth in the initramfs solves that specific problem, but > > What is plymouth? Plymouth is a userspace program that's famous for showing a splash animation during boot, but in this context: it handles user-interaction that might need to happen while the initramfs is running, by printing messages and prompts and reading user input to/from all consoles. >> ------------------+-----------------------+-----------------------+ >> Chromebook Plus | tty0 -WU (EC p ) | tty0 -WU (EC p ) | >> (w/o either) | | | >> ------------------+-----------------------+-----------------------+ > > either == SPCR or stdout-path? As in "When the device has no SPCR _and_ no chosen stdout-path". >> This patchset tries to ensure that VT is preferred in those conditions >> even in the presence of firmware-mandated serial consoles. > > This sounds completely wrong. serial should be preferred over vt due to very > debugging on early stages and SPCR is exactly for that. I'm saying that from a userspace perspective, and the patches explicitly try to switch to the vt only after a real framebuffer is initialized. So if I did it right, it would still use SPCR/stdout-path's console during the early stages. (I admit I haven't adjusted to talking within a kernel context yet). In all honesty, I'm not sure if this is even considered a kernel bug, let alone my patches a correct solution; hence the RFC PATCH as an attempt at demonstrating this can be "fixed" in kernel. > Maybe you should figure out the real root cause? Thanks for the reply. Any ideas on what else could be causing this behaviour?
On (20/04/30 19:14), Alper Nebi Yasak wrote:
>                   |    "console=tty0"     |    (no console arg)   |
> ------------------+-----------------------+-----------------------+
> QEMU VM           | tty0 -WU (EC p )      | ttyAMA0 -W- (EC a)    |
> (w/ SPCR)         | ttyAMA0 -W- (E a)     |                       |
> ------------------+-----------------------+-----------------------+
> Chromebook Plus   | tty0 -WU (EC p )      | ttyS2 -W- (EC p a)    |
> (w/ stdout-path)  |                       | tty0 -WU (E )         |
> ------------------+-----------------------+-----------------------+
> Chromebook Plus   | tty0 -WU (EC p )      | tty0 -WU (EC p )      |
> (w/o either)      |                       |                       |
> ------------------+-----------------------+-----------------------+
>
> This patchset tries to ensure that VT is preferred in those conditions
> even in the presence of firmware-mandated serial consoles. These should
> cleanly apply onto next-20200430.

Well, if there is a "mandated console", then why would we prefer
any other console?

-ss
On 01/05/2020 04:30, Sergey Senozhatsky wrote:
> Well, if there is a "mandated console", then why would we prefer
> any other console?

From what I understand, the firmware provides serial console settings
to be used as the preferred _serial_ console (where it would be OK to
switch to graphical consoles later on) and the kernel currently
understands that such a console should be the preferred _system_ console
(always preferred over even graphical ones).

By "mandated" I'm referring to the kernel's current behavior, not to (in
my understanding) the firmware's intentions.

Even if the firmware/specification is really asking the kernel to (tell
userspace programs to) always use the serial console instead of the
framebuffer console, while on e.g. a laptop-like device intended to be
used with a keyboard and display -- is that the correct thing to do?

From the userspace, under the conditions:

- CONFIG_VT_CONSOLE is enabled
- There is a working graphics adapter and a display
- There is no console argument given in the kernel command line

I expect that:

- tty0 is included in the /proc/consoles list [1]
- tty0 is the preferred console and /dev/console refers to it [2]

With SPCR both are false, and with stdout-path only the second is false.
Again, I'm OK with these being false during earlier stages until
graphics start working, but I'm arguing they should be true after then.

In the patches I tried to keep these serial consoles still enabled and
preferred during early stages of boot, by trying to switch to vt only
after a real working graphical backend for it is initialized.

I mean, if my expectations are unreasonable and the current kernel
behaviour is considered correct, these patches would be conceptually
wrong; so please tell me if I got anything right/wrong in all this.

[1] From the description of CONFIG_VT_CONSOLE:

> [...] If you answer Y here, a virtual terminal (the device used to
> interact with a physical terminal) can be used as system console.
> [...] you should say Y here unless you want the kernel messages be
> output only to a serial port [...]

and also because it is a prerequisite of [2].

[2] From the description of CONFIG_VT_CONSOLE:

> If you do say Y here, by default the currently visible virtual
> terminal (/dev/tty0) will be used as system console. You can change
> that with a kernel command line option such as "console=tty3" which
> would use the third virtual terminal as system console. [...]

I'm assuming "by default" here means "without console arguments"
regardless of firmware requests. This paragraph (with small changes) is
repeated on many other Kconfig descriptions (drivers/tty/serial/Kconfig,
drivers/tty/serial/8250/Kconfig, arch/sparc/Kconfig from grepping for
'/dev/tty0' on **/Kconfig).

From Documentation/admin-guide/serial-console.rst:

> You can specify multiple console= options on the kernel command line.
> [...]
> Note that you can only define one console per device type (serial, video).
>
> If no console device is specified, the first device found capable of
> acting as a system console will be used. At this time, the system
> first looks for a VGA card and then for a serial port. So if you don't
> have a VGA card in your system the first serial port will automatically
> become the console.

and later on:

> Note that if you boot without a ``console=`` option (or with
> ``console=/dev/tty0``), ``/dev/console`` is the same as ``/dev/tty0``.
> In that case everything will still work.
On Fri, May 1, 2020 at 2:11 PM Alper Nebi Yasak <alpernebiyasak@gmail.com> wrote: > On 01/05/2020 04:30, Sergey Senozhatsky wrote: > I'm assuming "by default" here means "without console arguments" > regardless of firmware requests. This paragraph (with small changes) is > repeated on many other Kconfig descriptions (drivers/tty/serial/Kconfig, > drivers/tty/serial/8250/Kconfig, arch/sparc/Kconfig from grepping for > '/dev/tty0' on **/Kconfig). > > From Documentation/admin-guide/serial-console.rst: > > > You can specify multiple console= options on the kernel command line. > > [...] > > Note that you can only define one console per device type (serial, video). > > > > If no console device is specified, the first device found capable of > > acting as a system console will be used. At this time, the system > > first looks for a VGA card and then for a serial port. So if you don't > > have a VGA card in your system the first serial port will automatically > > become the console. > > and later on: > > > Note that if you boot without a ``console=`` option (or with > > ``console=/dev/tty0``), ``/dev/console`` is the same as ``/dev/tty0``. > > In that case everything will still work. I'm wondering if behaviour is changed if you put console=tty1 instead of console=tty0.
On 01/05/2020 16:16, Andy Shevchenko wrote:
> On Fri, May 1, 2020 at 2:11 PM Alper Nebi Yasak
> <alpernebiyasak@gmail.com> wrote:
>> I'm assuming "by default" here means "without console arguments"
>> regardless of firmware requests. This paragraph (with small changes) is
>> repeated on many other Kconfig descriptions (drivers/tty/serial/Kconfig,
>> drivers/tty/serial/8250/Kconfig, arch/sparc/Kconfig from grepping for
>> '/dev/tty0' on **/Kconfig).
>>
>> From Documentation/admin-guide/serial-console.rst:
>>
>>> You can specify multiple console= options on the kernel command line.
>>> [...]
>>> Note that you can only define one console per device type (serial, video).
>>>
>>> If no console device is specified, the first device found capable of
>>> acting as a system console will be used. At this time, the system
>>> first looks for a VGA card and then for a serial port. So if you don't
>>> have a VGA card in your system the first serial port will automatically
>>> become the console.
>>
>> and later on:
>>
>>> Note that if you boot without a ``console=`` option (or with
>>> ``console=/dev/tty0``), ``/dev/console`` is the same as ``/dev/tty0``.
>>> In that case everything will still work.
>
> I'm wondering if behaviour is changed if you put console=tty1 instead
> of console=tty0.

Just tested again with the QEMU aarch64 VM. Comparing console=tty1 and
console=tty0 cases: /proc/consoles has tty1 instead of tty0 (both also
have ttyAMA0), and `echo '/dev/console is here' >>/dev/console` goes to
vt1 instead of the currently visible vt. Same difference before and
after this patchset.
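Editorial note: the /proc/consoles entries compared throughout this thread (for example "ttyS2 -W- (EC p a)") can be checked mechanically. The following is a minimal user-space sketch, not kernel code. The struct and function names are hypothetical, and the flag letters follow my reading of the kernel's /proc documentation: E = enabled, C = preferred console (the one behind /dev/console), B = boot console, p = used for the printk buffer, a = safe to use when the CPU is offline.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper type; not part of any kernel or libc API. */
struct console_entry {
	char name[32];      /* device name, e.g. "tty0" or "ttyS2" */
	bool enabled;       /* 'E' */
	bool preferred;     /* 'C': the console behind /dev/console */
	bool boot;          /* 'B' */
	bool printk_buffer; /* 'p' */
	bool anytime;       /* 'a' */
};

/*
 * Parse one /proc/consoles line such as "ttyS2 -W- (EC p a)".
 * Returns 0 on success, -1 if the line has no name or no "(...)" group.
 */
int parse_console_line(const char *line, struct console_entry *out)
{
	size_t n = 0;
	const char *open, *close, *p;

	memset(out, 0, sizeof(*out));

	/* The first whitespace-delimited token is the console name. */
	while (isspace((unsigned char)*line))
		line++;
	while (line[n] && !isspace((unsigned char)line[n]) &&
	       n < sizeof(out->name) - 1) {
		out->name[n] = line[n];
		n++;
	}
	if (n == 0)
		return -1;

	/* The flag letters live between the parentheses. */
	open = strchr(line, '(');
	close = open ? strchr(open, ')') : NULL;
	if (!close)
		return -1;

	for (p = open + 1; p < close; p++) {
		switch (*p) {
		case 'E': out->enabled = true; break;
		case 'C': out->preferred = true; break;
		case 'B': out->boot = true; break;
		case 'p': out->printk_buffer = true; break;
		case 'a': out->anytime = true; break;
		default: break; /* spaces and flags this sketch ignores */
		}
	}
	return 0;
}
```

Feeding it each line of /proc/consoles and looking for the entry with `preferred` set gives the device that /dev/console currently refers to, which is the property being argued about in this thread.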
On Thu 2020-04-30 19:14:34, Alper Nebi Yasak wrote:
> I recently experienced some trouble with setting up an encrypted-root
> system, my Chromebook Plus (rk3399-gru-kevin, ARM64) would appear to
> hang where it should have asked for an encryption passphrase; and I
> eventually figured out that the kernel preferred the serial port
> (inaccessible to me) over the built-in working display/keyboard and was
> probably asking there.
>
> Running plymouth in the initramfs solves that specific problem, but
> both the documentation and tty-related kconfig descriptions imply that
> /dev/console should be tty0 if graphics are working, CONFIG_VT_CONSOLE
> is enabled and no explicit console argument is given in the kernel
> commandline.
>
> However, I'm seeing different behaviour on systems with SPCR (as in QEMU
> aarch64 virtual machines) and/or a device-tree chosen stdout-path node
> (as in most arm/arm64 devices). On these machines, depending on the
> console argument, the contents of the /proc/consoles file are:

I dug many times into the history of the console registration code.
The following table mostly confirms my expectations.

>                   |    "console=tty0"     |    (no console arg)   |
> ------------------+-----------------------+-----------------------+
> QEMU VM           | tty0 -WU (EC p )      | ttyAMA0 -W- (EC a)    |
> (w/ SPCR)         | ttyAMA0 -W- (E a)     |                       |

The SPCR handling is inconsistent over architectures, see
https://lkml.kernel.org/r/20180830123849.26163-1-prarit@redhat.com

IMHO, arm developers decided that consoles defined by SPCR are always
enabled when existing.

In 1st column: tty0 is the preferred console because it is defined
on the commandline.

In 2nd column: tty0 is not enabled at all because another console was
defined by SPCR. Note that ttySX and ttyX consoles are registered only
as a fallback when there is no other console defined.

The following code is responsible for the fallback, see register_console()

	/*
	 *	See if we want to use this console driver. If we
	 *	didn't select a console we take the first one
	 *	that registers here.
	 */
	if (!has_preferred) {
		if (newcon->index < 0)
			newcon->index = 0;
		if (newcon->setup == NULL ||
		    newcon->setup(newcon, NULL) == 0) {
			newcon->flags |= CON_ENABLED;
			if (newcon->device) {
				newcon->flags |= CON_CONSDEV;
				has_preferred = true;
			}
		}
	}

> ------------------+-----------------------+-----------------------+
> Chromebook Plus   | tty0 -WU (EC p )      | ttyS2 -W- (EC p a)    |
> (w/ stdout-path)  |                       | tty0 -WU (E )         |

Hmm, of_console_check() explicitly ignores the console defined by
stdout-path when there is a console on the commandline. This explains
1st column.

I am not sure about 2nd column. My guess is that ttyX consoles are
tried first. tty0 is registered as a fallback because there is no
other console at the moment. ttyS2 is tried later and it is
registered because it is in stdout-path and there is no console
in the command line. It is somehow consistent with CONFIG_VT_CONSOLE
description.

Sadly, it is different logic than with SPCR :-(

> ------------------+-----------------------+-----------------------+
> Chromebook Plus   | tty0 -WU (EC p )      | tty0 -WU (EC p )      |
> (w/o either)      |                       |                       |
> ------------------+-----------------------+-----------------------+

This variant is easy and everyone would probably expect this.

Regarding the description of CONFIG_VT_CONSOLE option. I am afraid
that it was created and true only before SPCR and device tree support
was introduced.

Now, it is really sad that SPCR and device tree have different
behavior even across architectures. But I am afraid that we could
not change it without breaking many setups.

The only common rules are:

  + The last console on the command line should always be the
    preferred one when defined.

  + Consoles defined by the device (SPCR, device tree) are used
    when there is no commandline.

  + ttyX or ttySX are used as a fallback when nothing else is defined.
My suggestion is:

  + Fix SPCR setting or device tree of your device when the defaults
    are not as expected.

  + Use command line to force your value when the defaults are not
    as expected and you could not change them.

I am afraid that we could not fix your problem on the kernel side. It
would break other setups that depend on the existing behavior.

Best Regards,
Petr
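Editorial note: the three "common rules" listed in the message above amount to a simple precedence order. The sketch below is a toy user-space illustration of that order only. The function name and its parameters are hypothetical, it is not how the kernel implements the selection, and it says nothing about which additional consoles end up enabled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Toy model of the three rules; all names here are hypothetical.
 *   cmdline:  console= values in command-line order (may be empty)
 *   firmware: console named by SPCR or stdout-path, or NULL
 *   fallback: first ttyX/ttySX console that registers, or NULL
 */
const char *pick_preferred_console(const char *const *cmdline, size_t ncmdline,
				   const char *firmware, const char *fallback)
{
	if (ncmdline > 0)
		return cmdline[ncmdline - 1]; /* rule 1: last console= wins */
	if (firmware)
		return firmware;              /* rule 2: SPCR / device tree */
	return fallback;                      /* rule 3: first to register */
}
```

Under these rules a command line of `console=ttyS2 console=tty0` yields tty0 as the preferred console regardless of what the firmware names, which is why forcing the value on the command line works as a workaround.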
On Wed, 2020-05-13 at 16:37 +0200, Petr Mladek wrote:
> The only common rules are:
>
>   + The last console on the command line should always be the
>     preferred one when defined.
>
>   + Consoles defined by the device (SPCR, device tree) are used
>     when there is no commandline.

With the exception that on x86, SPCR is only used for early_con; we
don't do add_preferred_console() at all for it.

I sort-of understand why... the track record on BIOS quality out there
being what it is, I could see this causing a number of systems to start
sending the console to a non-existent or non-wired serial port instead
of the tty/gpu because the BIOS leaves SPCR set/enabled for no reason.

It may or may not be the case in practice but I don't see how we can
figure that out without either a large campaign of data collection from
tons of systems (which will miss plenty) or just taking the chance &
breaking people and see who screams :-)

Cheers,
Ben.
On 13/05/2020 17:37, Petr Mladek wrote:
> On Thu 2020-04-30 19:14:34, Alper Nebi Yasak wrote:
>>                   |    "console=tty0"     |    (no console arg)   |
>> ------------------+-----------------------+-----------------------+
>> QEMU VM           | tty0 -WU (EC p )      | ttyAMA0 -W- (EC a)    |
>> (w/ SPCR)         | ttyAMA0 -W- (E a)     |                       |
>
> The SPCR handling is inconsistent over architectures, see
> https://lkml.kernel.org/r/20180830123849.26163-1-prarit@redhat.com
>
> IMHO, arm developers decided that consoles defined by SPCR are always
> enabled when existing.

I'm OK with those being enabled. Though, I hope "not registering tty0"
wasn't an explicit decision, but maybe an oversight/trade-off due to
assuming SPCR code will only run on servers without displays (where tty0
wouldn't matter). (I understand it might be too late to change that.)

So I'd want the 2nd column to be: tty0(EC) ttyAMA0(E) at best, and
ttyAMA0(EC) tty0(E) at worst.

> In 1st column: tty0 is the preferred console because it is defined
> on the commandline.
>
> In 2nd column: tty0 is not enabled at all because another console was
> defined by SPCR. Note that ttySX and ttyX consoles are registered only
> as a fallback when there is no other console defined.
>
> The following code is responsible for the fallback, see register_console()
>
> 	/*
> 	 *	See if we want to use this console driver. If we
> 	 *	didn't select a console we take the first one
> 	 *	that registers here.
> 	 */
> 	if (!has_preferred) {
> 		if (newcon->index < 0)
> 			newcon->index = 0;
> 		if (newcon->setup == NULL ||
> 		    newcon->setup(newcon, NULL) == 0) {
> 			newcon->flags |= CON_ENABLED;
> 			if (newcon->device) {
> 				newcon->flags |= CON_CONSDEV;
> 				has_preferred = true;
> 			}
> 		}
> 	}
>
>
>> ------------------+-----------------------+-----------------------+
>> Chromebook Plus   | tty0 -WU (EC p )      | ttyS2 -W- (EC p a)    |
>> (w/ stdout-path)  |                       | tty0 -WU (E )         |
>
> Hmm, of_console_check() explicitly ignores the console defined by
> stdout-path when there is a console on the commandline. This explains
> 1st column.
>
> I am not sure about 2nd column. My guess is that ttyX consoles are
> tried first. tty0 is registered as a fallback because there is no
> other console at the moment. ttyS2 is tried later and it is
> registered because it is in stdout-path and there is no console
> in the command line. It is somehow consistent with CONFIG_VT_CONSOLE
> description.
>
> Sadly, it is different logic than with SPCR :-(

I like the fact that this one has tty0. For example, Debian's installer
iterates over /proc/consoles and launches itself on all the consoles it
finds there, so it wouldn't launch on my chromebook's screen if tty0
wasn't included (just like it doesn't launch on a QEMU aarch64 VM's
framebuffer).

>> ------------------+-----------------------+-----------------------+
>> Chromebook Plus   | tty0 -WU (EC p )      | tty0 -WU (EC p )      |
>> (w/o either)      |                       |                       |
>> ------------------+-----------------------+-----------------------+
>
> This variant is easy and everyone would probably expect this.

I think things run roughly in the following order (from what I can
decipher from kernel messages) and I think it matches your explanations:

|            ACPI SPCR            |      dt chosen stdout-path      |
+=================================+=================================+
| acpi_parse_spcr()               |                                 |
| -> add_preferred_console(uart0) |                                 |
| (if not on x86)                 |                                 |
+---------------------------------+---------------------------------+
|                          console_setup()                          |
|                  -> add_preferred_console(tty0)                   |
|                         (if console=tty0)                         |
+---------------------------------+---------------------------------+
|                       register_console(vt)                        |
+---------------------------------+---------------------------------+
|                                 | of_console_check()              |
|                                 | -> add_preferred_console(uart2) |
|                                 | (if no console arg)             |
+---------------------------------+---------------------------------+
|                     register_console(serial)                      |
+---------------------------------+---------------------------------+

> Regarding the description of CONFIG_VT_CONSOLE option. I am afraid
> that it was created and true only before SPCR and device tree support
> was introduced.

OK. Assuming these changes won't go any further, maybe I'll try
documenting the current behavior in relevant places.

> Now, it is really sad that SPCR and device tree have different
> behavior even across architectures. But I am afraid that we could
> not change it without breaking many setups.
>
> The only common rules are:
>
>   + The last console on the command line should always be the
>     preferred one when defined.
>
>   + Consoles defined by the device (SPCR, device tree) are used
>     when there is no commandline.
>
>   + ttyX or ttySX are used as a fallback when nothing else is defined.
>
>
> My suggestion is:
>
>   + Fix SPCR setting or device tree of your device when the defaults
>     are not as expected.

Maybe I can get QEMU's SPCR use conditional on the existence of a
framebuffer, and get distributions to remove stdout-path from certain
device-trees; but that would disable the serial console completely
(instead of having it enabled where tty0 is still preferred).

>   + Use command line to force your value when the defaults are not
>     as expected and you could not change them.

This works; but I'd have to know the machine's serial configuration in
advance to put it in the cmdline as "console=<serial> console=tty0", or
lose the serial console as in the above. (A "console=dt" like that
"console=spcr" patch you linked to would be useful here if it existed.)

Both seem imperfect in that sense, but tolerable.

> I am afraid that we could not fix your problem on the kernel side. It
> would break other setups that depend on the existing behavior.
>
> Best Regards,
> Petr

Thanks for the detailed reply.
On Fri 2020-05-15 22:27:02, Alper Nebi Yasak wrote:
> On 13/05/2020 17:37, Petr Mladek wrote:
> > On Thu 2020-04-30 19:14:34, Alper Nebi Yasak wrote:
> I think things run roughly in the following order (from what I can
> decipher from kernel messages) and I think it matches your explanations:
>
> |            ACPI SPCR            |      dt chosen stdout-path      |
> +=================================+=================================+
> | acpi_parse_spcr()               |                                 |
> | -> add_preferred_console(uart0) |                                 |
> | (if not on x86)                 |                                 |
> +---------------------------------+---------------------------------+
> |                          console_setup()                          |
> |                  -> add_preferred_console(tty0)                   |
> |                         (if console=tty0)                         |
> +---------------------------------+---------------------------------+
> |                       register_console(vt)                        |
> +---------------------------------+---------------------------------+
> |                                 | of_console_check()              |
> |                                 | -> add_preferred_console(uart2) |
> |                                 | (if no console arg)             |
> +---------------------------------+---------------------------------+
> |                     register_console(serial)                      |
> +---------------------------------+---------------------------------+

I was first a bit confused by the above table. The order looks fine
but I was not sure about the indentation. I think that some
more details are needed to get the picture and context.

I see the following order in start_kernel():

  1. Add spcr consoles: by acpi_parse_spcr() called from setup_arch().

  2. Add and register early consoles: by parse_early_param()

  3. Add normal consoles from command line: by parse_args()

  4. Register tty console: by vty_init() called via long chain
     from fs_initcall(chr_dev_init). It seems to be init call in
     5th round, see include/linux/init.h

  5. Register other (serial) consoles: most likely from
     device_initcall() in 6th round, see include/linux/init.h.

The consoles defined by the device tree are not added directly.
Instead, the probe() callbacks check whether such a console is
selected in the device tree by of_console_check() called from
uart_add_one_port().
> > My suggestion is:
> >
> >   + Fix SPCR setting or device tree of your device when the defaults
> >     are not as expected.
>
> Maybe I can get QEMU's SPCR use conditional on the existence of a
> framebuffer, and get distributions to remove stdout-path from certain
> device-trees; but that would disable the serial console completely
> (instead of having it enabled where tty0 is still preferred).

I am afraid that this is a problem with many defaults. They might be
good enough for many people but others would want something else.

It might be acceptable to add consoles. But it might be a problem
to remove consoles or change the currently preferred one.

The only exception would be when most people are annoyed with the
current default. But this needs to be discussed with people familiar
with the given architecture or device.

> >   + Use command line to force your value when the defaults are not
> >     as expected and you could not change them.
>
> This works; but I'd have to know the machine's serial configuration in
> advance to put it in the cmdline as "console=<serial> console=tty0", or
> lose the serial console as in the above. (A "console=dt" like that
> "console=spcr" patch you linked to would be useful here if it existed.)

The generic parameters: console=tty, console=serial, console=dt,
console=spcr look fine to me.

IMHO, the only problem might be when a particular serial console
driver is not able to guess reasonable defaults for the baud rate, etc.

Best Regards,
Petr
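Editorial note: the boot ordering discussed in this thread (firmware and command-line consoles are added first, the VT console registers next, the device-tree check and serial drivers come last) can be replayed in a small user-space model to see why the /proc/consoles outcomes differ between SPCR and stdout-path. This is a toy sketch under stated assumptions, not kernel code: every struct and function name is hypothetical, console names are matched as whole strings, and early consoles (step 2 in the list above) are not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_CON 8

/* Toy user-space model; field and function names are hypothetical. */
struct con_model {
	const char *wanted[MAX_CON]; /* consoles requested before registration */
	int nwanted;
	int preferred;               /* index of the last request, -1 if none */
	bool has_preferred;
	const char *active[MAX_CON]; /* enabled consoles, preferred one first */
	int nactive;
};

void model_init(struct con_model *m)
{
	memset(m, 0, sizeof(*m));
	m->preferred = -1;
}

/* SPCR, console= and stdout-path all funnel into a request like this. */
void model_add_preferred(struct con_model *m, const char *name)
{
	int i;

	for (i = 0; i < m->nwanted; i++) {
		if (strcmp(m->wanted[i], name) == 0) {
			m->preferred = i; /* repeated request: just re-prefer it */
			return;
		}
	}
	if (m->nwanted >= MAX_CON)
		return;
	m->wanted[m->nwanted] = name;
	m->preferred = m->nwanted++;
}

/* A driver offers its console. Returns true if the console gets enabled. */
bool model_register(struct con_model *m, const char *name)
{
	bool enabled = false, consdev = false;
	int i;

	if (!m->has_preferred)
		m->has_preferred = m->preferred >= 0;

	if (!m->has_preferred) {
		/* Fallback: nothing was requested, so take the first one. */
		enabled = consdev = true;
		m->has_preferred = true;
	} else {
		for (i = 0; i < m->nwanted; i++) {
			if (strcmp(m->wanted[i], name) == 0) {
				enabled = true;
				consdev = (i == m->preferred);
				break;
			}
		}
	}
	if (!enabled || m->nactive >= MAX_CON)
		return false;

	if (consdev) {
		/* The new preferred console moves to the head of the list. */
		memmove(&m->active[1], &m->active[0],
			m->nactive * sizeof(m->active[0]));
		m->active[0] = name;
	} else {
		m->active[m->nactive] = name;
	}
	m->nactive++;
	return true;
}
```

Replaying the stdout-path case without a console argument (register tty0, then request ttyS2, then register ttyS2) leaves ttyS2 first and tty0 second, while the SPCR case (request ttyAMA0 before tty0 registers) never enables tty0 at all. Both match the table from the cover letter, and the model suggests the only difference is whether the firmware request arrives before or after the VT console registers.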