
[v3,2/7] lib/vsprintf: Add support for generic FOURCCs by extending %p4cc

Message ID E1osRXO-002mvw-Fp@rmk-PC.armlinux.org.uk (mailing list archive)
State New, archived
Series Add Apple Mac System Management Controller GPIOs

Commit Message

Russell King (Oracle) Nov. 8, 2022, 4:33 p.m. UTC
From: Hector Martin <marcan@marcan.st>

%p4cc is designed for DRM/V4L2 FOURCCs with their specific quirks, but
it's useful to be able to print generic 4-character codes formatted as
an integer. Extend it to add format specifiers for printing generic
32-bit FOURCCs with various endian semantics:

%p4ch   Host-endian
%p4cl	Little-endian
%p4cb	Big-endian
%p4cr	Reverse-endian

The endianness determines how bytes are interpreted as a u32, and the
FOURCC is then always printed MSByte-first (this is the opposite of
V4L/DRM FOURCCs). This covers most practical cases, e.g. %p4cr would
allow printing LSByte-first FOURCCs stored in host endian order
(other than the hex form being in character order, not the integer
value).

Signed-off-by: Hector Martin <marcan@marcan.st>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
---
 Documentation/core-api/printk-formats.rst | 32 +++++++++++++++++++
 lib/test_printf.c                         | 39 +++++++++++++++++++----
 lib/vsprintf.c                            | 35 ++++++++++++++++----
 3 files changed, 93 insertions(+), 13 deletions(-)
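
For readers who want to experiment outside the kernel, the semantics described in the commit message can be approximated in userspace. The sketch below is not the kernel implementation: format_generic_fourcc(), swap32() and host_is_little_endian() are hypothetical names introduced here, and only the generic h, r, l and b variants are modelled.

```c
#include <ctype.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Userspace stand-in for the kernel's swab32(). */
static uint32_t swap32(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Runtime check of the host byte order. */
static int host_is_little_endian(void)
{
	const uint16_t probe = 1;
	unsigned char first;

	memcpy(&first, &probe, 1);
	return first == 1;
}

/*
 * Hypothetical helper, not a kernel API. Formats *fourcc roughly the way
 * the generic specifiers are described: pick the u32 interpretation from
 * 'mode', then print its bytes MSByte-first followed by the hex value.
 * Returns 0 on success, -1 for an unknown mode.
 */
static int format_generic_fourcc(char *out, size_t size,
				 const uint32_t *fourcc, char mode)
{
	uint32_t val;
	char chars[5];
	int i;

	memcpy(&val, fourcc, sizeof(val));	/* plays the role of get_unaligned() */

	switch (mode) {
	case 'h':
		break;
	case 'r':
		val = swap32(val);
		break;
	case 'l':
		if (!host_is_little_endian())
			val = swap32(val);
		break;
	case 'b':
		if (host_is_little_endian())
			val = swap32(val);
		break;
	default:
		return -1;
	}

	for (i = 0; i < 4; i++) {
		unsigned char c = (unsigned char)(val >> ((3 - i) * 8));

		/* Printable ASCII as-is, a dot otherwise */
		chars[i] = (c < 0x80 && isprint(c)) ? (char)c : '.';
	}
	chars[4] = '\0';

	snprintf(out, size, "%s (0x%08x)", chars, (unsigned int)val);
	return 0;
}
```

With a u32 holding 0x41424344, mode 'h' yields "ABCD (0x41424344)" and mode 'r' yields "DCBA (0x44434241)" on any host, which matches the expectations added to lib/test_printf.c by this patch.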

Comments

Petr Mladek Nov. 14, 2022, 3:34 p.m. UTC | #1
On Tue 2022-11-08 16:33:22, Russell King wrote:
> From: Hector Martin <marcan@marcan.st>
> 
> %p4cc is designed for DRM/V4L2 FOURCCs with their specific quirks, but
> it's useful to be able to print generic 4-character codes formatted as
> an integer. Extend it to add format specifiers for printing generic
> 32-bit FOURCCs with various endian semantics:
> 
> %p4ch   Host-endian
> %p4cl	Little-endian
> %p4cb	Big-endian
> %p4cr	Reverse-endian
> 
> The endianness determines how bytes are interpreted as a u32, and the
> FOURCC is then always printed MSByte-first (this is the opposite of
> V4L/DRM FOURCCs). This covers most practical cases, e.g. %p4cr would
> allow printing LSByte-first FOURCCs stored in host endian order
> (other than the hex form being in character order, not the integer
> value).
> 
> Signed-off-by: Hector Martin <marcan@marcan.st>
> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>

Reviewed-by: Petr Mladek <pmladek@suse.com>

See one nit below.

> --- a/lib/vsprintf.c
> +++ b/lib/vsprintf.c
> @@ -1762,27 +1762,50 @@ char *fourcc_string(char *buf, char *end, const u32 *fourcc,
>  	char output[sizeof("0123 little-endian (0x01234567)")];
>  	char *p = output;
>  	unsigned int i;
> +	bool pixel_fmt = false;
>  	u32 orig, val;
>  
> -	if (fmt[1] != 'c' || fmt[2] != 'c')
> +	if (fmt[1] != 'c')
>  		return error_string(buf, end, "(%p4?)", spec);
>  
>  	if (check_pointer(&buf, end, fourcc, spec))
>  		return buf;
>  
>  	orig = get_unaligned(fourcc);
> -	val = orig & ~BIT(31);
> +	switch (fmt[2]) {
> +	case 'h':
> +		val = orig;
> +		break;
> +	case 'r':
> +		val = orig = swab32(orig);

I do not much like these multiple assignments. I think the result
was not even defined in some older C standards, though I can't find
the reference now. Even make W=3 does not warn about it.

> +		break;
> +	case 'l':
> +		val = orig = le32_to_cpu(orig);
> +		break;
> +	case 'b':
> +		val = orig = be32_to_cpu(orig);
> +		break;

Best Regards,
Petr
Andy Shevchenko Nov. 14, 2022, 3:46 p.m. UTC | #2
On Mon, Nov 14, 2022 at 04:34:07PM +0100, Petr Mladek wrote:
> On Tue 2022-11-08 16:33:22, Russell King wrote:

...

> >  	orig = get_unaligned(fourcc);
> > -	val = orig & ~BIT(31);
> > +	switch (fmt[2]) {
> > +	case 'h':
> > +		val = orig;
> > +		break;
> > +	case 'r':
> > +		val = orig = swab32(orig);
> 
> I do not like much these multi assignments. I think that the result
> was not even defined in some older C standards. Though, I can't find
> it now. And even make W=3 does not warn about it.
> 
> > +		break;
> > +	case 'l':
> > +		val = orig = le32_to_cpu(orig);
> > +		break;
> > +	case 'b':
> > +		val = orig = be32_to_cpu(orig);
> > +		break;

Isn't it easy to fix? Something like the below?

	switch (fmt[2]) {
	case 'h':
		break;
	case 'r':
		orig = swab32(orig);
		break;
	case 'l':
		orig = le32_to_cpu(orig);
		break;
	case 'b':
		orig = be32_to_cpu(orig);
		break;

		...
	}
	val = orig;
Russell King (Oracle) Nov. 14, 2022, 4:15 p.m. UTC | #3
On Mon, Nov 14, 2022 at 04:34:07PM +0100, Petr Mladek wrote:
> On Tue 2022-11-08 16:33:22, Russell King wrote:
> > From: Hector Martin <marcan@marcan.st>
> > 
> > %p4cc is designed for DRM/V4L2 FOURCCs with their specific quirks, but
> > it's useful to be able to print generic 4-character codes formatted as
> > an integer. Extend it to add format specifiers for printing generic
> > 32-bit FOURCCs with various endian semantics:
> > 
> > %p4ch   Host-endian
> > %p4cl	Little-endian
> > %p4cb	Big-endian
> > %p4cr	Reverse-endian
> > 
> > The endianness determines how bytes are interpreted as a u32, and the
> > FOURCC is then always printed MSByte-first (this is the opposite of
> > V4L/DRM FOURCCs). This covers most practical cases, e.g. %p4cr would
> > allow printing LSByte-first FOURCCs stored in host endian order
> > (other than the hex form being in character order, not the integer
> > value).
> > 
> > Signed-off-by: Hector Martin <marcan@marcan.st>
> > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> 
> Reviewed-by: Petr Mladek <pmladek@suse.com>
> 
> See one nit below.
> 
> > --- a/lib/vsprintf.c
> > +++ b/lib/vsprintf.c
> > @@ -1762,27 +1762,50 @@ char *fourcc_string(char *buf, char *end, const u32 *fourcc,
> >  	char output[sizeof("0123 little-endian (0x01234567)")];
> >  	char *p = output;
> >  	unsigned int i;
> > +	bool pixel_fmt = false;
> >  	u32 orig, val;
> >  
> > -	if (fmt[1] != 'c' || fmt[2] != 'c')
> > +	if (fmt[1] != 'c')
> >  		return error_string(buf, end, "(%p4?)", spec);
> >  
> >  	if (check_pointer(&buf, end, fourcc, spec))
> >  		return buf;
> >  
> >  	orig = get_unaligned(fourcc);
> > -	val = orig & ~BIT(31);
> > +	switch (fmt[2]) {
> > +	case 'h':
> > +		val = orig;
> > +		break;
> > +	case 'r':
> > +		val = orig = swab32(orig);
> 
> I do not like much these multi assignments. I think that the result
> was not even defined in some older C standards. Though, I can't find
> it now. And even make W=3 does not warn about it.

Err.

It's been supported for decades. I learnt about it back in 1992 when
I was introduced to C by another experienced C programmer. It's been
supported in ANSI C compilers. The Norcroft C compiler (which is
strict ANSI) on Acorn platforms back in the late 1980s/1990s even
supported it.

I think you're a bit out of date.
Petr Mladek Nov. 14, 2022, 4:18 p.m. UTC | #4
On Mon 2022-11-14 17:46:28, Andy Shevchenko wrote:
> On Mon, Nov 14, 2022 at 04:34:07PM +0100, Petr Mladek wrote:
> > On Tue 2022-11-08 16:33:22, Russell King wrote:
> 
> ...
> 
> > >  	orig = get_unaligned(fourcc);
> > > -	val = orig & ~BIT(31);
> > > +	switch (fmt[2]) {
> > > +	case 'h':
> > > +		val = orig;
> > > +		break;
> > > +	case 'r':
> > > +		val = orig = swab32(orig);
> > 
> > I do not like much these multi assignments. I think that the result
> > was not even defined in some older C standards. Though, I can't find
> > it now. And even make W=3 does not warn about it.
> > 
> > > +		break;
> > > +	case 'l':
> > > +		val = orig = le32_to_cpu(orig);
> > > +		break;
> > > +	case 'b':
> > > +		val = orig = be32_to_cpu(orig);
> > > +		break;
> 
> Isn't easy to fix? Something like below?
> 
> 	switch (fmt[2]) {
> 	case 'h':
> 		break;
> 	case 'r':
> 		orig = swab32(orig);
> 		break;
> 	case 'l':
> 		orig = le32_to_cpu(orig);
> 		break;
> 	case 'b':
> 		orig = be32_to_cpu(orig);
> 		break;
> 
> 		...
> 	}
> 	val = orig;

I thought the same. Unfortunately, this is not valid for the "case c:"
path where "orig" stays untouched:

	case 'c':
		/* Pixel formats are printed LSB-first */
		val = swab32(orig & ~BIT(31));
		pixel_fmt = true;
		break;

It is a pity that "orig" is handled differently for the pixel and the generic
formats.

But I am afraid that there is no good solution. The code will
always be a mess when it tries to implement a messy definition.

It would be nice if the FourCC format was used consistently
in all subsystems in the first place.


IMPORTANT: This raises some questions.

	   Is there actually a standard for how to print the original
	   number in FourCC?

	   Do we really want to modify "orig" in the generic
	   implementation?

Best Regards,
Petr
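
To make the asymmetry concrete: in the patch, the 'c' path leaves "orig" (the value shown in the hex part) untouched and only changes "val", while the generic paths change both. One possible way to structure that without chained assignments is sketched below in userspace C. This is only an illustration under assumptions, not a proposed kernel change: decode_fourcc(), struct fourcc_decoded, swap32_sketch() and little_endian_host() are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Userspace stand-in for swab32(). */
static uint32_t swap32_sketch(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Runtime check of the host byte order. */
static bool little_endian_host(void)
{
	const uint16_t probe = 1;
	unsigned char first;

	memcpy(&first, &probe, 1);
	return first == 1;
}

struct fourcc_decoded {
	uint32_t orig;		/* value shown in the hex part */
	uint32_t val;		/* value whose bytes are printed MSByte-first */
	bool pixel_fmt;		/* true only for the DRM/V4L2 'c' variant */
};

/* Hypothetical helper: returns 0 on success, -1 for an unknown specifier. */
static int decode_fourcc(uint32_t raw, char spec, struct fourcc_decoded *d)
{
	d->pixel_fmt = false;

	switch (spec) {
	case 'c':
		/* Pixel formats: hex part keeps the raw value, bit 31 is a flag */
		d->orig = raw;
		d->val = swap32_sketch(raw & ~(UINT32_C(1) << 31));
		d->pixel_fmt = true;
		return 0;
	case 'h':
		break;
	case 'r':
		raw = swap32_sketch(raw);
		break;
	case 'l':
		if (!little_endian_host())
			raw = swap32_sketch(raw);
		break;
	case 'b':
		if (little_endian_host())
			raw = swap32_sketch(raw);
		break;
	default:
		return -1;
	}

	/* Generic formats: hex part and characters use the same value */
	d->orig = raw;
	d->val = raw;
	return 0;
}
```

The early return in the 'c' case is what keeps the two behaviours apart; whether that reads better than the chained assignments in the patch is a matter of taste.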
Russell King (Oracle) Nov. 14, 2022, 4:46 p.m. UTC | #5
On Mon, Nov 14, 2022 at 04:15:50PM +0000, Russell King (Oracle) wrote:
> On Mon, Nov 14, 2022 at 04:34:07PM +0100, Petr Mladek wrote:
> > >  	orig = get_unaligned(fourcc);
> > > -	val = orig & ~BIT(31);
> > > +	switch (fmt[2]) {
> > > +	case 'h':
> > > +		val = orig;
> > > +		break;
> > > +	case 'r':
> > > +		val = orig = swab32(orig);
> > 
> > I do not like much these multi assignments. I think that the result
> > was not even defined in some older C standards. Though, I can't find
> > it now. And even make W=3 does not warn about it.
> 
> Err.
> 
> It's been supported for decades. I learnt about it back in 1992 when
> I was introduced to C by another experienced C programmer. It's been
> supported in ANSI C compilers. The Norcroft C compiler (which is
> strict ANSI) on Acorn platforms back in the late 1980s/1990s even
> supported it.
> 
> I think you're a bit out of date.

Oh, and it's not like there isn't precedent for doing this in
lib/vsprintf.c:

841a915d20c7 vsprintf: Do not have bprintf dereference pointers

+                                       len = copy = strlen(args);

If you grep lib/, there are many more examples. So what is in Hector's
patch is no different from lots of other examples already merged into
the kernel code.
Petr Mladek Nov. 22, 2022, 12:43 p.m. UTC | #6
On Mon 2022-11-14 16:15:50, Russell King (Oracle) wrote:
> On Mon, Nov 14, 2022 at 04:34:07PM +0100, Petr Mladek wrote:
> > On Tue 2022-11-08 16:33:22, Russell King wrote:
> > > From: Hector Martin <marcan@marcan.st>
> > > 
> > > %p4cc is designed for DRM/V4L2 FOURCCs with their specific quirks, but
> > > it's useful to be able to print generic 4-character codes formatted as
> > > an integer. Extend it to add format specifiers for printing generic
> > > 32-bit FOURCCs with various endian semantics:
> > > 
> > > %p4ch   Host-endian
> > > %p4cl	Little-endian
> > > %p4cb	Big-endian
> > > %p4cr	Reverse-endian
> > > 
> > > The endianness determines how bytes are interpreted as a u32, and the
> > > FOURCC is then always printed MSByte-first (this is the opposite of
> > > V4L/DRM FOURCCs). This covers most practical cases, e.g. %p4cr would
> > > allow printing LSByte-first FOURCCs stored in host endian order
> > > (other than the hex form being in character order, not the integer
> > > value).
> > > 
> > > Signed-off-by: Hector Martin <marcan@marcan.st>
> > > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> > 
> > Reviewed-by: Petr Mladek <pmladek@suse.com>
> > 
> > See one nit below.
> > 
> > > --- a/lib/vsprintf.c
> > > +++ b/lib/vsprintf.c
> > > @@ -1762,27 +1762,50 @@ char *fourcc_string(char *buf, char *end, const u32 *fourcc,
> > >  	char output[sizeof("0123 little-endian (0x01234567)")];
> > >  	char *p = output;
> > >  	unsigned int i;
> > > +	bool pixel_fmt = false;
> > >  	u32 orig, val;
> > >  
> > > -	if (fmt[1] != 'c' || fmt[2] != 'c')
> > > +	if (fmt[1] != 'c')
> > >  		return error_string(buf, end, "(%p4?)", spec);
> > >  
> > >  	if (check_pointer(&buf, end, fourcc, spec))
> > >  		return buf;
> > >  
> > >  	orig = get_unaligned(fourcc);
> > > -	val = orig & ~BIT(31);
> > > +	switch (fmt[2]) {
> > > +	case 'h':
> > > +		val = orig;
> > > +		break;
> > > +	case 'r':
> > > +		val = orig = swab32(orig);
> > 
> > I do not like much these multi assignments. I think that the result
> > was not even defined in some older C standards. Though, I can't find
> > it now. And even make W=3 does not warn about it.
> 
> Err.
> 
> It's been supported for decades. I learnt about it back in 1992 when
> I was introduced to C by another experienced C programmer. It's been
> supported in ANSI C compilers. The Norcroft C compiler (which is
> strict ANSI) on Acorn platforms back in the late 1980s/1990s even
> supported it.

Ah, the problem probably was with a more complicated assignment.
For example, the result of the following code is not obvious:

	  a = b = a++;

Best Regards,
Petr
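
The distinction discussed above can be checked with a small userspace sketch. Assignment in C is right-associative, so "val = orig = swab32(orig)" parses as "val = (orig = swab32(orig))" and each object is written exactly once. The "a = b = a++" case is different because "a" is modified twice without sequencing between the two modifications, which is undefined behaviour in C, so it is deliberately left out of the code. The names swab32_demo(), chained_assign() and split_assign() are hypothetical.

```c
#include <stdint.h>

/* Userspace stand-in for the kernel's swab32(). */
static uint32_t swab32_demo(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Chained form, mirroring the style used in the patch. */
static uint32_t chained_assign(uint32_t *orig)
{
	uint32_t val;

	/* Parsed as val = (*orig = swab32_demo(*orig)) */
	val = *orig = swab32_demo(*orig);
	return val;
}

/* Equivalent form with the two assignments written out separately. */
static uint32_t split_assign(uint32_t *orig)
{
	uint32_t val;

	*orig = swab32_demo(*orig);
	val = *orig;
	return val;
}
```

Both helpers leave the same value in *orig and return the same result, so the choice between them is purely stylistic.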
Petr Mladek Nov. 22, 2022, 2:49 p.m. UTC | #7
On Tue 2022-11-08 16:33:22, Russell King wrote:
> From: Hector Martin <marcan@marcan.st>
> 
> %p4cc is designed for DRM/V4L2 FOURCCs with their specific quirks, but
> it's useful to be able to print generic 4-character codes formatted as
> an integer. Extend it to add format specifiers for printing generic
> 32-bit FOURCCs with various endian semantics:
> 
> %p4ch   Host-endian
> %p4cl	Little-endian
> %p4cb	Big-endian
> %p4cr	Reverse-endian
> 
> The endianness determines how bytes are interpreted as a u32, and the
> FOURCC is then always printed MSByte-first (this is the opposite of
> V4L/DRM FOURCCs). This covers most practical cases, e.g. %p4cr would
> allow printing LSByte-first FOURCCs stored in host endian order
> (other than the hex form being in character order, not the integer
> value).
> 
> Signed-off-by: Hector Martin <marcan@marcan.st>
> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> ---
>  Documentation/core-api/printk-formats.rst | 32 +++++++++++++++++++
>  lib/test_printf.c                         | 39 +++++++++++++++++++----
>  lib/vsprintf.c                            | 35 ++++++++++++++++----
>  3 files changed, 93 insertions(+), 13 deletions(-)
> 
> diff --git a/Documentation/core-api/printk-formats.rst b/Documentation/core-api/printk-formats.rst
> index dbe1aacc79d0..92a488884cf8 100644
> --- a/Documentation/core-api/printk-formats.rst
> +++ b/Documentation/core-api/printk-formats.rst
> @@ -625,6 +625,38 @@ Passed by reference.
>  	%p4cc	Y10  little-endian (0x20303159)
>  	%p4cc	NV12 big-endian (0xb231564e)
>  
> +Generic FourCC code
> +-------------------
> +
> +::
> +	%p4c[hrbl]	gP00 (0x67503030)
> +
> +Print a generic FourCC code, as both ASCII characters and its numerical
> +value as hexadecimal.
> +
> +The additional ``h``, ``r``, ``b``, and ``l`` specifiers are used to specify
> +host, reversed, big or little endian order data respectively. Host endian
> +order means the data is interpreted as a 32-bit integer and the most
> +significant byte is printed first; that is, the character code as printed
> +matches the byte order stored in memory on big-endian systems, and is reversed
> +on little-endian systems.

I thought a bit more about the semantics and got a bit confused.
It might be because I am not familiar with FourCC. Anyway,
the description in the commit message provided some more clues.

The following documentation looks clearer to me:

<proposal>
Generic FourCC code
-------------------

::
	%p4c[hrbl]	gP00 (0x67503030)

Print a generic FourCC code, as both ASCII characters and its numerical
value as hexadecimal.

The generic FourCC code is always printed in the big-endian format,
the most significant byte first. This is the opposite of V4L/DRM
FOURCCs.

The additional ``h``, ``r``, ``b``, and ``l`` specifiers define what
endianness is used to load the stored value as a 32-bit integer. The value
might be stored as host-endian, reverse-host-endian, big-endian,
or little-endian.

Examples for a little-endian machine, host native load &(u32)0x67503030::

	%p4ch	gP00 (0x67503030)
	%p4cr	00Pg (0x30305067)
	%p4cb	00Pg (0x30305067)
	%p4cl	gP00 (0x67503030)

Examples for a big-endian machine, host native load &(u32)0x67503030::

	%p4ch	gP00 (0x67503030)
	%p4cr	00Pg (0x30305067)
	%p4cb	gP00 (0x67503030)
	%p4cl	00Pg (0x30305067)
</proposal>

Best Regards,
Petr
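
The two example tables in the proposal can be reproduced with a small model that takes the host byte order as a parameter instead of detecting it. This is a sketch under assumptions, not kernel code: model_fourcc_value() and swap32_model() are hypothetical names, and "loaded" stands for the u32 as read natively by the host.

```c
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for swab32(). */
static uint32_t swap32_model(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/*
 * Hypothetical model of which 32-bit value ends up being printed for a
 * given specifier on a host of the given byte order. Stores the result
 * in *out and returns true, or returns false for an unknown specifier.
 */
static bool model_fourcc_value(uint32_t loaded, char spec, bool host_little,
			       uint32_t *out)
{
	switch (spec) {
	case 'h':	/* host-endian: use the value as loaded */
		*out = loaded;
		return true;
	case 'r':	/* reverse of host-endian: always swapped */
		*out = swap32_model(loaded);
		return true;
	case 'l':	/* little-endian data: swapped only on big-endian hosts */
		*out = host_little ? loaded : swap32_model(loaded);
		return true;
	case 'b':	/* big-endian data: swapped only on little-endian hosts */
		*out = host_little ? swap32_model(loaded) : loaded;
		return true;
	default:
		return false;
	}
}
```

For loaded = 0x67503030, the model gives 0x30305067 for 'b' on a little-endian host and for 'l' on a big-endian host, and 0x67503030 for 'h' on both, in line with the tables above.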

Patch

diff --git a/Documentation/core-api/printk-formats.rst b/Documentation/core-api/printk-formats.rst
index dbe1aacc79d0..92a488884cf8 100644
--- a/Documentation/core-api/printk-formats.rst
+++ b/Documentation/core-api/printk-formats.rst
@@ -625,6 +625,38 @@  Passed by reference.
 	%p4cc	Y10  little-endian (0x20303159)
 	%p4cc	NV12 big-endian (0xb231564e)
 
+Generic FourCC code
+-------------------
+
+::
+	%p4c[hrbl]	gP00 (0x67503030)
+
+Print a generic FourCC code, as both ASCII characters and its numerical
+value as hexadecimal.
+
+The additional ``h``, ``r``, ``b``, and ``l`` specifiers are used to specify
+host, reversed, big or little endian order data respectively. Host endian
+order means the data is interpreted as a 32-bit integer and the most
+significant byte is printed first; that is, the character code as printed
+matches the byte order stored in memory on big-endian systems, and is reversed
+on little-endian systems.
+
+Passed by reference.
+
+Examples for a little-endian machine, given &(u32)0x67503030::
+
+	%p4ch	gP00 (0x67503030)
+	%p4cr	00Pg (0x30305067)
+	%p4cb	00Pg (0x30305067)
+	%p4cl	gP00 (0x67503030)
+
+Examples for a big-endian machine, given &(u32)0x67503030::
+
+	%p4ch	gP00 (0x67503030)
+	%p4cr	00Pg (0x30305067)
+	%p4cb	gP00 (0x67503030)
+	%p4cl	00Pg (0x30305067)
+
 Rust
 ----
 
diff --git a/lib/test_printf.c b/lib/test_printf.c
index 4bd15a593fbd..77a9128a6b5a 100644
--- a/lib/test_printf.c
+++ b/lib/test_printf.c
@@ -731,21 +731,46 @@  static void __init fwnode_pointer(void)
 	software_node_unregister_nodes(softnodes);
 }
 
+struct fourcc_struct {
+	u32 code;
+	const char *str;
+};
+
+static void __init fourcc_pointer_test(const struct fourcc_struct *fc, size_t n,
+				       const char *fmt)
+{
+	size_t i;
+
+	for (i = 0; i < n; i++)
+		test(fc[i].str, fmt, &fc[i].code);
+}
+
 static void __init fourcc_pointer(void)
 {
-	struct {
-		u32 code;
-		char *str;
-	} const try[] = {
+	struct fourcc_struct const try_cc[] = {
 		{ 0x3231564e, "NV12 little-endian (0x3231564e)", },
 		{ 0xb231564e, "NV12 big-endian (0xb231564e)", },
 		{ 0x10111213, ".... little-endian (0x10111213)", },
 		{ 0x20303159, "Y10  little-endian (0x20303159)", },
 	};
-	unsigned int i;
+	struct fourcc_struct const try_ch = {
+		0x41424344, "ABCD (0x41424344)",
+	};
+	struct fourcc_struct const try_cr = {
+		0x41424344, "DCBA (0x44434241)",
+	};
+	struct fourcc_struct const try_cl = {
+		le32_to_cpu(0x41424344), "ABCD (0x41424344)",
+	};
+	struct fourcc_struct const try_cb = {
+		be32_to_cpu(0x41424344), "ABCD (0x41424344)",
+	};
 
-	for (i = 0; i < ARRAY_SIZE(try); i++)
-		test(try[i].str, "%p4cc", &try[i].code);
+	fourcc_pointer_test(try_cc, ARRAY_SIZE(try_cc), "%p4cc");
+	fourcc_pointer_test(&try_ch, 1, "%p4ch");
+	fourcc_pointer_test(&try_cr, 1, "%p4cr");
+	fourcc_pointer_test(&try_cl, 1, "%p4cl");
+	fourcc_pointer_test(&try_cb, 1, "%p4cb");
 }
 
 static void __init
diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index 24f37bab8bc1..17064b839f19 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -1762,27 +1762,50 @@  char *fourcc_string(char *buf, char *end, const u32 *fourcc,
 	char output[sizeof("0123 little-endian (0x01234567)")];
 	char *p = output;
 	unsigned int i;
+	bool pixel_fmt = false;
 	u32 orig, val;
 
-	if (fmt[1] != 'c' || fmt[2] != 'c')
+	if (fmt[1] != 'c')
 		return error_string(buf, end, "(%p4?)", spec);
 
 	if (check_pointer(&buf, end, fourcc, spec))
 		return buf;
 
 	orig = get_unaligned(fourcc);
-	val = orig & ~BIT(31);
+	switch (fmt[2]) {
+	case 'h':
+		val = orig;
+		break;
+	case 'r':
+		val = orig = swab32(orig);
+		break;
+	case 'l':
+		val = orig = le32_to_cpu(orig);
+		break;
+	case 'b':
+		val = orig = be32_to_cpu(orig);
+		break;
+	case 'c':
+		/* Pixel formats are printed LSB-first */
+		val = swab32(orig & ~BIT(31));
+		pixel_fmt = true;
+		break;
+	default:
+		return error_string(buf, end, "(%p4?)", spec);
+	}
 
 	for (i = 0; i < sizeof(u32); i++) {
-		unsigned char c = val >> (i * 8);
+		unsigned char c = val >> ((3 - i) * 8);
 
 		/* Print non-control ASCII characters as-is, dot otherwise */
 		*p++ = isascii(c) && isprint(c) ? c : '.';
 	}
 
-	*p++ = ' ';
-	strcpy(p, orig & BIT(31) ? "big-endian" : "little-endian");
-	p += strlen(p);
+	if (pixel_fmt) {
+		*p++ = ' ';
+		strcpy(p, orig & BIT(31) ? "big-endian" : "little-endian");
+		p += strlen(p);
+	}
 
 	*p++ = ' ';
 	*p++ = '(';