Message ID | 20241017035940.4067922-1-payamm@google.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | acpi: zero-initialize acpi_object union structure | expand |
On Thu, Oct 17, 2024 at 5:59 AM Payam Moradshahi <payamm@google.com> wrote: > > The way in which acpi_object union is being initialized varies based on > compiler type, version and flags used. Some will zero-initialize the > entire union structure and some will only initialize the first N-bytes > of the union structure. Any details? > This could lead to uninitialized union members. So this is working around a compiler bug AFAICS. If the compiler has this bug, is it guaranteed to compile the rest of the kernel correctly? > This bug was confirmed by observing non-zero value for object->processor > structure variables. Where has it been observed? What compiler version(s)? etc. > non-zero initialized members of acpi_object union structure causes > incorrect error reporting by the driver. > > If a BIOS is using "Device" statement as opposed to "Processor" > statement, object variable may contain uninitialized members causing the > driver to report "Invalid PBLK length" incorrectly. > > Using memset to zero-initialize the union structure fixes this issue and > also removes the dependency of this function on compiler versions and > flags being used. > > Tested: Tested on ARM64 hardware that was printing this error and > confirmed the prints were gone. > > Also confirmed this does not cause regression on ARM64 and X86 > machines. > > Signed-off-by: Payam Moradshahi <payamm@google.com> > --- > drivers/acpi/acpi_processor.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c > index 7cf6101cb4c73..6696ad4937d21 100644 > --- a/drivers/acpi/acpi_processor.c > +++ b/drivers/acpi/acpi_processor.c > @@ -275,7 +275,7 @@ static inline int acpi_processor_hotadd_init(struct acpi_processor *pr, > > static int acpi_processor_get_info(struct acpi_device *device) > { > - union acpi_object object = { 0 }; > + union acpi_object object; > struct acpi_buffer buffer = { sizeof(union acpi_object), &object }; > struct acpi_processor *pr = acpi_driver_data(device); > int device_declaration = 0; > @@ -284,6 +284,8 @@ static int acpi_processor_get_info(struct acpi_device *device) > unsigned long long value; > int ret; > > + memset(&object, 0, sizeof(union acpi_object)); > + > acpi_processor_errata(); > > /* > --
Hi Rafael, Thank you for your response. Please see below for my response. Payam On Mon, Oct 21, 2024 at 01:31:39PM GMT, Rafael J. Wysocki wrote: > On Thu, Oct 17, 2024 at 5:59 AM Payam Moradshahi <payamm@google.com> > wrote: > > > > The way in which acpi_object union is being initialized varies based on > > compiler type, version and flags used. Some will zero-initialize the > > entire union structure and some will only initialize the first N-bytes > > of the union structure. > Any details? I dumped acpi_object after initialization and noticed either the entire structure was zero-initialized or just the first 8 bytes depending on compiler type and version. gcc 13.2.0: bad CLANG_CL=362121269: good CLANG_CL=609443126: bad CLANG_CL=684773960: good > > This could lead to uninitialized union members. > So this is working around a compiler bug AFAICS. > If the compiler has this bug, is it guaranteed to compile the rest of > the kernel correctly? This is not a compiler bug. The way acpi_object union is being initialized is not c-spec compliant. Based on C Standard ISO/IEC 9899:202x (E): Section 6.2.6.1 (7) says: When a value is stored in a member of an object of union type, the bytes of the object representation that do not correspond to that member but do correspond to other members take unspecified values Section 6.7.9 says: If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate. If an object that has static or thread storage duration is not initialized explicitly, then: - if it is a union, the first named member is initialized (recursively) according to these rules, and any padding is initialized to zero bits; The above guarantees zero only in the case of static or thread storage, which is not the case here. > > This bug was confirmed by observing non-zero value for object->processor > > structure variables. > Where has it been observed? What compiler version(s)? etc. See above for some examples > > non-zero initialized members of acpi_object union structure causes > > incorrect error reporting by the driver. > > > > If a BIOS is using "Device" statement as opposed to "Processor" > > statement, object variable may contain uninitialized members causing the > > driver to report "Invalid PBLK length" incorrectly. > > > > Using memset to zero-initialize the union structure fixes this issue and > > also removes the dependency of this function on compiler versions and > > flags being used. > > > > Tested: Tested on ARM64 hardware that was printing this error and > > confirmed the prints were gone. > > > > Also confirmed this does not cause regression on ARM64 and X86 > > machines. > > > > Signed-off-by: Payam Moradshahi <payamm@google.com> > > --- > > drivers/acpi/acpi_processor.c | 4 +++- > > 1 file changed, 3 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/acpi/acpi_processor.c > b/drivers/acpi/acpi_processor.c > > index 7cf6101cb4c73..6696ad4937d21 100644 > > --- a/drivers/acpi/acpi_processor.c > > +++ b/drivers/acpi/acpi_processor.c > > @@ -275,7 +275,7 @@ static inline int acpi_processor_hotadd_init(struct > acpi_processor *pr, > > > > static int acpi_processor_get_info(struct acpi_device *device) > > { > > - union acpi_object object = { 0 }; > > + union acpi_object object; > > struct acpi_buffer buffer = { sizeof(union acpi_object), > &object }; > > struct acpi_processor *pr = acpi_driver_data(device); > > int device_declaration = 0; > > @@ -284,6 +284,8 @@ static int acpi_processor_get_info(struct > acpi_device *device) > > unsigned long long value; > > int ret; > > > > + memset(&object, 0, sizeof(union acpi_object)); > > + > > acpi_processor_errata(); > > > > /* > > --
Hi Rafael, On Thu, Oct 24, 2024 at 02:58:27PM GMT, Payam Moradshahi wrote: > Hi Rafael, > Thank you for your response. Please see below for my response. > Payam > On Mon, Oct 21, 2024 at 01:31:39PM GMT, Rafael J. Wysocki wrote: > > On Thu, Oct 17, 2024 at 5:59 AM Payam Moradshahi <payamm@google.com> > > wrote: > > > > > > The way in which acpi_object union is being initialized varies based > on > > > compiler type, version and flags used. Some will zero-initialize the > > > entire union structure and some will only initialize the first N-bytes > > > of the union structure. > > Any details? > I dumped acpi_object after initialization and noticed either the > entire structure was zero-initialized or just the first 8 bytes > depending on compiler type and version. > gcc 13.2.0: bad > CLANG_CL=362121269: good > CLANG_CL=609443126: bad > CLANG_CL=684773960: good For reference the latter ones are Google internal builds of clang, but we've played around in godbolt with this in a variety of versions and some work and some don't. Payam has the relevant section from the C standard below. > > > This could lead to uninitialized union members. > > So this is working around a compiler bug AFAICS. > > If the compiler has this bug, is it guaranteed to compile the rest of > > the kernel correctly? > This is not a compiler bug. The way acpi_object union is being > initialized is not c-spec compliant. I think in past versions we might've gotten lucky with this, recent compilers might've tightened this up some or changed behavior. > Based on C Standard ISO/IEC 9899:202x (E): > Section 6.2.6.1 (7) says: > When a value is stored in a member of an object of union type, the bytes > of > the > object representation that do not correspond to that member but do > correspond > to other members take unspecified values > Section 6.7.9 says: > If an object that has automatic storage duration is not initialized > explicitly, > its value is indeterminate. > If an object that has static or thread storage duration is not initialized > explicitly, then: > - if it is a union, the first named member is initialized (recursively) > according to these rules, and any padding is initialized to zero bits; > The above guarantees zero only in the case of static or thread storage, > which is not the case here. > > > This bug was confirmed by observing non-zero value for > object->processor > > > structure variables. > > Where has it been observed? What compiler version(s)? etc. > See above for some examples This manifests like this on Google Axion arm64 builds using Arm's prebuilt GCC 13.2.0: [ 1.844508] acpi ACPI0007:08: Invalid PBLK length [271170112] [ 1.850253] acpi ACPI0007:09: Invalid PBLK length [271170112] [ 1.855992] acpi ACPI0007:0a: Invalid PBLK length [271170112] [ 1.861730] acpi ACPI0007:0b: Invalid PBLK length [271170112] [ 1.867470] acpi ACPI0007:0c: Invalid PBLK length [271170112] [ 1.873208] acpi ACPI0007:0d: Invalid PBLK length [271170112] [ 1.878947] acpi ACPI0007:0e: Invalid PBLK length [271170112] [ 1.884686] acpi ACPI0007:0f: Invalid PBLK length [271170112] [ 1.890424] acpi ACPI0007:10: Invalid PBLK length [271170112] [ 1.896161] acpi ACPI0007:11: Invalid PBLK length [271170112] [ 1.901898] acpi ACPI0007:12: Invalid PBLK length [271170112] [ 1.907636] acpi ACPI0007:13: Invalid PBLK length [271170112] [ 1.913374] acpi ACPI0007:14: Invalid PBLK length [271170112] [ 1.919113] acpi ACPI0007:15: Invalid PBLK length [271170112] [ 1.924851] acpi ACPI0007:16: Invalid PBLK length [271170112] > > > non-zero initialized members of acpi_object union structure causes > > > incorrect error reporting by the driver. > > > > > > If a BIOS is using "Device" statement as opposed to "Processor" > > > statement, object variable may contain uninitialized members causing > the > > > driver to report "Invalid PBLK length" incorrectly. > > > > > > Using memset to zero-initialize the union structure fixes this issue > and > > > also removes the dependency of this function on compiler versions and > > > flags being used. > > > > > > Tested: Tested on ARM64 hardware that was printing this error and > > > confirmed the prints were gone. > > > > > > Also confirmed this does not cause regression on ARM64 and X86 > > > machines. > > > > > > Signed-off-by: Payam Moradshahi <payamm@google.com> > > > --- > > > drivers/acpi/acpi_processor.c | 4 +++- > > > 1 file changed, 3 insertions(+), 1 deletion(-) > > > > > > diff --git a/drivers/acpi/acpi_processor.c > > b/drivers/acpi/acpi_processor.c > > > index 7cf6101cb4c73..6696ad4937d21 100644 > > > --- a/drivers/acpi/acpi_processor.c > > > +++ b/drivers/acpi/acpi_processor.c > > > @@ -275,7 +275,7 @@ static inline int > > acpi_processor_hotadd_init(struct acpi_processor *pr, > > > > > > static int acpi_processor_get_info(struct acpi_device *device) > > > { > > > - union acpi_object object = { 0 }; > > > + union acpi_object object; > > > struct acpi_buffer buffer = { sizeof(union acpi_object), > > &object }; > > > struct acpi_processor *pr = acpi_driver_data(device); > > > int device_declaration = 0; > > > @@ -284,6 +284,8 @@ static int acpi_processor_get_info(struct > > acpi_device *device) > > > unsigned long long value; > > > int ret; > > > > > > + memset(&object, 0, sizeof(union acpi_object)); > > > + > > > acpi_processor_errata(); > > > > > > /* > > > -- Payam, it might be good to add a log snippet and references to the relevant spec sections to your v2 commit message. Moreover, for v2 of the patch you probably want to add a Cc: stable@vger.kernel.org to your patch to make sure it gets picked up once it gets picked up for mainline. See [1] for more info on the stable process. Cheers, Moritz [1] https://www.kernel.org/doc/Documentation/process/stable-kernel-rules.rst
diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c index 7cf6101cb4c73..6696ad4937d21 100644 --- a/drivers/acpi/acpi_processor.c +++ b/drivers/acpi/acpi_processor.c @@ -275,7 +275,7 @@ static inline int acpi_processor_hotadd_init(struct acpi_processor *pr, static int acpi_processor_get_info(struct acpi_device *device) { - union acpi_object object = { 0 }; + union acpi_object object; struct acpi_buffer buffer = { sizeof(union acpi_object), &object }; struct acpi_processor *pr = acpi_driver_data(device); int device_declaration = 0; @@ -284,6 +284,8 @@ static int acpi_processor_get_info(struct acpi_device *device) unsigned long long value; int ret; + memset(&object, 0, sizeof(union acpi_object)); + acpi_processor_errata(); /*
The way in which acpi_object union is being initialized varies based on compiler type, version and flags used. Some will zero-initialize the entire union structure and some will only initialize the first N-bytes of the union structure. This could lead to uninitialized union members. This bug was confirmed by observing non-zero value for object->processor structure variables. non-zero initialized members of acpi_object union structure causes incorrect error reporting by the driver. If a BIOS is using "Device" statement as opposed to "Processor" statement, object variable may contain uninitialized members causing the driver to report "Invalid PBLK length" incorrectly. Using memset to zero-initialize the union structure fixes this issue and also removes the dependency of this function on compiler versions and flags being used. Tested: Tested on ARM64 hardware that was printing this error and confirmed the prints were gone. Also confirmed this does not cause regression on ARM64 and X86 machines. Signed-off-by: Payam Moradshahi <payamm@google.com> --- drivers/acpi/acpi_processor.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)