Message ID | 20220123091004.763775-1-ztong0001@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Series | [v2] drm/privacy-screen: honor acpi=off in detect_thinkpad_privacy_screen |
Hi All,

On 1/23/22 10:10, Tong Zhang wrote:
> when acpi=off is provided in bootarg, kernel crash with
>
> [ 1.252739] BUG: kernel NULL pointer dereference, address: 0000000000000018
> [ 1.258308] Call Trace:
> [ 1.258490] ? acpi_walk_namespace+0x147/0x147
> [ 1.258770] acpi_get_devices+0xe4/0x137
> [ 1.258921] ? drm_core_init+0xc0/0xc0 [drm]
> [ 1.259108] detect_thinkpad_privacy_screen+0x5e/0xa8 [drm]
> [ 1.259337] drm_privacy_screen_lookup_init+0xe/0xe85 [drm]
>
> The reason is that acpi_walk_namespace expects acpi related stuff
> initialized but in fact it wouldn't when acpi is set to off. In this case
> we should honor acpi=off in detect_thinkpad_privacy_screen().
>
> Signed-off-by: Tong Zhang <ztong0001@gmail.com>

Thank you for catching this and thank you for your patch. I was about to merge
this, but then I realized that this might not be the best way to fix this.

A quick grep shows 10 acpi_get_devices() calls outside of drivers/acpi,
and at a first glance about half of those are missing an acpi_disabled
check. IMHO it would be better to simply add an acpi_disabled check to
acpi_get_devices() itself.

Rafael, do you agree ?

Note the just added chrome privacy-screen check uses
acpi_dev_present(), this is also used in about 10 places outside
of drivers/acpi and AFAIK none of those do an acpi_disabled check.

acpi_dev_present() uses bus_find_device(&acpi_bus_type, ...)
but the acpi_bus_type does not get registered when acpi_disabled
is set. In the end this is fine though since bus_find_device
checks for the bus not being registered and then just returns
NULL.

Regards,

Hans

> ---
> v2: fix typo in previous commit -- my keyboard is eating letters
>
> drivers/gpu/drm/drm_privacy_screen_x86.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/drm_privacy_screen_x86.c b/drivers/gpu/drm/drm_privacy_screen_x86.c
> index a2cafb294ca6..e7aa74ad0b24 100644
> --- a/drivers/gpu/drm/drm_privacy_screen_x86.c
> +++ b/drivers/gpu/drm/drm_privacy_screen_x86.c
> @@ -33,6 +33,9 @@ static bool __init detect_thinkpad_privacy_screen(void)
>  	unsigned long long output;
>  	acpi_status status;
>
> +	if (acpi_disabled)
> +		return false;
> +
>  	/* Get embedded-controller handle */
>  	status = acpi_get_devices("PNP0C09", acpi_set_handle, NULL, &ec_handle);
>  	if (ACPI_FAILURE(status) || !ec_handle)
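The point about acpi_dev_present() being safe can be illustrated with a small userspace model. This is only a sketch of the behaviour described in the message above, not kernel code: `mock_bus`, `mock_device`, `mock_bus_find_device()`, `mock_dev_present()` and the `MOCK0001` ID are all hypothetical names invented for the example. It assumes, as stated above, that a lookup on a bus that was never registered returns NULL instead of walking anything.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; not the kernel's real types. */
struct mock_device {
	const char *hid;
};

struct mock_bus {
	bool registered;                /* false models acpi=off: bus never registered */
	const struct mock_device *devs;
	size_t ndevs;
};

/* Models the described behaviour: an unregistered bus yields NULL
 * before any device list is touched. */
static const struct mock_device *
mock_bus_find_device(const struct mock_bus *bus, const char *hid)
{
	if (!bus || !bus->registered)
		return NULL;

	for (size_t i = 0; i < bus->ndevs; i++) {
		if (strcmp(bus->devs[i].hid, hid) == 0)
			return &bus->devs[i];
	}
	return NULL;
}

/* Presence check built on the lookup, so it inherits the NULL-safe path. */
static bool mock_dev_present(const struct mock_bus *bus, const char *hid)
{
	return mock_bus_find_device(bus, hid) != NULL;
}

/* Example data: one bus as it would look with ACPI on, one with ACPI off. */
static const struct mock_device mock_devs[] = { { "PNP0C09" }, { "MOCK0001" } };
static const struct mock_bus mock_bus_on = { true, mock_devs, 2 };
static const struct mock_bus mock_bus_off = { false, NULL, 0 };
```

With this model, `mock_dev_present(&mock_bus_off, "PNP0C09")` simply reports false, which is why callers on that path would not need their own disabled check.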
Hi,

On 1/26/22 14:47, Hans de Goede wrote:
> Hi All,
>
> On 1/23/22 10:10, Tong Zhang wrote:
>> when acpi=off is provided in bootarg, kernel crash with
>>
>> [ 1.252739] BUG: kernel NULL pointer dereference, address: 0000000000000018
>> [ 1.258308] Call Trace:
>> [ 1.258490] ? acpi_walk_namespace+0x147/0x147
>> [ 1.258770] acpi_get_devices+0xe4/0x137
>> [ 1.258921] ? drm_core_init+0xc0/0xc0 [drm]
>> [ 1.259108] detect_thinkpad_privacy_screen+0x5e/0xa8 [drm]
>> [ 1.259337] drm_privacy_screen_lookup_init+0xe/0xe85 [drm]
>>
>> The reason is that acpi_walk_namespace expects acpi related stuff
>> initialized but in fact it wouldn't when acpi is set to off. In this case
>> we should honor acpi=off in detect_thinkpad_privacy_screen().
>>
>> Signed-off-by: Tong Zhang <ztong0001@gmail.com>
>
> Thank you for catching this and thank you for your patch. I was about to merge
> this, but then I realized that this might not be the best way to fix this.
>
> A quick grep shows 10 acpi_get_devices() calls outside of drivers/acpi,
> and at a first glance about half of those are missing an acpi_disabled
> check. IMHO it would be better to simply add an acpi_disabled check to
> acpi_get_devices() itself.
>
> Rafael, do you agree ?

Never mind, I just saw that acpi_get_devices() is part of acpica, whereas
the acpi_disabled flag is not. So callers need to check acpi_disabled before
calling acpi_get_devices().

I'll go and push this patch to drm-misc-fixes now.

Regards,

Hans

>> ---
>> v2: fix typo in previous commit -- my keyboard is eating letters
>>
>> drivers/gpu/drm/drm_privacy_screen_x86.c | 3 +++
>> 1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/drm_privacy_screen_x86.c b/drivers/gpu/drm/drm_privacy_screen_x86.c
>> index a2cafb294ca6..e7aa74ad0b24 100644
>> --- a/drivers/gpu/drm/drm_privacy_screen_x86.c
>> +++ b/drivers/gpu/drm/drm_privacy_screen_x86.c
>> @@ -33,6 +33,9 @@ static bool __init detect_thinkpad_privacy_screen(void)
>>  	unsigned long long output;
>>  	acpi_status status;
>>
>> +	if (acpi_disabled)
>> +		return false;
>> +
>>  	/* Get embedded-controller handle */
>>  	status = acpi_get_devices("PNP0C09", acpi_set_handle, NULL, &ec_handle);
>>  	if (ACPI_FAILURE(status) || !ec_handle)
On Wed, Jan 26, 2022 at 2:47 PM Hans de Goede <hdegoede@redhat.com> wrote:
>
> Hi All,
>
> On 1/23/22 10:10, Tong Zhang wrote:
> > when acpi=off is provided in bootarg, kernel crash with
> >
> > [ 1.252739] BUG: kernel NULL pointer dereference, address: 0000000000000018
> > [ 1.258308] Call Trace:
> > [ 1.258490] ? acpi_walk_namespace+0x147/0x147
> > [ 1.258770] acpi_get_devices+0xe4/0x137
> > [ 1.258921] ? drm_core_init+0xc0/0xc0 [drm]
> > [ 1.259108] detect_thinkpad_privacy_screen+0x5e/0xa8 [drm]
> > [ 1.259337] drm_privacy_screen_lookup_init+0xe/0xe85 [drm]
> >
> > The reason is that acpi_walk_namespace expects acpi related stuff
> > initialized but in fact it wouldn't when acpi is set to off. In this case
> > we should honor acpi=off in detect_thinkpad_privacy_screen().
> >
> > Signed-off-by: Tong Zhang <ztong0001@gmail.com>
>
> Thank you for catching this and thank you for your patch. I was about to merge
> this, but then I realized that this might not be the best way to fix this.
>
> A quick grep shows 10 acpi_get_devices() calls outside of drivers/acpi,
> and at a first glance about half of those are missing an acpi_disabled
> check. IMHO it would be better to simply add an acpi_disabled check to
> acpi_get_devices() itself.
>
> Rafael, do you agree ?

Yes, I do.

> Note the just added chrome privacy-screen check uses
> acpi_dev_present(), this is also used in about 10 places outside
> of drivers/acpi and AFAIK none of those do an acpi_disabled check.
>
> acpi_dev_present() uses bus_find_device(&acpi_bus_type, ...)
> but the acpi_bus_type does not get registered when acpi_disabled
> is set. In the end this is fine though since bus_find_device
> checks for the bus not being registered and then just returns
> NULL.

Right.

> > ---
> > v2: fix typo in previous commit -- my keyboard is eating letters
> >
> > drivers/gpu/drm/drm_privacy_screen_x86.c | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/drm_privacy_screen_x86.c b/drivers/gpu/drm/drm_privacy_screen_x86.c
> > index a2cafb294ca6..e7aa74ad0b24 100644
> > --- a/drivers/gpu/drm/drm_privacy_screen_x86.c
> > +++ b/drivers/gpu/drm/drm_privacy_screen_x86.c
> > @@ -33,6 +33,9 @@ static bool __init detect_thinkpad_privacy_screen(void)
> >  	unsigned long long output;
> >  	acpi_status status;
> >
> > +	if (acpi_disabled)
> > +		return false;
> > +
> >  	/* Get embedded-controller handle */
> >  	status = acpi_get_devices("PNP0C09", acpi_set_handle, NULL, &ec_handle);
> >  	if (ACPI_FAILURE(status) || !ec_handle)
>
Hi,

On 1/26/22 16:54, Rafael J. Wysocki wrote:
> On Wed, Jan 26, 2022 at 2:47 PM Hans de Goede <hdegoede@redhat.com> wrote:
>>
>> Hi All,
>>
>> On 1/23/22 10:10, Tong Zhang wrote:
>>> when acpi=off is provided in bootarg, kernel crash with
>>>
>>> [ 1.252739] BUG: kernel NULL pointer dereference, address: 0000000000000018
>>> [ 1.258308] Call Trace:
>>> [ 1.258490] ? acpi_walk_namespace+0x147/0x147
>>> [ 1.258770] acpi_get_devices+0xe4/0x137
>>> [ 1.258921] ? drm_core_init+0xc0/0xc0 [drm]
>>> [ 1.259108] detect_thinkpad_privacy_screen+0x5e/0xa8 [drm]
>>> [ 1.259337] drm_privacy_screen_lookup_init+0xe/0xe85 [drm]
>>>
>>> The reason is that acpi_walk_namespace expects acpi related stuff
>>> initialized but in fact it wouldn't when acpi is set to off. In this case
>>> we should honor acpi=off in detect_thinkpad_privacy_screen().
>>>
>>> Signed-off-by: Tong Zhang <ztong0001@gmail.com>
>>
>> Thank you for catching this and thank you for your patch. I was about to merge
>> this, but then I realized that this might not be the best way to fix this.
>>
>> A quick grep shows 10 acpi_get_devices() calls outside of drivers/acpi,
>> and at a first glance about half of those are missing an acpi_disabled
>> check. IMHO it would be better to simply add an acpi_disabled check to
>> acpi_get_devices() itself.
>>
>> Rafael, do you agree ?
>
> Yes, I do.

Did you see my follow-up that that is not going to work because
acpi_get_devices() is an acpica function ?

Regards,

Hans

>>> ---
>>> v2: fix typo in previous commit -- my keyboard is eating letters
>>>
>>> drivers/gpu/drm/drm_privacy_screen_x86.c | 3 +++
>>> 1 file changed, 3 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/drm_privacy_screen_x86.c b/drivers/gpu/drm/drm_privacy_screen_x86.c
>>> index a2cafb294ca6..e7aa74ad0b24 100644
>>> --- a/drivers/gpu/drm/drm_privacy_screen_x86.c
>>> +++ b/drivers/gpu/drm/drm_privacy_screen_x86.c
>>> @@ -33,6 +33,9 @@ static bool __init detect_thinkpad_privacy_screen(void)
>>>  	unsigned long long output;
>>>  	acpi_status status;
>>>
>>> +	if (acpi_disabled)
>>> +		return false;
>>> +
>>>  	/* Get embedded-controller handle */
>>>  	status = acpi_get_devices("PNP0C09", acpi_set_handle, NULL, &ec_handle);
>>>  	if (ACPI_FAILURE(status) || !ec_handle)
>>
>
On Wed, Jan 26, 2022 at 5:41 PM Hans de Goede <hdegoede@redhat.com> wrote:
>
> Hi,
>
> On 1/26/22 16:54, Rafael J. Wysocki wrote:
> > On Wed, Jan 26, 2022 at 2:47 PM Hans de Goede <hdegoede@redhat.com> wrote:
> >>
> >> Hi All,
> >>
> >> On 1/23/22 10:10, Tong Zhang wrote:
> >>> when acpi=off is provided in bootarg, kernel crash with
> >>>
> >>> [ 1.252739] BUG: kernel NULL pointer dereference, address: 0000000000000018
> >>> [ 1.258308] Call Trace:
> >>> [ 1.258490] ? acpi_walk_namespace+0x147/0x147
> >>> [ 1.258770] acpi_get_devices+0xe4/0x137
> >>> [ 1.258921] ? drm_core_init+0xc0/0xc0 [drm]
> >>> [ 1.259108] detect_thinkpad_privacy_screen+0x5e/0xa8 [drm]
> >>> [ 1.259337] drm_privacy_screen_lookup_init+0xe/0xe85 [drm]
> >>>
> >>> The reason is that acpi_walk_namespace expects acpi related stuff
> >>> initialized but in fact it wouldn't when acpi is set to off. In this case
> >>> we should honor acpi=off in detect_thinkpad_privacy_screen().
> >>>
> >>> Signed-off-by: Tong Zhang <ztong0001@gmail.com>
> >>
> >> Thank you for catching this and thank you for your patch. I was about to merge
> >> this, but then I realized that this might not be the best way to fix this.
> >>
> >> A quick grep shows 10 acpi_get_devices() calls outside of drivers/acpi,
> >> and at a first glance about half of those are missing an acpi_disabled
> >> check. IMHO it would be better to simply add an acpi_disabled check to
> >> acpi_get_devices() itself.
> >>
> >> Rafael, do you agree ?
> >
> > Yes, I do.
>
> Did you see my follow-up that that is not going to work because
> acpi_get_devices() is an acpica function ?

No, I didn't, but it is possible to add a wrapper doing the check
around it and convert all of the users.

Alternatively, the ACPICA function can check acpi_gbl_root_node
against NULL, like in the attached (untested) patch.
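The two alternatives named in this message (a wrapper that checks the disabled flag, or a NULL check on the root node inside the walker itself) can be compared in a small userspace model. This is a sketch under stated assumptions and is not the attached patch, which is not reproduced in this thread: `mock_acpi`, `mock_node`, `mock_walk_raw()`, `mock_walk_wrapped()`, `mock_walk_guarded()` and `MOCK_NO_NAMESPACE` are hypothetical names that only mimic the shape of the problem.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All names below are hypothetical stand-ins, not kernel or ACPICA symbols. */
struct mock_node {
	int device_count;
};

struct mock_acpi {
	bool disabled;                 /* models the acpi_disabled flag */
	const struct mock_node *root;  /* models the namespace root; NULL if never set up */
};

#define MOCK_NO_NAMESPACE (-1)

/* Unguarded walker: dereferences the root unconditionally, so it must never
 * be reached with root == NULL. This models the crashing path. */
static int mock_walk_raw(const struct mock_acpi *acpi)
{
	return acpi->root->device_count;
}

/* Alternative 1: a wrapper checks the disabled flag, then calls the walker.
 * Every caller has to be converted to use the wrapper. */
static int mock_walk_wrapped(const struct mock_acpi *acpi)
{
	if (acpi->disabled)
		return MOCK_NO_NAMESPACE;
	return mock_walk_raw(acpi);
}

/* Alternative 2: the walker itself checks the root pointer against NULL.
 * Callers stay unchanged and need no knowledge of the disabled flag. */
static int mock_walk_guarded(const struct mock_acpi *acpi)
{
	if (!acpi->root)
		return MOCK_NO_NAMESPACE;
	return acpi->root->device_count;
}

/* Example states: ACPI off (no namespace) and ACPI on (three devices). */
static const struct mock_node mock_namespace = { 3 };
static const struct mock_acpi mock_acpi_off = { true, NULL };
static const struct mock_acpi mock_acpi_on = { false, &mock_namespace };
```

In this model both variants return `MOCK_NO_NAMESPACE` for `mock_acpi_off`. The difference is where the knowledge lives: the wrapper depends on a flag owned by the caller's side, while the guarded walker depends only on state it already owns.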
Hi,

On 1/26/22 18:11, Rafael J. Wysocki wrote:
> On Wed, Jan 26, 2022 at 5:41 PM Hans de Goede <hdegoede@redhat.com> wrote:
>>
>> Hi,
>>
>> On 1/26/22 16:54, Rafael J. Wysocki wrote:
>>> On Wed, Jan 26, 2022 at 2:47 PM Hans de Goede <hdegoede@redhat.com> wrote:
>>>>
>>>> Hi All,
>>>>
>>>> On 1/23/22 10:10, Tong Zhang wrote:
>>>>> when acpi=off is provided in bootarg, kernel crash with
>>>>>
>>>>> [ 1.252739] BUG: kernel NULL pointer dereference, address: 0000000000000018
>>>>> [ 1.258308] Call Trace:
>>>>> [ 1.258490] ? acpi_walk_namespace+0x147/0x147
>>>>> [ 1.258770] acpi_get_devices+0xe4/0x137
>>>>> [ 1.258921] ? drm_core_init+0xc0/0xc0 [drm]
>>>>> [ 1.259108] detect_thinkpad_privacy_screen+0x5e/0xa8 [drm]
>>>>> [ 1.259337] drm_privacy_screen_lookup_init+0xe/0xe85 [drm]
>>>>>
>>>>> The reason is that acpi_walk_namespace expects acpi related stuff
>>>>> initialized but in fact it wouldn't when acpi is set to off. In this case
>>>>> we should honor acpi=off in detect_thinkpad_privacy_screen().
>>>>>
>>>>> Signed-off-by: Tong Zhang <ztong0001@gmail.com>
>>>>
>>>> Thank you for catching this and thank you for your patch. I was about to merge
>>>> this, but then I realized that this might not be the best way to fix this.
>>>>
>>>> A quick grep shows 10 acpi_get_devices() calls outside of drivers/acpi,
>>>> and at a first glance about half of those are missing an acpi_disabled
>>>> check. IMHO it would be better to simply add an acpi_disabled check to
>>>> acpi_get_devices() itself.
>>>>
>>>> Rafael, do you agree ?
>>>
>>> Yes, I do.
>>
>> Did you see my follow-up that that is not going to work because
>> acpi_get_devices() is an acpica function ?
>
> No, I didn't, but it is possible to add a wrapper doing the check
> around it and convert all of the users.

Yes I did think about that. Note that I've gone ahead and pushed
the fix which started this to drm-misc-fixes, to resolve the crash
for now.

If we add such a wrapper we can remove a bunch of acpi_disabled checks
from various callers.

> Alternatively, the ACPICA function can check acpi_gbl_root_node
> against NULL, like in the attached (untested) patch.

That is probably an even better idea, as that avoids the need
for a wrapper altogether. So I believe that that is the best
solution.

Regards,

Hans
On Thu, Jan 27, 2022 at 2:05 PM Hans de Goede <hdegoede@redhat.com> wrote:
>
> Hi,
>
> On 1/26/22 18:11, Rafael J. Wysocki wrote:
> > On Wed, Jan 26, 2022 at 5:41 PM Hans de Goede <hdegoede@redhat.com> wrote:
> >>
> >> Hi,
> >>
> >> On 1/26/22 16:54, Rafael J. Wysocki wrote:
> >>> On Wed, Jan 26, 2022 at 2:47 PM Hans de Goede <hdegoede@redhat.com> wrote:
> >>>>
> >>>> Hi All,
> >>>>
> >>>> On 1/23/22 10:10, Tong Zhang wrote:
> >>>>> when acpi=off is provided in bootarg, kernel crash with
> >>>>>
> >>>>> [ 1.252739] BUG: kernel NULL pointer dereference, address: 0000000000000018
> >>>>> [ 1.258308] Call Trace:
> >>>>> [ 1.258490] ? acpi_walk_namespace+0x147/0x147
> >>>>> [ 1.258770] acpi_get_devices+0xe4/0x137
> >>>>> [ 1.258921] ? drm_core_init+0xc0/0xc0 [drm]
> >>>>> [ 1.259108] detect_thinkpad_privacy_screen+0x5e/0xa8 [drm]
> >>>>> [ 1.259337] drm_privacy_screen_lookup_init+0xe/0xe85 [drm]
> >>>>>
> >>>>> The reason is that acpi_walk_namespace expects acpi related stuff
> >>>>> initialized but in fact it wouldn't when acpi is set to off. In this case
> >>>>> we should honor acpi=off in detect_thinkpad_privacy_screen().
> >>>>>
> >>>>> Signed-off-by: Tong Zhang <ztong0001@gmail.com>
> >>>>
> >>>> Thank you for catching this and thank you for your patch. I was about to merge
> >>>> this, but then I realized that this might not be the best way to fix this.
> >>>>
> >>>> A quick grep shows 10 acpi_get_devices() calls outside of drivers/acpi,
> >>>> and at a first glance about half of those are missing an acpi_disabled
> >>>> check. IMHO it would be better to simply add an acpi_disabled check to
> >>>> acpi_get_devices() itself.
> >>>>
> >>>> Rafael, do you agree ?
> >>>
> >>> Yes, I do.
> >>
> >> Did you see my follow-up that that is not going to work because
> >> acpi_get_devices() is an acpica function ?
> >
> > No, I didn't, but it is possible to add a wrapper doing the check
> > around it and convert all of the users.
>
> Yes I did think about that. Note that I've gone ahead and pushed
> the fix which started this to drm-misc-fixes, to resolve the crash
> for now.

OK

> If we add such a wrapper we can remove a bunch of acpi_disabled checks
> from various callers.
>
> > Alternatively, the ACPICA function can check acpi_gbl_root_node
> > against NULL, like in the attached (untested) patch.
>
> That is probably an even better idea, as that avoids the need
> for a wrapper altogether. So I believe that that is the best
> solution.

All right, let me cut an analogous patch for the upstream ACPICA, then.
Hi,

On 1/27/22 14:33, Rafael J. Wysocki wrote:
> On Thu, Jan 27, 2022 at 2:05 PM Hans de Goede <hdegoede@redhat.com> wrote:
>>
>> Hi,
>>
>> On 1/26/22 18:11, Rafael J. Wysocki wrote:
>>> On Wed, Jan 26, 2022 at 5:41 PM Hans de Goede <hdegoede@redhat.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> On 1/26/22 16:54, Rafael J. Wysocki wrote:
>>>>> On Wed, Jan 26, 2022 at 2:47 PM Hans de Goede <hdegoede@redhat.com> wrote:
>>>>>>
>>>>>> Hi All,
>>>>>>
>>>>>> On 1/23/22 10:10, Tong Zhang wrote:
>>>>>>> when acpi=off is provided in bootarg, kernel crash with
>>>>>>>
>>>>>>> [ 1.252739] BUG: kernel NULL pointer dereference, address: 0000000000000018
>>>>>>> [ 1.258308] Call Trace:
>>>>>>> [ 1.258490] ? acpi_walk_namespace+0x147/0x147
>>>>>>> [ 1.258770] acpi_get_devices+0xe4/0x137
>>>>>>> [ 1.258921] ? drm_core_init+0xc0/0xc0 [drm]
>>>>>>> [ 1.259108] detect_thinkpad_privacy_screen+0x5e/0xa8 [drm]
>>>>>>> [ 1.259337] drm_privacy_screen_lookup_init+0xe/0xe85 [drm]
>>>>>>>
>>>>>>> The reason is that acpi_walk_namespace expects acpi related stuff
>>>>>>> initialized but in fact it wouldn't when acpi is set to off. In this case
>>>>>>> we should honor acpi=off in detect_thinkpad_privacy_screen().
>>>>>>>
>>>>>>> Signed-off-by: Tong Zhang <ztong0001@gmail.com>
>>>>>>
>>>>>> Thank you for catching this and thank you for your patch. I was about to merge
>>>>>> this, but then I realized that this might not be the best way to fix this.
>>>>>>
>>>>>> A quick grep shows 10 acpi_get_devices() calls outside of drivers/acpi,
>>>>>> and at a first glance about half of those are missing an acpi_disabled
>>>>>> check. IMHO it would be better to simply add an acpi_disabled check to
>>>>>> acpi_get_devices() itself.
>>>>>>
>>>>>> Rafael, do you agree ?
>>>>>
>>>>> Yes, I do.
>>>>
>>>> Did you see my follow-up that that is not going to work because
>>>> acpi_get_devices() is an acpica function ?
>>>
>>> No, I didn't, but it is possible to add a wrapper doing the check
>>> around it and convert all of the users.
>>
>> Yes I did think about that. Note that I've gone ahead and pushed
>> the fix which started this to drm-misc-fixes, to resolve the crash
>> for now.
>
> OK
>
>> If we add such a wrapper we can remove a bunch of acpi_disabled checks
>> from various callers.
>>
>>> Alternatively, the ACPICA function can check acpi_gbl_root_node
>>> against NULL, like in the attached (untested) patch.
>>
>> That is probably an even better idea, as that avoids the need
>> for a wrapper altogether. So I believe that that is the best
>> solution.
>
> All right, let me cut an analogous patch for the upstream ACPICA, then.

Great, thank you.

I have added a note about checking for when this has found its way
into Linus' tree to my own TODO list, with the goal of doing a cleanup
series removing the then no longer needed acpi_disabled checks in a
bunch of places.

Regards,

Hans
diff --git a/drivers/gpu/drm/drm_privacy_screen_x86.c b/drivers/gpu/drm/drm_privacy_screen_x86.c
index a2cafb294ca6..e7aa74ad0b24 100644
--- a/drivers/gpu/drm/drm_privacy_screen_x86.c
+++ b/drivers/gpu/drm/drm_privacy_screen_x86.c
@@ -33,6 +33,9 @@ static bool __init detect_thinkpad_privacy_screen(void)
 	unsigned long long output;
 	acpi_status status;
 
+	if (acpi_disabled)
+		return false;
+
 	/* Get embedded-controller handle */
 	status = acpi_get_devices("PNP0C09", acpi_set_handle, NULL, &ec_handle);
 	if (ACPI_FAILURE(status) || !ec_handle)
when acpi=off is provided in bootarg, kernel crash with

[ 1.252739] BUG: kernel NULL pointer dereference, address: 0000000000000018
[ 1.258308] Call Trace:
[ 1.258490] ? acpi_walk_namespace+0x147/0x147
[ 1.258770] acpi_get_devices+0xe4/0x137
[ 1.258921] ? drm_core_init+0xc0/0xc0 [drm]
[ 1.259108] detect_thinkpad_privacy_screen+0x5e/0xa8 [drm]
[ 1.259337] drm_privacy_screen_lookup_init+0xe/0xe85 [drm]

The reason is that acpi_walk_namespace expects acpi related stuff
initialized but in fact it wouldn't when acpi is set to off. In this case
we should honor acpi=off in detect_thinkpad_privacy_screen().

Signed-off-by: Tong Zhang <ztong0001@gmail.com>
---
v2: fix typo in previous commit -- my keyboard is eating letters

 drivers/gpu/drm/drm_privacy_screen_x86.c | 3 +++
 1 file changed, 3 insertions(+)
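The faulting address in the trace, 0x18, is small and non-zero, which is the usual signature of reading a struct member through a NULL base pointer: the address is just the member's offset. The sketch below illustrates that arithmetic and the caller-side guard from the patch in userspace. The struct layout is invented so that one member lands at offset 0x18; the real layout of the object being walked is not shown in this thread, and `mock_node`, `mock_child_address()` and `mock_detect()` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, chosen only so that `child` sits at offset 0x18. */
struct mock_node {
	uint64_t field_a;
	uint64_t field_b;
	uint64_t field_c;
	uint64_t child;
};

/* Address that a load of `child` would touch for a given base pointer value.
 * With base == 0 (a NULL pointer) this is simply the member offset. */
static uintptr_t mock_child_address(uintptr_t base)
{
	return base + offsetof(struct mock_node, child);
}

/* Caller-side guard in the style of the patch: return before any walk is
 * attempted when ACPI is disabled, so `root` is never dereferenced. */
static bool mock_detect(bool acpi_disabled, const struct mock_node *root)
{
	if (acpi_disabled)
		return false;
	return root->child != 0;
}

/* Example root node as it might look once a namespace exists. */
static const struct mock_node mock_root = { 0, 0, 0, 1 };
```

Here `mock_child_address(0)` evaluates to 0x18, and `mock_detect(true, NULL)` returns false without touching the pointer, which mirrors the three-line guard added to detect_thinkpad_privacy_screen().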