Message ID | 20231128-kms-hdmi-connector-state-v4-5-c7602158306e@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Series | drm/connector: Create HDMI Connector infrastructure |
On Tue, 28 Nov 2023, Maxime Ripard <mripard@kernel.org> wrote:
> All the drm_connector_init variants take at least a pointer to the
> device, connector and hooks implementation.
>
> However, none of them check their value before dereferencing those
> pointers which can lead to a NULL-pointer dereference if the author
> isn't careful.

Arguably oopsing on the spot is preferable when this can't be caused by
user input. It's always a mistake that should be caught early during
development.

Not everyone checks the return value of drm_connector_init and friends,
so those cases will lead to more mysterious bugs later. And probably
oopses as well.

BR,
Jani.

>
> Let's test those pointers instead and error out if any is NULL.
>
> Signed-off-by: Maxime Ripard <mripard@kernel.org>
> ---
>  drivers/gpu/drm/drm_connector.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c
> index b0516505f7ae..2f60755dccdd 100644
> --- a/drivers/gpu/drm/drm_connector.c
> +++ b/drivers/gpu/drm/drm_connector.c
> @@ -229,6 +229,9 @@ static int __drm_connector_init(struct drm_device *dev,
>  	struct ida *connector_ida =
>  		&drm_connector_enum_list[connector_type].ida;
>
> +	if (!dev || !connector || !funcs)
> +		return -EINVAL;
> +
>  	WARN_ON(drm_drv_uses_atomic_modeset(dev) &&
>  		(!funcs->atomic_destroy_state ||
>  		 !funcs->atomic_duplicate_state));
Hi Jani,

On Tue, Nov 28, 2023 at 02:54:02PM +0200, Jani Nikula wrote:
> On Tue, 28 Nov 2023, Maxime Ripard <mripard@kernel.org> wrote:
> > All the drm_connector_init variants take at least a pointer to the
> > device, connector and hooks implementation.
> >
> > However, none of them check their value before dereferencing those
> > pointers which can lead to a NULL-pointer dereference if the author
> > isn't careful.
>
> Arguably oopsing on the spot is preferable when this can't be caused by
> user input. It's always a mistake that should be caught early during
> development.
>
> Not everyone checks the return value of drm_connector_init and friends,
> so those cases will lead to more mysterious bugs later. And probably
> oopses as well.

So maybe we can do both then, with something like

	if (WARN_ON(!dev))
		return -EINVAL;

	if (drm_WARN_ON(dev, !connector || !funcs))
		return -EINVAL;

I'd still like to check for this, so we can have proper testing, and we
already check for those pointers in some places (like funcs in
drm_connector_init), so if we don't cover everything we're inconsistent.

Maxime
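For readers who want to try the "warn and return an error" pattern outside
the kernel, here is a minimal userspace sketch. It is not the DRM code: the
WARN_ON() below is a simplified stand-in that only prints the condition and
evaluates to it (the kernel macro also dumps a backtrace), and every demo_*
type and function is a hypothetical name made up for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Userspace stand-in for the kernel's WARN_ON(): report the condition and
 * evaluate to whether it was true. */
#define WARN_ON(cond) warn_on_impl(!!(cond), #cond, __FILE__, __LINE__)

static int warn_on_impl(int cond, const char *expr, const char *file, int line)
{
    if (cond)
        fprintf(stderr, "WARNING: %s at %s:%d\n", expr, file, line);
    return cond;
}

/* Hypothetical, cut-down types; not the real DRM structures. */
struct demo_device { int id; };
struct demo_connector_funcs { void (*destroy)(void *connector); };
struct demo_connector {
    struct demo_device *dev;
    const struct demo_connector_funcs *funcs;
};

/* Warn loudly and return -EINVAL instead of dereferencing a NULL pointer. */
int demo_connector_init(struct demo_device *dev,
                        struct demo_connector *connector,
                        const struct demo_connector_funcs *funcs)
{
    if (WARN_ON(!dev))
        return -EINVAL;

    if (WARN_ON(!connector || !funcs))
        return -EINVAL;

    connector->dev = dev;
    connector->funcs = funcs;
    return 0;
}
```

Calling demo_connector_init() with any NULL argument prints a warning on
stderr and returns -EINVAL, which is the property a unit test can assert on
without taking the whole process down.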
On Tue, Nov 28, 2023 at 02:29:40PM +0100, Maxime Ripard wrote:
> Hi Jani,
>
> On Tue, Nov 28, 2023 at 02:54:02PM +0200, Jani Nikula wrote:
> > On Tue, 28 Nov 2023, Maxime Ripard <mripard@kernel.org> wrote:
> > > All the drm_connector_init variants take at least a pointer to the
> > > device, connector and hooks implementation.
> > >
> > > However, none of them check their value before dereferencing those
> > > pointers which can lead to a NULL-pointer dereference if the author
> > > isn't careful.
> >
> > Arguably oopsing on the spot is preferable when this can't be caused by
> > user input. It's always a mistake that should be caught early during
> > development.
> >
> > Not everyone checks the return value of drm_connector_init and friends,
> > so those cases will lead to more mysterious bugs later. And probably
> > oopses as well.
>
> So maybe we can do both then, with something like
>
> 	if (WARN_ON(!dev))
> 		return -EINVAL;
>
> 	if (drm_WARN_ON(dev, !connector || !funcs))
> 		return -EINVAL;
>
> I'd still like to check for this, so we can have proper testing, and we
> already check for those pointers in some places (like funcs in
> drm_connector_init), so if we don't cover everything we're inconsistent.

People will invariably cargo-cult this kind of stuff absolutely
everywhere and then all your functions will have tons of dead
code to check their arguments.
I'd prefer not to go there usually.

Should we perhaps start to use the (arguably hideous)
- void f(struct foo *bar)
+ void f(struct foo bar[static 1])
syntax to tell the compiler we don't accept NULL pointers?

Hmm. Apparently that has the same problem as using any
other kind of array syntax in the prototype. That is,
the compiler demands to know the definition of 'struct foo'
even though we're passing in effectively a pointer. Sigh.
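As a small illustration of the C99 syntax mentioned above, the sketch below
declares a parameter with `[static 1]`. The struct and function names are
made up for the example. Note that the full definition of the struct has to
be visible before the prototype, which is exactly the drawback described in
the message; compilers may warn when a literal NULL is passed, but nothing
is checked at run time.

```c
#include <stddef.h>

/* The array-parameter form needs the complete type, so the definition of
 * struct foo must be visible here; a forward declaration is not enough. */
struct foo { int value; };

/* C99: the caller promises that bar points to at least one struct foo,
 * which rules out NULL. This is a promise to the compiler, not a
 * run-time check. */
int foo_value(struct foo bar[static 1])
{
    return bar->value;
}
```

Inside the function, bar is still an ordinary pointer, so existing code
that uses `bar->value` does not need to change.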
Hi Ville,

On Tue, Nov 28, 2023 at 03:49:08PM +0200, Ville Syrjälä wrote:
> On Tue, Nov 28, 2023 at 02:29:40PM +0100, Maxime Ripard wrote:
> > On Tue, Nov 28, 2023 at 02:54:02PM +0200, Jani Nikula wrote:
> > > On Tue, 28 Nov 2023, Maxime Ripard <mripard@kernel.org> wrote:
> > > > All the drm_connector_init variants take at least a pointer to the
> > > > device, connector and hooks implementation.
> > > >
> > > > However, none of them check their value before dereferencing those
> > > > pointers which can lead to a NULL-pointer dereference if the author
> > > > isn't careful.
> > >
> > > Arguably oopsing on the spot is preferable when this can't be caused by
> > > user input. It's always a mistake that should be caught early during
> > > development.
> > >
> > > Not everyone checks the return value of drm_connector_init and friends,
> > > so those cases will lead to more mysterious bugs later. And probably
> > > oopses as well.
> >
> > So maybe we can do both then, with something like
> >
> > 	if (WARN_ON(!dev))
> > 		return -EINVAL;
> >
> > 	if (drm_WARN_ON(dev, !connector || !funcs))
> > 		return -EINVAL;
> >
> > I'd still like to check for this, so we can have proper testing, and we
> > already check for those pointers in some places (like funcs in
> > drm_connector_init), so if we don't cover everything we're inconsistent.
>
> People will invariably cargo-cult this kind of stuff absolutely
> everywhere and then all your functions will have tons of dead
> code to check their arguments.

And that's a bad thing because... ?

Also, are you really saying that checking that your arguments make sense
is cargo-cult?

We're already doing it in some parts of KMS, so we have to be
consistent, and the answer to "most drivers don't check the error"
cannot be "let's just give up on error checking then".

> I'd prefer not to go there usually.
>
> Should we perhaps start to use the (arguably hideous)
> - void f(struct foo *bar)
> + void f(struct foo bar[static 1])
> syntax to tell the compiler we don't accept NULL pointers?
>
> Hmm. Apparently that has the same problem as using any
> other kind of array syntax in the prototype. That is,
> the compiler demands to know the definition of 'struct foo'
> even though we're passing in effectively a pointer. Sigh.

Honestly, I don't care as long as it's something we can unit-test to
make sure we make it consistent. We can't unit test a complete kernel
crash.
On Wed, Nov 29, 2023 at 10:11:26AM +0100, Maxime Ripard wrote: > Hi Ville, > > On Tue, Nov 28, 2023 at 03:49:08PM +0200, Ville Syrjälä wrote: > > On Tue, Nov 28, 2023 at 02:29:40PM +0100, Maxime Ripard wrote: > > > On Tue, Nov 28, 2023 at 02:54:02PM +0200, Jani Nikula wrote: > > > > On Tue, 28 Nov 2023, Maxime Ripard <mripard@kernel.org> wrote: > > > > > All the drm_connector_init variants take at least a pointer to the > > > > > device, connector and hooks implementation. > > > > > > > > > > However, none of them check their value before dereferencing those > > > > > pointers which can lead to a NULL-pointer dereference if the author > > > > > isn't careful. > > > > > > > > Arguably oopsing on the spot is preferrable when this can't be caused by > > > > user input. It's always a mistake that should be caught early during > > > > development. > > > > > > > > Not everyone checks the return value of drm_connector_init and friends, > > > > so those cases will lead to more mysterious bugs later. And probably > > > > oopses as well. > > > > > > So maybe we can do both then, with something like > > > > > > if (WARN_ON(!dev)) > > > return -EINVAL > > > > > > if (drm_WARN_ON(dev, !connector || !funcs)) > > > return -EINVAL; > > > > > > I'd still like to check for this, so we can have proper testing, and we > > > already check for those pointers in some places (like funcs in > > > drm_connector_init), so if we don't cover everything we're inconsistent. > > > > People will invariably cargo-cult this kind of stuff absolutely > > everywhere and then all your functions will have tons of dead > > code to check their arguments. > > And that's a bad thing because... ? > > Also, are you really saying that checking that your arguments make sense > is cargo-cult? > > We're already doing it in some parts of KMS, so we have to be > consistent, and the answer to "most drivers don't check the error" > cannot be "let's just give on error checking then". 
>
> > I'd prefer not to go there usually.
> >
> > Should we perhaps start to use the (arguably hideous)
> > - void f(struct foo *bar)
> > + void f(struct foo bar[static 1])
> > syntax to tell the compiler we don't accept NULL pointers?
> >
> > Hmm. Apparently that has the same problem as using any
> > other kind of array syntax in the prototype. That is,
> > the compiler demands to know the definition of 'struct foo'
> > even though we're passing in effectively a pointer. Sigh.
>
> Honestly, I don't care as long as it's something we can unit-test to
> make sure we make it consistent. We can't unit test a complete kernel
> crash.

Why do you want to put utterly broken code into a unit test?
On Wed, 29 Nov 2023, Maxime Ripard <mripard@kernel.org> wrote:
> Hi Ville,
>
> On Tue, Nov 28, 2023 at 03:49:08PM +0200, Ville Syrjälä wrote:
>> On Tue, Nov 28, 2023 at 02:29:40PM +0100, Maxime Ripard wrote:
>> > On Tue, Nov 28, 2023 at 02:54:02PM +0200, Jani Nikula wrote:
>> > > On Tue, 28 Nov 2023, Maxime Ripard <mripard@kernel.org> wrote:
>> > > > All the drm_connector_init variants take at least a pointer to the
>> > > > device, connector and hooks implementation.
>> > > >
>> > > > However, none of them check their value before dereferencing those
>> > > > pointers which can lead to a NULL-pointer dereference if the author
>> > > > isn't careful.
>> > >
>> > > Arguably oopsing on the spot is preferable when this can't be caused by
>> > > user input. It's always a mistake that should be caught early during
>> > > development.
>> > >
>> > > Not everyone checks the return value of drm_connector_init and friends,
>> > > so those cases will lead to more mysterious bugs later. And probably
>> > > oopses as well.
>> >
>> > So maybe we can do both then, with something like
>> >
>> > 	if (WARN_ON(!dev))
>> > 		return -EINVAL;
>> >
>> > 	if (drm_WARN_ON(dev, !connector || !funcs))
>> > 		return -EINVAL;
>> >
>> > I'd still like to check for this, so we can have proper testing, and we
>> > already check for those pointers in some places (like funcs in
>> > drm_connector_init), so if we don't cover everything we're inconsistent.
>>
>> People will invariably cargo-cult this kind of stuff absolutely
>> everywhere and then all your functions will have tons of dead
>> code to check their arguments.
>
> And that's a bad thing because... ?
>
> Also, are you really saying that checking that your arguments make sense
> is cargo-cult?

It's a powerful thing to be able to assume a NULL argument is always a
fatal programming error on the caller's side, and should oops and get
caught immediately. It's an assertion.

We're not talking about user input or anything like that here.
If you start checking for things that can't happen, and return errors
for them, you start gracefully handling things that don't have anything
graceful about them.

Having such checks in place trains people to think they *may* happen.

While it should fail fast and loud at the developer's first smoke test,
and get fixed then and there.

BR,
Jani.

>
> We're already doing it in some parts of KMS, so we have to be
> consistent, and the answer to "most drivers don't check the error"
> cannot be "let's just give up on error checking then".
>
>> I'd prefer not to go there usually.
>>
>> Should we perhaps start to use the (arguably hideous)
>> - void f(struct foo *bar)
>> + void f(struct foo bar[static 1])
>> syntax to tell the compiler we don't accept NULL pointers?
>>
>> Hmm. Apparently that has the same problem as using any
>> other kind of array syntax in the prototype. That is,
>> the compiler demands to know the definition of 'struct foo'
>> even though we're passing in effectively a pointer. Sigh.
>
> Honestly, I don't care as long as it's something we can unit-test to
> make sure we make it consistent. We can't unit test a complete kernel
> crash.
On Tue, 28 Nov 2023 15:49:08 +0200
Ville Syrjälä <ville.syrjala@linux.intel.com> wrote:

> Should we perhaps start to use the (arguably hideous)
> - void f(struct foo *bar)
> + void f(struct foo bar[static 1])
> syntax to tell the compiler we don't accept NULL pointers?
>
> Hmm. Apparently that has the same problem as using any
> other kind of array syntax in the prototype. That is,
> the compiler demands to know the definition of 'struct foo'
> even though we're passing in effectively a pointer. Sigh.

__attribute__((nonnull)) ?

Thanks,
pq
On Wed, Nov 29, 2023 at 12:12:59PM +0200, Pekka Paalanen wrote:
> On Tue, 28 Nov 2023 15:49:08 +0200
> Ville Syrjälä <ville.syrjala@linux.intel.com> wrote:
>
> > Should we perhaps start to use the (arguably hideous)
> > - void f(struct foo *bar)
> > + void f(struct foo bar[static 1])
> > syntax to tell the compiler we don't accept NULL pointers?
> >
> > Hmm. Apparently that has the same problem as using any
> > other kind of array syntax in the prototype. That is,
> > the compiler demands to know the definition of 'struct foo'
> > even though we're passing in effectively a pointer. Sigh.
>
>
> __attribute__((nonnull)) ?

I guess that would work, though the syntax is horrible when you
need to flag specific arguments.
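To make the "flag specific arguments" point concrete, here is a short
sketch of the GCC/Clang attribute with explicit argument positions. The
function and its parameters are hypothetical. Unlike the `[static 1]` form,
a forward declaration of the struct is enough. One caveat worth knowing:
the compiler is allowed to optimize on the assumption that the flagged
arguments are never NULL.

```c
#include <stddef.h>

struct foo; /* a forward declaration is enough for this form */

/* GCC/Clang: arguments 1 and 3 (counted from 1) must not be NULL;
 * argument 2 is left unflagged and may be NULL. */
__attribute__((nonnull(1, 3)))
int copy_value(int *dst, const struct foo *optional_ctx, const int *src);

int copy_value(int *dst, const struct foo *optional_ctx, const int *src)
{
    (void)optional_ctx; /* unused in this sketch */
    *dst = *src;
    return 0;
}
```

The positional list is what makes the syntax awkward: adding or reordering
parameters means the indices in the attribute have to be kept in sync by
hand.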
On Wed, Nov 29, 2023 at 11:38:42AM +0200, Jani Nikula wrote: > On Wed, 29 Nov 2023, Maxime Ripard <mripard@kernel.org> wrote: > > Hi Ville, > > > > On Tue, Nov 28, 2023 at 03:49:08PM +0200, Ville Syrjälä wrote: > >> On Tue, Nov 28, 2023 at 02:29:40PM +0100, Maxime Ripard wrote: > >> > On Tue, Nov 28, 2023 at 02:54:02PM +0200, Jani Nikula wrote: > >> > > On Tue, 28 Nov 2023, Maxime Ripard <mripard@kernel.org> wrote: > >> > > > All the drm_connector_init variants take at least a pointer to the > >> > > > device, connector and hooks implementation. > >> > > > > >> > > > However, none of them check their value before dereferencing those > >> > > > pointers which can lead to a NULL-pointer dereference if the author > >> > > > isn't careful. > >> > > > >> > > Arguably oopsing on the spot is preferrable when this can't be caused by > >> > > user input. It's always a mistake that should be caught early during > >> > > development. > >> > > > >> > > Not everyone checks the return value of drm_connector_init and friends, > >> > > so those cases will lead to more mysterious bugs later. And probably > >> > > oopses as well. > >> > > >> > So maybe we can do both then, with something like > >> > > >> > if (WARN_ON(!dev)) > >> > return -EINVAL > >> > > >> > if (drm_WARN_ON(dev, !connector || !funcs)) > >> > return -EINVAL; > >> > > >> > I'd still like to check for this, so we can have proper testing, and we > >> > already check for those pointers in some places (like funcs in > >> > drm_connector_init), so if we don't cover everything we're inconsistent. > >> > >> People will invariably cargo-cult this kind of stuff absolutely > >> everywhere and then all your functions will have tons of dead > >> code to check their arguments. > > > > And that's a bad thing because... ? > > > > Also, are you really saying that checking that your arguments make sense > > is cargo-cult? 
>
> It's a powerful thing to be able to assume a NULL argument is always a
> fatal programming error on the caller's side, and should oops and get
> caught immediately. It's an assertion.

Yeah, but we're not really doing that either. We have no explicit
assertion anywhere. We take a pointer in, and just hope that it will be
dereferenced later on and that the kernel will crash. The pointer to the
functions especially is only dereferenced much later on.

And assertions might be powerful, but being able to notice errors and
debug them is too. A panic takes away basically any remote access to
debug. If you don't have a console, you're done.

> We're not talking about user input or anything like that here.
>
> If you start checking for things that can't happen, and return errors
> for them, you start gracefully handling things that don't have anything
> graceful about them.

But there's nothing graceful to do here: you just return from your probe
function that you couldn't probe and that's it. Just like you do when
you can't map your registers, or get your interrupt, or register into
any framework (including drm_dev_register that pretty much every driver
handles properly if it returns an error, without being graceful about
it).

> Having such checks in place trains people to think they *may* happen.

In most cases, kmalloc can't fail. We seem to have a very different
policy towards it.

> While it should fail fast and loud at the developer's first smoke test,
> and get fixed then and there.

Returning an error + a warning also qualifies for "fail fast and loud".
But keeps the system alive for someone to notice in any case.

Maxime
On Wed, 29 Nov 2023, Maxime Ripard <mripard@kernel.org> wrote: > On Wed, Nov 29, 2023 at 11:38:42AM +0200, Jani Nikula wrote: >> On Wed, 29 Nov 2023, Maxime Ripard <mripard@kernel.org> wrote: >> > Hi Ville, >> > >> > On Tue, Nov 28, 2023 at 03:49:08PM +0200, Ville Syrjälä wrote: >> >> On Tue, Nov 28, 2023 at 02:29:40PM +0100, Maxime Ripard wrote: >> >> > On Tue, Nov 28, 2023 at 02:54:02PM +0200, Jani Nikula wrote: >> >> > > On Tue, 28 Nov 2023, Maxime Ripard <mripard@kernel.org> wrote: >> >> > > > All the drm_connector_init variants take at least a pointer to the >> >> > > > device, connector and hooks implementation. >> >> > > > >> >> > > > However, none of them check their value before dereferencing those >> >> > > > pointers which can lead to a NULL-pointer dereference if the author >> >> > > > isn't careful. >> >> > > >> >> > > Arguably oopsing on the spot is preferrable when this can't be caused by >> >> > > user input. It's always a mistake that should be caught early during >> >> > > development. >> >> > > >> >> > > Not everyone checks the return value of drm_connector_init and friends, >> >> > > so those cases will lead to more mysterious bugs later. And probably >> >> > > oopses as well. >> >> > >> >> > So maybe we can do both then, with something like >> >> > >> >> > if (WARN_ON(!dev)) >> >> > return -EINVAL >> >> > >> >> > if (drm_WARN_ON(dev, !connector || !funcs)) >> >> > return -EINVAL; >> >> > >> >> > I'd still like to check for this, so we can have proper testing, and we >> >> > already check for those pointers in some places (like funcs in >> >> > drm_connector_init), so if we don't cover everything we're inconsistent. >> >> >> >> People will invariably cargo-cult this kind of stuff absolutely >> >> everywhere and then all your functions will have tons of dead >> >> code to check their arguments. >> > >> > And that's a bad thing because... ? >> > >> > Also, are you really saying that checking that your arguments make sense >> > is cargo-cult? 
>>
>> It's a powerful thing to be able to assume a NULL argument is always a
>> fatal programming error on the caller's side, and should oops and get
>> caught immediately. It's an assertion.
>
> Yeah, but we're not really doing that either. We have no explicit
> assertion anywhere. We take a pointer in, and just hope that it will be
> dereferenced later on and that the kernel will crash. The pointer to the
> functions especially is only dereferenced much later on.
>
> And assertions might be powerful, but being able to notice errors and
> debug them is too. A panic takes away basically any remote access to
> debug. If you don't have a console, you're done.
>
>> We're not talking about user input or anything like that here.
>>
>> If you start checking for things that can't happen, and return errors
>> for them, you start gracefully handling things that don't have anything
>> graceful about them.
>
> But there's nothing graceful to do here: you just return from your probe
> function that you couldn't probe and that's it. Just like you do when
> you can't map your registers, or get your interrupt, or register into
> any framework (including drm_dev_register that pretty much every driver
> handles properly if it returns an error, without being graceful about
> it).

Those are all dynamic things that can fail.

Quite different from passing NULL dev, connector, or funcs to
drm_connector_init() and friends.

I think it's wrong to set the example that everything needs to be
checked, everything needs to return an error, every call needs to check
for error return, all the time, everywhere. People absolutely will cargo
cult that, and that's what Ville is referring to.

If you pass NULL dev, connector, or funcs to drm_connector_init() I
think you absolutely deserve to get an oops.

For dev, you could not possibly have reached the function with NULL
dev. (And __drm_connector_init() has dev->mode_config before the check,
so you'll get a static analyzer warning about dereference before the
check.) If you have NULL connector, you didn't check for allocation
failure earlier. If you have NULL funcs, you just passed NULL, because
it's generally supposed to be a pointer to a static const struct.

>> Having such checks in place trains people to think they *may* happen.
>
> In most cases, kmalloc can't fail. We seem to have a very different
> policy towards it.

Again, dynamic in nature and can fail.

>> While it should fail fast and loud at the developer's first smoke test,
>> and get fixed then and there.
>
> Returning an error + a warning also qualifies for "fail fast and loud".
> But keeps the system alive for someone to notice in any case.

But where do you draw the line?

If we keep adding these checks to things that actually can't happen,
we teach developers we need to check for impossible things. And we
teach them not to trust anything.

I scroll down the file and reach drm_connector_attach_edid_property().
Should we NULL check connector? Should we change the function to int
and return a value? Should the caller check the value? Then there's
drm_connector_attach_encoder(). And
drm_connector_has_possible_encoder(). And so on and so forth.

Where do you draw the line?

BR,
Jani.
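The "dereference before the check" remark can be shown with a tiny
userspace sketch. The types and the function below are hypothetical
stand-ins, not the DRM structures. The point is only the ordering: a NULL
check is meaningful to a static analyzer (and to the reader) only if it
comes before the first expression that goes through the pointer.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, cut-down types; not the real DRM structures. */
struct demo_mode_config { int num_connector; };
struct demo_device { struct demo_mode_config mode_config; };

int demo_register(struct demo_device *dev)
{
    struct demo_mode_config *config;

    /* Check first: initializing config from dev->mode_config in the
     * declaration above would use dev before it is validated. */
    if (!dev)
        return -EINVAL;

    config = &dev->mode_config;
    config->num_connector++;
    return 0;
}
```

If the assignment to config were moved into its declaration, the check
would still compile, but it would come after the pointer had already been
used, which is the pattern analyzers flag.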
On Wed, Nov 29, 2023 at 01:40:38PM +0200, Jani Nikula wrote: > On Wed, 29 Nov 2023, Maxime Ripard <mripard@kernel.org> wrote: > > On Wed, Nov 29, 2023 at 11:38:42AM +0200, Jani Nikula wrote: > >> On Wed, 29 Nov 2023, Maxime Ripard <mripard@kernel.org> wrote: > >> > Hi Ville, > >> > > >> > On Tue, Nov 28, 2023 at 03:49:08PM +0200, Ville Syrjälä wrote: > >> >> On Tue, Nov 28, 2023 at 02:29:40PM +0100, Maxime Ripard wrote: > >> >> > On Tue, Nov 28, 2023 at 02:54:02PM +0200, Jani Nikula wrote: > >> >> > > On Tue, 28 Nov 2023, Maxime Ripard <mripard@kernel.org> wrote: > >> >> > > > All the drm_connector_init variants take at least a pointer to the > >> >> > > > device, connector and hooks implementation. > >> >> > > > > >> >> > > > However, none of them check their value before dereferencing those > >> >> > > > pointers which can lead to a NULL-pointer dereference if the author > >> >> > > > isn't careful. > >> >> > > > >> >> > > Arguably oopsing on the spot is preferrable when this can't be caused by > >> >> > > user input. It's always a mistake that should be caught early during > >> >> > > development. > >> >> > > > >> >> > > Not everyone checks the return value of drm_connector_init and friends, > >> >> > > so those cases will lead to more mysterious bugs later. And probably > >> >> > > oopses as well. > >> >> > > >> >> > So maybe we can do both then, with something like > >> >> > > >> >> > if (WARN_ON(!dev)) > >> >> > return -EINVAL > >> >> > > >> >> > if (drm_WARN_ON(dev, !connector || !funcs)) > >> >> > return -EINVAL; > >> >> > > >> >> > I'd still like to check for this, so we can have proper testing, and we > >> >> > already check for those pointers in some places (like funcs in > >> >> > drm_connector_init), so if we don't cover everything we're inconsistent. > >> >> > >> >> People will invariably cargo-cult this kind of stuff absolutely > >> >> everywhere and then all your functions will have tons of dead > >> >> code to check their arguments. 
> >> > > >> > And that's a bad thing because... ? > >> > > >> > Also, are you really saying that checking that your arguments make sense > >> > is cargo-cult? > >> > >> It's a powerful thing to be able to assume a NULL argument is always a > >> fatal programming error on the caller's side, and should oops and get > >> caught immediately. It's an assertion. > > > > Yeah, but we're not really doing that either. We have no explicit > > assertion anywhere. We take a pointer in, and just hope that it will be > > dereferenced later on and that the kernel will crash. The pointer to the > > functions especially is only deferenced very later on. > > > > And assertions might be powerful, but being able to notice errors and > > debug them is too. A panic takes away basically any remote access to > > debug. If you don't have a console, you're done. > > > >> We're not talking about user input or anything like that here. > >> > >> If you start checking for things that can't happen, and return errors > >> for them, you start gracefully handling things that don't have anything > >> graceful about them. > > > > But there's nothing graceful to do here: you just return from your probe > > function that you couldn't probe and that's it. Just like you do when > > you can't map your registers, or get your interrupt, or register into > > any framework (including drm_dev_register that pretty much every driver > > handles properly if it returns an error, without being graceful about > > it). > > Those are all dynamic things that can fail. > > Quite different from passing NULL dev, connector, or funcs to > drm_connector_init() and friends. > > I think it's wrong to set the example that everything needs to be > checked, everything needs to return an error, every call needs to check > for error return, all the time, everywhere. People absolutely will cargo > cult that, and that's what Ville is referring to. 
>
> If you pass NULL dev, connector, or funcs to drm_connector_init() I
> think you absolutely deserve to get an oops.
>
> For dev, you could not possibly have reached the function with NULL
> dev. (And __drm_connector_init() has dev->mode_config before the check,
> so you'll get a static analyzer warning about dereference before the
> check.) If you have NULL connector, you didn't check for allocation
> failure earlier. If you have NULL funcs, you just passed NULL, because
> it's generally supposed to be a pointer to a static const struct.
>
> >> Having such checks in place trains people to think they *may* happen.
> >
> > In most cases, kmalloc can't fail. We seem to have a very different
> > policy towards it.
>
> Again, dynamic in nature and can fail.
>
> >> While it should fail fast and loud at the developer's first smoke test,
> >> and get fixed then and there.
> >
> > Returning an error + a warning also qualifies for "fail fast and loud".
> > But keeps the system alive for someone to notice in any case.
>
> But where do you draw the line?

This also applies to static things then.
drm_connector_attach_scaling_mode_property() or
drm_mode_create_colorspace_property() (or plenty of others) will check
on the value of the supported scaling modes or colorspaces, even though
they are static.

It looks like we have that policy of "just assert and roll with it" for
pointers, but not for other static values passed to those initialization
functions.

> If we keep adding these checks to things that actually can't happen,
> we teach developers we need to check for impossible things. And we
> teach them not to trust anything.

Well, I certainly don't trust drivers to get things right.

> I scroll down the file and reach drm_connector_attach_edid_property().
> Should we NULL check connector? Should we change the function to int
> and return a value? Should the caller check the value? Then there's
> drm_connector_attach_encoder(). And
> drm_connector_has_possible_encoder(). And so on and so forth.
>
> Where do you draw the line?

If things can fail, we should expect the caller to handle the failure
somehow. The documentation of drm_connector_attach_encoder() states that
it can fail, so we should expect it. drm_connector_has_possible_encoder()
doesn't, so we can assume it can't fail.

If the function can fail but wasn't designed or documented as such, then
it's on the function. If it was but the caller didn't handle the error
case, then that's on the caller.

Maxime
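One way to encode the "documented as able to fail" contract in the
prototype itself is the GCC/Clang warn_unused_result attribute, which the
kernel's __must_check annotation is modelled on. This is a technique
brought in for illustration, not something proposed in the thread, and all
demo_* names below are hypothetical. The sketch contrasts a function that
can fail with a pure query that has nothing for the caller to check.

```c
#include <errno.h>
#include <stddef.h>

/* Modelled on the kernel's __must_check annotation. */
#define demo_must_check __attribute__((warn_unused_result))

/* Hypothetical, cut-down types; not the real DRM structures. */
struct demo_encoder { int id; };
struct demo_connector { unsigned int possible_encoders; };

/* Can fail: returns 0 or a negative errno, and callers get a compiler
 * warning if they drop the result. */
demo_must_check
int demo_attach_encoder(struct demo_connector *connector,
                        const struct demo_encoder *encoder)
{
    if (encoder->id < 0 || encoder->id >= 32)
        return -EINVAL;

    connector->possible_encoders |= 1u << encoder->id;
    return 0;
}

/* Pure query with no error path. Assumes encoder->id is in 0..31. */
int demo_has_possible_encoder(const struct demo_connector *connector,
                              const struct demo_encoder *encoder)
{
    return (connector->possible_encoders >> encoder->id) & 1u;
}
```

With this split, a caller that ignores the result of demo_attach_encoder()
is reported at build time, while demo_has_possible_encoder() stays free of
error handling.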
On Wed, Nov 29, 2023 at 11:18:24AM +0200, Ville Syrjälä wrote: > On Wed, Nov 29, 2023 at 10:11:26AM +0100, Maxime Ripard wrote: > > Hi Ville, > > > > On Tue, Nov 28, 2023 at 03:49:08PM +0200, Ville Syrjälä wrote: > > > On Tue, Nov 28, 2023 at 02:29:40PM +0100, Maxime Ripard wrote: > > > > On Tue, Nov 28, 2023 at 02:54:02PM +0200, Jani Nikula wrote: > > > > > On Tue, 28 Nov 2023, Maxime Ripard <mripard@kernel.org> wrote: > > > > > > All the drm_connector_init variants take at least a pointer to the > > > > > > device, connector and hooks implementation. > > > > > > > > > > > > However, none of them check their value before dereferencing those > > > > > > pointers which can lead to a NULL-pointer dereference if the author > > > > > > isn't careful. > > > > > > > > > > Arguably oopsing on the spot is preferrable when this can't be caused by > > > > > user input. It's always a mistake that should be caught early during > > > > > development. > > > > > > > > > > Not everyone checks the return value of drm_connector_init and friends, > > > > > so those cases will lead to more mysterious bugs later. And probably > > > > > oopses as well. > > > > > > > > So maybe we can do both then, with something like > > > > > > > > if (WARN_ON(!dev)) > > > > return -EINVAL > > > > > > > > if (drm_WARN_ON(dev, !connector || !funcs)) > > > > return -EINVAL; > > > > > > > > I'd still like to check for this, so we can have proper testing, and we > > > > already check for those pointers in some places (like funcs in > > > > drm_connector_init), so if we don't cover everything we're inconsistent. > > > > > > People will invariably cargo-cult this kind of stuff absolutely > > > everywhere and then all your functions will have tons of dead > > > code to check their arguments. > > > > And that's a bad thing because... ? > > > > Also, are you really saying that checking that your arguments make sense > > is cargo-cult? 
> >
> > We're already doing it in some parts of KMS, so we have to be
> > consistent, and the answer to "most drivers don't check the error"
> > cannot be "let's just give up on error checking then".
> >
> > > I'd prefer not to go there usually.
> > >
> > > Should we perhaps start to use the (arguably hideous)
> > > - void f(struct foo *bar)
> > > + void f(struct foo bar[static 1])
> > > syntax to tell the compiler we don't accept NULL pointers?
> > >
> > > Hmm. Apparently that has the same problem as using any
> > > other kind of array syntax in the prototype. That is,
> > > the compiler demands to know the definition of 'struct foo'
> > > even though we're passing in effectively a pointer. Sigh.
> >
> > Honestly, I don't care as long as it's something we can unit-test to
> > make sure we make it consistent. We can't unit test a complete kernel
> > crash.
>
> Why do you want to put utterly broken code into a unit test?

Utterly broken code happens. It probably shouldn't, but here we are.

Anyway, you mostly missed the consistent part.
The current state with it is:

  - planes:
    - drm_universal_plane_init warns if funcs->destroy is NULL
    - drm_universal_plane_alloc errors out if funcs is NULL
    - drmm_universal_plane_alloc warns and errors out if funcs or funcs->destroy are NULL

  - CRTC:
    - drm_crtc_init_with_planes warns if funcs->destroy is NULL
    - drmm_crtc_init_with_planes warns if funcs or funcs->destroy are NULL
    - drmm_crtc_alloc_with_planes warns and errors out if funcs or funcs->destroy are NULL

  - encoder:
    - drm_encoder_init warns if funcs->destroy is NULL
    - drmm_encoder_init warns and errors out if funcs or funcs->destroy are NULL
    - drmm_encoder_alloc warns and errors out if funcs or funcs->destroy are NULL

  - connectors:
    - drm_connector_init warns and errors out if funcs or funcs->destroy are NULL
    - drm_connector_init_with_ddc warns and errors out if funcs or funcs->destroy are NULL
    - drmm_connector_init warns and errors out if funcs or funcs->destroy are NULL

I think that just proves that your opinion is just not as clear cut as
you'd like it to be, and it's far from being the policy you claim it is.

Plus, we're not even remotely consistent there, and we're not
documenting that anywhere.

And we have plenty of other examples of static stuff being checked
because it just makes sense. All variants of drm_crtc_init_with_planes
will for example check that the correct plane type is associated to the
primary and cursor planes.

We should fix that.

Maxime
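As a rough illustration of the "warns and errors out" behaviour listed above, here is a minimal userspace sketch. Every name in it (mock_warn_on, MOCK_WARN_ON, mock_connector_init and the mock_* structs) is a hypothetical stand-in and not the kernel's WARN_ON() or the DRM API. The only assumption carried over from the kernel is that the warning helper evaluates to the truth value of its condition, as WARN_ON() does.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for WARN_ON(): log the failed condition, count it,
 * and return whether the condition was true. */
static int warn_count;

static int mock_warn_on(int cond, const char *expr)
{
	if (cond) {
		warn_count++;
		fprintf(stderr, "WARNING: %s\n", expr);
	}
	return cond;
}

#define MOCK_WARN_ON(cond) mock_warn_on(!!(cond), #cond)

/* Hypothetical, heavily simplified mock types. */
struct mock_device {
	const char *name;
};

struct mock_connector_funcs {
	void (*destroy)(void *connector);
};

struct mock_connector {
	struct mock_device *dev;
	const struct mock_connector_funcs *funcs;
};

/* Trivial destroy hook so that a valid funcs table can be built. */
static void mock_destroy(void *connector)
{
	(void)connector;
}

/* Warn and return -EINVAL on bad arguments instead of dereferencing them. */
static int mock_connector_init(struct mock_device *dev,
			       struct mock_connector *connector,
			       const struct mock_connector_funcs *funcs)
{
	if (MOCK_WARN_ON(!dev))
		return -EINVAL;

	if (MOCK_WARN_ON(!connector || !funcs || !funcs->destroy))
		return -EINVAL;

	connector->dev = dev;
	connector->funcs = funcs;
	return 0;
}
```

Because the function returns -EINVAL rather than dereferencing a NULL pointer, a test can pass bad arguments and assert on the return value, which is the unit-testing argument made in this message.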
On Fri, Dec 01, 2023 at 10:01:49AM +0100, Maxime Ripard wrote:
> On Wed, Nov 29, 2023 at 11:18:24AM +0200, Ville Syrjälä wrote:
> > On Wed, Nov 29, 2023 at 10:11:26AM +0100, Maxime Ripard wrote:
> > > Hi Ville,
> > >
> > > On Tue, Nov 28, 2023 at 03:49:08PM +0200, Ville Syrjälä wrote:
> > > > On Tue, Nov 28, 2023 at 02:29:40PM +0100, Maxime Ripard wrote:
> > > > > On Tue, Nov 28, 2023 at 02:54:02PM +0200, Jani Nikula wrote:
> > > > > > On Tue, 28 Nov 2023, Maxime Ripard <mripard@kernel.org> wrote:
> > > > > > > All the drm_connector_init variants take at least a pointer to the
> > > > > > > device, connector and hooks implementation.
> > > > > > >
> > > > > > > However, none of them check their value before dereferencing those
> > > > > > > pointers which can lead to a NULL-pointer dereference if the author
> > > > > > > isn't careful.
> > > > > >
> > > > > > Arguably oopsing on the spot is preferrable when this can't be caused by
> > > > > > user input. It's always a mistake that should be caught early during
> > > > > > development.
> > > > > >
> > > > > > Not everyone checks the return value of drm_connector_init and friends,
> > > > > > so those cases will lead to more mysterious bugs later. And probably
> > > > > > oopses as well.
> > > > >
> > > > > So maybe we can do both then, with something like
> > > > >
> > > > > if (WARN_ON(!dev))
> > > > >         return -EINVAL
> > > > >
> > > > > if (drm_WARN_ON(dev, !connector || !funcs))
> > > > >         return -EINVAL;
> > > > >
> > > > > I'd still like to check for this, so we can have proper testing, and we
> > > > > already check for those pointers in some places (like funcs in
> > > > > drm_connector_init), so if we don't cover everything we're inconsistent.
> > > >
> > > > People will invariably cargo-cult this kind of stuff absolutely
> > > > everywhere and then all your functions will have tons of dead
> > > > code to check their arguments.
> > >
> > > And that's a bad thing because... ?
> > >
> > > Also, are you really saying that checking that your arguments make sense
> > > is cargo-cult?
> > >
> > > We're already doing it in some parts of KMS, so we have to be
> > > consistent, and the answer to "most drivers don't check the error"
> > > cannot be "let's just give on error checking then".
> > >
> > > > I'd prefer not to go there usually.
> > > >
> > > > Should we perhaps start to use the (arguably hideous)
> > > >  - void f(struct foo *bar)
> > > >  + void f(struct foo bar[static 1])
> > > > syntax to tell the compiler we don't accept NULL pointers?
> > > >
> > > > Hmm. Apparently that has the same problem as using any
> > > > other kind of array syntax in the prototype. That is,
> > > > the compiler demands to know the definition of 'struct foo'
> > > > even though we're passing in effectively a pointer. Sigh.
> > >
> > > Honestly, I don't care as long as it's something we can unit-test to
> > > make sure we make it consistent. We can't unit test a complete kernel
> > > crash.
> >
> > Why do you want to put utterly broken code into a unit test?
>
> Utterly broken code happens. It probably shouldn't, but here we are.
>
> Anyway, you mostly missed the consistent part.
>
> The current state with it is:
>
>   - planes:
>     - drm_universal_plane_init warns if funcs->destroy NULL
>     - drm_universal_plane_alloc errors out if funcs is NULL
>     - drmm_universal_plane_alloc warns and errors out if funcs or funcs->destroy are NULL
>
>   - CRTC:
>     - drm_crtc_init_with_planes warns if funcs->destroy NULL
>     - drmm_crtc_init_with_planes warns if funcs or funcs->destroy are NULL
>     - drmm_crtc_alloc_with_planes warns and errors out if funcs or funcs->destroy are NULL
>
>   - encoder:
>     - drm_encoder_init warns if funcs->destroy NULL
>     - drmm_encoder_init warns and errors out if funcs or funcs->destroy are NULL
>     - drmm_encoder_alloc warns and errors out if funcs or funcs->destroy are NULL
>
>   - connectors:
>     - drm_connector_init warns and errors out if funcs or funcs->destroy are NULL
>     - drm_connector_init_with_ddc warns and errors out if funcs or funcs->destroy are NULL
>     - drmm_connector_init warns and errors out if funcs or funcs->destroy are NULL

Those are perhaps fine, as things may not oops immediately if you get
that wrong. But NULL checking things that will anyway oops immediately
(and there's no chance the thing will do anything useful if you prevent
the oops with a NULL check) doesn't make any sense to me. It just makes
you doubt the entire universe when you see code like that.

>
> I think that just proves that your opinion is just not as clear cut as
> you'd like it to be, and it's far from being the policy you claim it is.
>
> Plus, we're not even remotely consistent there, and we're not
> documenting that anywhere.
>
> And we have plenty of other examples of static stuff being checked
> because it just makes sense. All variants of drm_crtc_init_with_planes
> will for example check that the correct plane type is associated to the
> primary and cursor planes.
>
> We should fix that.

Ideally the compiler should catch all of this. But given we are
implementing this in C (and even C23 constexpr was neutered beyond
uselessness) that's not really possible in any nice way :(
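For reference, the `[static 1]` parameter syntax quoted earlier in this message can be sketched as follows. This is only an illustration: `struct foo` and `foo_value` are hypothetical names. It shows the limitation described in the quoted text, namely that the full definition of the struct has to be visible at the prototype, because an array parameter needs a complete element type.

```c
/* The complete definition is required here. With only a forward
 * declaration ('struct foo;'), the prototype below would be rejected,
 * since an array declarator needs a complete element type. */
struct foo {
	int value;
};

/* 'static 1' states that bar points to at least one struct foo, which
 * implies the caller must not pass NULL. The parameter still behaves as a
 * plain pointer inside the function. */
static int foo_value(const struct foo bar[static 1])
{
	return bar[0].value;
}
```

A caller passes a pointer exactly as before, for example `foo_value(&some_foo)`. Whether a compiler diagnoses a NULL argument depends on the compiler and on how much it can prove about the value.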
On Wed, Nov 29, 2023 at 12:25:37PM +0200, Ville Syrjälä wrote:
> On Wed, Nov 29, 2023 at 12:12:59PM +0200, Pekka Paalanen wrote:
> > On Tue, 28 Nov 2023 15:49:08 +0200
> > Ville Syrjälä <ville.syrjala@linux.intel.com> wrote:
> >
> > > Should we perhaps start to use the (arguably hideous)
> > >  - void f(struct foo *bar)
> > >  + void f(struct foo bar[static 1])
> > > syntax to tell the compiler we don't accept NULL pointers?
> > >
> > > Hmm. Apparently that has the same problem as using any
> > > other kind of array syntax in the prototype. That is,
> > > the compiler demands to know the definition of 'struct foo'
> > > even though we're passing in effectively a pointer. Sigh.
> >
> >
> > __attribute__((nonnull)) ?
>
> I guess that would work, though the syntax is horrible when
> you need to flag specific arguments.

I played around with this a bit (blindly cocci'd tons of
drm and i915 function declarations with the nonnull attribute)
and it's somewhat underwhelming unfortunately.

It will trip only if the compiler is 100% sure you're passing
in a NULL. There is no way to e.g. tell the compiler that a
function can return a NULL and thus anything coming from it
should be checked by the caller before passing it on to
something with the nonnull attribute. And I suppose error
pointers would also screw that idea over anyway.

Additionally the NULL device checks being done in
the drm_err/dbg macros trip this up left, right and center.
And hiding that check inside a function (instead of
having it in the macro) is also ruined by the fact that
we apparently pass different types of pointers to these
macros :( Generics could be used to sort out that type
mess I suppose, or the code that passes the wrong type
(DSI code at least) should just be changed to not do that.
But not sure if there's enough benefit to warrant the work.
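The limitation described above can be sketched with a small, self-contained example. The names `widget`, `widget_set_name` and `widget_lookup` are hypothetical and unrelated to DRM. The sketch assumes GCC or Clang, which both support `__attribute__((nonnull))`.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

struct widget {
	char name[16];
};

/* nonnull(1, 2) declares that arguments 1 and 2 must not be NULL. A literal
 * NULL passed here can be diagnosed at compile time, but a value that only
 * might be NULL at run time is generally not. */
static __attribute__((nonnull(1, 2)))
void widget_set_name(struct widget *w, const char *name)
{
	snprintf(w->name, sizeof(w->name), "%s", name);
}

/* Returns NULL when nothing matches. Nothing in the prototype tells the
 * compiler that, so callers still have to check the result themselves. */
static struct widget *widget_lookup(struct widget *table, size_t n,
				    const char *name)
{
	for (size_t i = 0; i < n; i++) {
		if (strcmp(table[i].name, name) == 0)
			return &table[i];
	}
	return NULL;
}
```

A call such as `widget_set_name(widget_lookup(table, n, "x"), "y")` will typically compile without any diagnostic even though the lookup can return NULL, so the caller still needs an explicit check before handing the pointer to a nonnull function.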
diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c
index b0516505f7ae..2f60755dccdd 100644
--- a/drivers/gpu/drm/drm_connector.c
+++ b/drivers/gpu/drm/drm_connector.c
@@ -229,6 +229,9 @@ static int __drm_connector_init(struct drm_device *dev,
 	struct ida *connector_ida =
 		&drm_connector_enum_list[connector_type].ida;
 
+	if (!dev || !connector || !funcs)
+		return -EINVAL;
+
 	WARN_ON(drm_drv_uses_atomic_modeset(dev) &&
 		(!funcs->atomic_destroy_state ||
 		 !funcs->atomic_duplicate_state));
All the drm_connector_init variants take at least a pointer to the
device, connector and hooks implementation.

However, none of them check their value before dereferencing those
pointers which can lead to a NULL-pointer dereference if the author
isn't careful.

Let's test those pointers instead and error out if any is NULL.

Signed-off-by: Maxime Ripard <mripard@kernel.org>
---
 drivers/gpu/drm/drm_connector.c | 3 +++
 1 file changed, 3 insertions(+)