Message ID | E1YW7sq-0003m6-89@rmk-PC.arm.linux.org.uk (mailing list archive) |
---|---|
State | RFC, archived |
Headers | show |
On 12 March 2015 at 19:31, Russell King <rmk+kernel@arm.linux.org.uk> wrote: > pm_genpd_remove_device() should only be called with valid and present > pm domain. There are circumstances where we may end up with something > that isn't a generic PM domain in dev->pm_domain (eg, vga_switcheroo > stuff.) Could the "vga_switcheroo" code deal with this instead? > > Acked-by: Rafael J. Wysocki <rjw@rjwysocki.net> > Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> > --- > drivers/base/power/domain.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c > index b3fbc21da2dc..11a1023fa64a 100644 > --- a/drivers/base/power/domain.c > +++ b/drivers/base/power/domain.c > @@ -1509,7 +1509,7 @@ int pm_genpd_remove_device(struct generic_pm_domain *genpd, > > dev_dbg(dev, "%s()\n", __func__); > > - if (IS_ERR_OR_NULL(genpd) || IS_ERR_OR_NULL(dev) > + if (!pm_genpd_present(genpd) || IS_ERR_OR_NULL(dev) The are two issues with this approach. 1. pm_genpd_present() is build only for CONFIG_PM_SLEEP set. 2. pm_genpd_present() totally ignores holding the mutex which protects the list of GPDs. > || IS_ERR_OR_NULL(dev->pm_domain) > || pd_to_genpd(dev->pm_domain) != genpd) > return -EINVAL; > -- > 1.8.3.1 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-pm" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html Kind regards Uffe -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Mar 13, 2015 at 09:56:02AM +0100, Ulf Hansson wrote: > On 12 March 2015 at 19:31, Russell King <rmk+kernel@arm.linux.org.uk> wrote: > > pm_genpd_remove_device() should only be called with valid and present > > pm domain. There are circumstances where we may end up with something > > that isn't a generic PM domain in dev->pm_domain (eg, vga_switcheroo > > stuff.) > > Could the "vga_switcheroo" code deal with this instead? How is there any possibility what so ever that vga_switcherroo could deal with this? The problem is if something which isn't a generic PM domain is registered in dev->pm_domain, pm_genpd_remove_device() will treat it as a generic PM domain, and try and de-reference it as if it were a generic PM domain. The problem is the generic PM domain code _assuming_ that whatever it finds in dev->pm_domain is always a generic PM domain. That is a broken assumption. > > > > > Acked-by: Rafael J. Wysocki <rjw@rjwysocki.net> > > Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> > > --- > > drivers/base/power/domain.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c > > index b3fbc21da2dc..11a1023fa64a 100644 > > --- a/drivers/base/power/domain.c > > +++ b/drivers/base/power/domain.c > > @@ -1509,7 +1509,7 @@ int pm_genpd_remove_device(struct generic_pm_domain *genpd, > > > > dev_dbg(dev, "%s()\n", __func__); > > > > - if (IS_ERR_OR_NULL(genpd) || IS_ERR_OR_NULL(dev) > > + if (!pm_genpd_present(genpd) || IS_ERR_OR_NULL(dev) > > The are two issues with this approach. > > 1. pm_genpd_present() is build only for CONFIG_PM_SLEEP set. > 2. pm_genpd_present() totally ignores holding the mutex which protects > the list of GPDs. Okay, I'll fix both of those by making it always available and by taking the appropriate lock, and removing the duplication in genpd_dev_pm_detach().
On Fri, Mar 13, 2015 at 10:20 AM, Russell King - ARM Linux <linux@arm.linux.org.uk> wrote: > On Fri, Mar 13, 2015 at 09:56:02AM +0100, Ulf Hansson wrote: >> On 12 March 2015 at 19:31, Russell King <rmk+kernel@arm.linux.org.uk> wrote: >> > pm_genpd_remove_device() should only be called with valid and present >> > pm domain. There are circumstances where we may end up with something >> > that isn't a generic PM domain in dev->pm_domain (eg, vga_switcheroo >> > stuff.) >> >> Could the "vga_switcheroo" code deal with this instead? > > How is there any possibility what so ever that vga_switcherroo could > deal with this? > > The problem is if something which isn't a generic PM domain is registered > in dev->pm_domain, pm_genpd_remove_device() will treat it as a generic PM > domain, and try and de-reference it as if it were a generic PM domain. > > The problem is the generic PM domain code _assuming_ that whatever it > finds in dev->pm_domain is always a generic PM domain. That is a broken > assumption. Indeed, currently it's not possible to know which derivative of dev_pm_domain is used. For genpd, you have to look it up in the genpd list. This also complicates adding debug code. In addition, it can be statically or dynamically allocated. Hence sometimes %ps can tell you the type, sometimes it can't. Should we add a type field to struct dev_pm_domain? Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Mar 13, 2015 at 09:56:02AM +0100, Ulf Hansson wrote: > On 12 March 2015 at 19:31, Russell King <rmk+kernel@arm.linux.org.uk> wrote: > > diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c > > index b3fbc21da2dc..11a1023fa64a 100644 > > --- a/drivers/base/power/domain.c > > +++ b/drivers/base/power/domain.c > > @@ -1509,7 +1509,7 @@ int pm_genpd_remove_device(struct generic_pm_domain *genpd, > > > > dev_dbg(dev, "%s()\n", __func__); > > > > - if (IS_ERR_OR_NULL(genpd) || IS_ERR_OR_NULL(dev) > > + if (!pm_genpd_present(genpd) || IS_ERR_OR_NULL(dev) > > The are two issues with this approach. > > 1. pm_genpd_present() is build only for CONFIG_PM_SLEEP set. > 2. pm_genpd_present() totally ignores holding the mutex which protects > the list of GPDs. Well, having looked more deeply at this, the PM domain code has more problems. void pm_genpd_syscore_poweroff(struct device *dev) { genpd_syscore_switch(dev, true); } EXPORT_SYMBOL_GPL(pm_genpd_syscore_poweroff); void pm_genpd_syscore_poweron(struct device *dev) { genpd_syscore_switch(dev, false); } EXPORT_SYMBOL_GPL(pm_genpd_syscore_poweron); both of these functions are externally callable without holding the gpd_list_lock mutex, which results in: static void genpd_syscore_switch(struct device *dev, bool suspend) { struct generic_pm_domain *genpd; genpd = dev_to_genpd(dev); if (!pm_genpd_present(genpd)) return; calling this function without holding the gpd_list_lock either. It also seems that there's a variety of ways which the generic PM domain code gets from a struct device to a generic_pm_domain - one is the above. Another is having this open coded: mutex_lock(&gpd_list_lock); list_for_each_entry(gpd, &gpd_list, gpd_list_node) { if (&gpd->domain == dev->pm_domain) { pd = gpd; break; } } mutex_unlock(&gpd_list_lock); This is only _not_ buggy because we only ever add to the list, never remove from it, so we can be sure that a successfully obtained "pd" will remain valid after the lock is dropped. Another is: genpd = dev_to_genpd(dev); if (IS_ERR(genpd)) ... when we're sure that dev->pm_domain points at a generic PM domain (in the various runtime PM functions.) It should be also noted that dev_to_genpd() is exported to the whole kernel - is that safe? In any case, I think dev_to_genpd() at least needs some commentry before it saying that it is only for use when we know for certain that the device has a generic_pm_domain attached, or the PM domain power is NULL or an error pointer. This is turning out to be a right can of worms...
Russell King <rmk+kernel@arm.linux.org.uk> writes: > pm_genpd_remove_device() should only be called with valid and present > pm domain. Terminology nit: the patch adds a check for a valid an present genpd, not pm domain (which I think of as dev->pm_domain.) > There are circumstances where we may end up with something > that isn't a generic PM domain in dev->pm_domain (eg, vga_switcheroo > stuff.) Wouldn't that particular case be caught by this check in pm_genpd_remove_device(): || pd_to_genpd(dev->pm_domain) != genpd Maybe I'm not quite following your description, but it looks like someone is trying to call pm_genpd_remove_device() on something that was not added with pm_genpd_add_device(). Where/how is that happening? Kevin -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Mar 13, 2015 at 09:33:14AM -0700, Kevin Hilman wrote: > Russell King <rmk+kernel@arm.linux.org.uk> writes: > > > pm_genpd_remove_device() should only be called with valid and present > > pm domain. > > Terminology nit: the patch adds a check for a valid an present genpd, > not pm domain (which I think of as dev->pm_domain.) Right, people are sloppy with terminology, it's my turn to be equally sloppy :) I'll fix it eventually (but I've just sent the next round of patches.) Can you re-comment on the new set so it doesn't get lost please? I'm not about to immediately go through another update round for such a trivial change (which alone takes about 5 minutes due to the number of branches and patches on top of this stuff). > > There are circumstances where we may end up with something > > that isn't a generic PM domain in dev->pm_domain (eg, vga_switcheroo > > stuff.) > > Wouldn't that particular case be caught by this check in pm_genpd_remove_device(): > > || pd_to_genpd(dev->pm_domain) != genpd Well, let's look at what pd_to_genpd() does: static inline struct generic_pm_domain *pd_to_genpd(struct dev_pm_domain *pd) { return container_of(pd, struct generic_pm_domain, domain); } Now, both pd_to_genpd() and dev_to_genpd() are available for use outside of the genpd code. What that means is that other code _can_ decide to do this (and indeed I do have code which does this): struct generic_pm_domain *genpd = dev_to_genpd(dev); while (1) { if (pm_genpd_remove_device(genpd, dev) != -EAGAIN) break; cond_resched(); } Now, as there is no way for anything outside of genpd to know whether the return value from this is a real genpd structure or not - and pm_genpd_remove_device() tries hard to validate the genpd passed in - but not sufficiently to catch bad cases where dev->pm_domain is not attached to a genpd. What I'm doing is ensuring that the weak-but-looking-like-belt-and-braces test in pm_genpd_remove_device() is actually sufficiently strong to catch bad uses which may otherwise result in subtle out-of-bounds corruption or list corruption. Now, the alternative solution would be to have pm_genpd_lookup_name() exported from the genpd code, so I can actually look up the genpd which is _supposed_ to be registered against the device and use that instead. However, that's potentially dangerous because there is no refcounting against genpds obtained in this way. So I decided better validation was the lesser of the evils.
On Friday, March 13, 2015 01:45:52 PM Geert Uytterhoeven wrote: > On Fri, Mar 13, 2015 at 10:20 AM, Russell King - ARM Linux > <linux@arm.linux.org.uk> wrote: > > On Fri, Mar 13, 2015 at 09:56:02AM +0100, Ulf Hansson wrote: > >> On 12 March 2015 at 19:31, Russell King <rmk+kernel@arm.linux.org.uk> wrote: > >> > pm_genpd_remove_device() should only be called with valid and present > >> > pm domain. There are circumstances where we may end up with something > >> > that isn't a generic PM domain in dev->pm_domain (eg, vga_switcheroo > >> > stuff.) > >> > >> Could the "vga_switcheroo" code deal with this instead? > > > > How is there any possibility what so ever that vga_switcherroo could > > deal with this? > > > > The problem is if something which isn't a generic PM domain is registered > > in dev->pm_domain, pm_genpd_remove_device() will treat it as a generic PM > > domain, and try and de-reference it as if it were a generic PM domain. > > > > The problem is the generic PM domain code _assuming_ that whatever it > > finds in dev->pm_domain is always a generic PM domain. That is a broken > > assumption. > > Indeed, currently it's not possible to know which derivative of dev_pm_domain > is used. For genpd, you have to look it up in the genpd list. > This also complicates adding debug code. > In addition, it can be statically or dynamically allocated. Hence sometimes %ps > can tell you the type, sometimes it can't. > > Should we add a type field to struct dev_pm_domain? Then we'd have to keep track of all dev_pm_domain derivatives somehow and I'd prefer to avoid that if possible.
diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c index b3fbc21da2dc..11a1023fa64a 100644 --- a/drivers/base/power/domain.c +++ b/drivers/base/power/domain.c @@ -1509,7 +1509,7 @@ int pm_genpd_remove_device(struct generic_pm_domain *genpd, dev_dbg(dev, "%s()\n", __func__); - if (IS_ERR_OR_NULL(genpd) || IS_ERR_OR_NULL(dev) + if (!pm_genpd_present(genpd) || IS_ERR_OR_NULL(dev) || IS_ERR_OR_NULL(dev->pm_domain) || pd_to_genpd(dev->pm_domain) != genpd) return -EINVAL;