Message ID | 1340990551-19426-1-git-send-email-jon-hunter@ti.com (mailing list archive) |
---|---|
State | New, archived |
On 06/29/2012 10:22 AM, Jon Hunter wrote:
> Currently the gpio _runtime_resume/suspend functions are calling the
> get_context_loss_count() platform function if the function is populated for
> a gpio bank. This function is used to determine if the gpio bank logic state
> needs to be restored due to a power transition. This function will be populated
> for all banks, but it should only be called for banks that have the
> "loses_context" variable set. It is pointless to call this if loses_context is
> false as we know the context will never be lost and will not need restoring.
>
> For all OMAP2+ devices gpio bank-0 is in an always-on power domain and so will
> never lose context. We found that the get_context_loss_count() was being called
> for bank-0 during the probe and returning 1 instead of 0 indicating that the
> context had been lost. This was causing the context restore function to be
> called at probe time for this bank and because the context had never been saved,
> was restoring an invalid state. This ultimately resulted in a crash [1].
>
> There are multiple bugs here that need to be addressed ...
>
> 1. Why the always-on power domain returns a context loss count of 1? This needs
>    to be fixed in the power domain code. However, the gpio driver should not
>    assume the loss count is 0 to begin with.
> 2. The omap gpio driver should never be calling get_context_loss_count for a
>    gpio bank in a always-on domain. This is pointless and adds unneccessary
>    overhead.
> 3. The OMAP gpio driver assumes that the initial power domain context loss count
>    will be 0 at the time the gpio driver is probed. However, it could be
>    possible that this is not the case and an invalid context restore could be
>    performed during the probe. To avoid this otherwise only populated the
>    get_context_loss_count() function pointer after the initial call to
>    pm_runtime_get() has occurred. This will ensure that the first
>    pm_runtime_put() initialised the loss count correctly.
>
> This patch addresses issues 2 and 3 above.
>
> [1] http://marc.info/?l=linux-omap&m=134065775323775&w=2
>
> Cc: Grant Likely <grant.likely@secretlab.ca>
> Cc: Linus Walleij <linus.walleij@stericsson.com>
> Cc: Kevin Hilman <khilman@ti.com>
> Cc: Tarun Kanti DebBarma <tarun.kanti@ti.com>
> Cc: Franky Lin <frankyl@broadcom.com>
>
> Reported-by: Franky Lin <frankyl@broadcom.com>
> Signed-off-by: Jon Hunter <jon-hunter@ti.com>
> ---

Tested-by: Franky Lin <frankyl@broadcom.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

On Fri, Jun 29, 2012 at 10:52 PM, Jon Hunter <jon-hunter@ti.com> wrote:
> Currently the gpio _runtime_resume/suspend functions are calling the
> get_context_loss_count() platform function if the function is populated for
> a gpio bank. This function is used to determine if the gpio bank logic state
> needs to be restored due to a power transition. This function will be populated
> for all banks, but it should only be called for banks that have the
> "loses_context" variable set. It is pointless to call this if loses_context is
> false as we know the context will never be lost and will not need restoring.
>
> For all OMAP2+ devices gpio bank-0 is in an always-on power domain and so will
> never lose context. We found that the get_context_loss_count() was being called
> for bank-0 during the probe and returning 1 instead of 0 indicating that the
> context had been lost. This was causing the context restore function to be
> called at probe time for this bank and because the context had never been saved,
> was restoring an invalid state. This ultimately resulted in a crash [1].
>
> There are multiple bugs here that need to be addressed ...
>
> 1. Why the always-on power domain returns a context loss count of 1? This needs
>    to be fixed in the power domain code. However, the gpio driver should not
>    assume the loss count is 0 to begin with.

Indeed. The GPIO driver should not assume the value.

> 2. The omap gpio driver should never be calling get_context_loss_count for a
>    gpio bank in a always-on domain. This is pointless and adds unneccessary
>    overhead.

Makes sense too.

> 3. The OMAP gpio driver assumes that the initial power domain context loss count
>    will be 0 at the time the gpio driver is probed. However, it could be
>    possible that this is not the case and an invalid context restore could be
>    performed during the probe. To avoid this otherwise only populated the
>    get_context_loss_count() function pointer after the initial call to
>    pm_runtime_get() has occurred. This will ensure that the first
>    pm_runtime_put() initialised the loss count correctly.
>
> This patch addresses issues 2 and 3 above.
>
> [1] http://marc.info/?l=linux-omap&m=134065775323775&w=2
>
> Cc: Grant Likely <grant.likely@secretlab.ca>
> Cc: Linus Walleij <linus.walleij@stericsson.com>
> Cc: Kevin Hilman <khilman@ti.com>
> Cc: Tarun Kanti DebBarma <tarun.kanti@ti.com>
> Cc: Franky Lin <frankyl@broadcom.com>
>
> Reported-by: Franky Lin <frankyl@broadcom.com>
> Signed-off-by: Jon Hunter <jon-hunter@ti.com>
> ---

Thanks Jon for sorting this out. Patch looks good to me.

Reviewed-by: Santosh Shilimkar <santosh.shilimkar@ti.com>

* Shilimkar, Santosh <santosh.shilimkar@ti.com> [120629 21:23]:
> On Fri, Jun 29, 2012 at 10:52 PM, Jon Hunter <jon-hunter@ti.com> wrote:
> > Currently the gpio _runtime_resume/suspend functions are calling the
> > get_context_loss_count() platform function if the function is populated for
> > a gpio bank. This function is used to determine if the gpio bank logic state
> > needs to be restored due to a power transition. This function will be populated
> > for all banks, but it should only be called for banks that have the
> > "loses_context" variable set. It is pointless to call this if loses_context is
> > false as we know the context will never be lost and will not need restoring.
> >
> > For all OMAP2+ devices gpio bank-0 is in an always-on power domain and so will
> > never lose context. We found that the get_context_loss_count() was being called
> > for bank-0 during the probe and returning 1 instead of 0 indicating that the
> > context had been lost. This was causing the context restore function to be
> > called at probe time for this bank and because the context had never been saved,
> > was restoring an invalid state. This ultimately resulted in a crash [1].
> >
> > There are multiple bugs here that need to be addressed ...
> >
> > 1. Why the always-on power domain returns a context loss count of 1? This needs
> >    to be fixed in the power domain code. However, the gpio driver should not
> >    assume the loss count is 0 to begin with.
> Indeed. GPIO driver should not assume the value.
>
> > 2. The omap gpio driver should never be calling get_context_loss_count for a
> >    gpio bank in a always-on domain. This is pointless and adds unneccessary
> >    overhead.
> Make sense too.
>
> > 3. The OMAP gpio driver assumes that the initial power domain context loss count
> >    will be 0 at the time the gpio driver is probed. However, it could be
> >    possible that this is not the case and an invalid context restore could be
> >    performed during the probe. To avoid this otherwise only populated the
> >    get_context_loss_count() function pointer after the initial call to
> >    pm_runtime_get() has occurred. This will ensure that the first
> >    pm_runtime_put() initialised the loss count correctly.
> >
> > This patch addresses issues 2 and 3 above.

Should this one be Cc: stable? If this is a regression, then the regression
causing commit should be mentioned.

Tony

+ Neil Brown

Hi Jon,

Jon Hunter <jon-hunter@ti.com> writes:

> Currently the gpio _runtime_resume/suspend functions are calling the
> get_context_loss_count() platform function if the function is populated for
> a gpio bank. This function is used to determine if the gpio bank logic state
> needs to be restored due to a power transition. This function will be populated
> for all banks, but it should only be called for banks that have the
> "loses_context" variable set. It is pointless to call this if loses_context is
> false as we know the context will never be lost and will not need restoring.
>
> For all OMAP2+ devices gpio bank-0 is in an always-on power domain and so will
> never lose context. We found that the get_context_loss_count() was being called
> for bank-0 during the probe and returning 1 instead of 0 indicating that the
> context had been lost. This was causing the context restore function to be
> called at probe time for this bank and because the context had never been saved,
> was restoring an invalid state. This ultimately resulted in a crash [1].
>
> There are multiple bugs here that need to be addressed ...
>
> 1. Why the always-on power domain returns a context loss count of 1? This needs
>    to be fixed in the power domain code. However, the gpio driver should not
>    assume the loss count is 0 to begin with.
> 2. The omap gpio driver should never be calling get_context_loss_count for a
>    gpio bank in a always-on domain. This is pointless and adds unneccessary
>    overhead.
> 3. The OMAP gpio driver assumes that the initial power domain context loss count
>    will be 0 at the time the gpio driver is probed. However, it could be
>    possible that this is not the case and an invalid context restore could be
>    performed during the probe. To avoid this otherwise only populated the

The 'To avoid this...' sentence here doesn't read well. Looks like you
need to:

s/otherwise//
s/populated/populate/

?

>    get_context_loss_count() function pointer after the initial call to
>    pm_runtime_get() has occurred. This will ensure that the first
>    pm_runtime_put() initialised the loss count correctly.
>
> This patch addresses issues 2 and 3 above.
> [1] http://marc.info/?l=linux-omap&m=134065775323775&w=2
>
> Cc: Grant Likely <grant.likely@secretlab.ca>
> Cc: Linus Walleij <linus.walleij@stericsson.com>
> Cc: Kevin Hilman <khilman@ti.com>
> Cc: Tarun Kanti DebBarma <tarun.kanti@ti.com>
> Cc: Franky Lin <frankyl@broadcom.com>
>
> Reported-by: Franky Lin <frankyl@broadcom.com>
> Signed-off-by: Jon Hunter <jon-hunter@ti.com>

Thanks for digging into this bug Jon. The same bug was brought up by
Neil Brown (Cc'd) in a different thread.

Neil, it looks to me that this fix will address the problems you were
seeing as well. Care to test, and respond with your ack/tested-by if it
works for you? Thanks.

Kevin

> ---
>  drivers/gpio/gpio-omap.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpio/gpio-omap.c b/drivers/gpio/gpio-omap.c
> index c4ed172..f13fc9c 100644
> --- a/drivers/gpio/gpio-omap.c
> +++ b/drivers/gpio/gpio-omap.c
> @@ -1081,7 +1081,6 @@ static int __devinit omap_gpio_probe(struct platform_device *pdev)
>  	bank->is_mpuio = pdata->is_mpuio;
>  	bank->non_wakeup_gpios = pdata->non_wakeup_gpios;
>  	bank->loses_context = pdata->loses_context;
> -	bank->get_context_loss_count = pdata->get_context_loss_count;
>  	bank->regs = pdata->regs;
>  #ifdef CONFIG_OF_GPIO
>  	bank->chip.of_node = of_node_get(node);
> @@ -1135,6 +1134,9 @@ static int __devinit omap_gpio_probe(struct platform_device *pdev)
>  	omap_gpio_chip_init(bank);
>  	omap_gpio_show_rev(bank);
>
> +	if (bank->loses_context)
> +		bank->get_context_loss_count = pdata->get_context_loss_count;
> +
>  	pm_runtime_put(bank->dev);
>
>  	list_add_tail(&bank->node, &omap_gpio_list);

On 07/01/2012 03:45 AM, Tony Lindgren wrote:
> * Shilimkar, Santosh <santosh.shilimkar@ti.com> [120629 21:23]:
>> On Fri, Jun 29, 2012 at 10:52 PM, Jon Hunter <jon-hunter@ti.com> wrote:
>>> Currently the gpio _runtime_resume/suspend functions are calling the
>>> get_context_loss_count() platform function if the function is populated for
>>> a gpio bank. This function is used to determine if the gpio bank logic state
>>> needs to be restored due to a power transition. This function will be populated
>>> for all banks, but it should only be called for banks that have the
>>> "loses_context" variable set. It is pointless to call this if loses_context is
>>> false as we know the context will never be lost and will not need restoring.
>>>
>>> For all OMAP2+ devices gpio bank-0 is in an always-on power domain and so will
>>> never lose context. We found that the get_context_loss_count() was being called
>>> for bank-0 during the probe and returning 1 instead of 0 indicating that the
>>> context had been lost. This was causing the context restore function to be
>>> called at probe time for this bank and because the context had never been saved,
>>> was restoring an invalid state. This ultimately resulted in a crash [1].
>>>
>>> There are multiple bugs here that need to be addressed ...
>>>
>>> 1. Why the always-on power domain returns a context loss count of 1? This needs
>>>    to be fixed in the power domain code. However, the gpio driver should not
>>>    assume the loss count is 0 to begin with.
>> Indeed. GPIO driver should not assume the value.
>>
>>> 2. The omap gpio driver should never be calling get_context_loss_count for a
>>>    gpio bank in a always-on domain. This is pointless and adds unneccessary
>>>    overhead.
>> Make sense too.
>>
>>> 3. The OMAP gpio driver assumes that the initial power domain context loss count
>>>    will be 0 at the time the gpio driver is probed. However, it could be
>>>    possible that this is not the case and an invalid context restore could be
>>>    performed during the probe. To avoid this otherwise only populated the
>>>    get_context_loss_count() function pointer after the initial call to
>>>    pm_runtime_get() has occurred. This will ensure that the first
>>>    pm_runtime_put() initialised the loss count correctly.
>>>
>>> This patch addresses issues 2 and 3 above.
>
> Should this one be Cc: stable? If this is a regression, then the regression
> causing commit should be mentioned.

So that raises a good point. Looking at the stable branch (3.4.4) it is
missing 3 other fixes too [1][2][3]. So this particular problem would not
have been exposed, however, I am wondering if there are other problems
lingering there.

This is a regression exposed by [2]. I should add that to the changelog.

Cheers
Jon

[1] http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=b3c64bc30af67ed328a8d919e41160942b870451
[2] http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=1b1287032df3a69d3ef9a486b444f4ffcca50d01
[3] http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=22770de11cb13e7120f973bca6c800de371a6717

On 07/02/2012 01:07 PM, Kevin Hilman wrote:
> + Neil Brown
>
> Hi Jon,
>
> Jon Hunter <jon-hunter@ti.com> writes:
>
>> Currently the gpio _runtime_resume/suspend functions are calling the
>> get_context_loss_count() platform function if the function is populated for
>> a gpio bank. This function is used to determine if the gpio bank logic state
>> needs to be restored due to a power transition. This function will be populated
>> for all banks, but it should only be called for banks that have the
>> "loses_context" variable set. It is pointless to call this if loses_context is
>> false as we know the context will never be lost and will not need restoring.
>>
>> For all OMAP2+ devices gpio bank-0 is in an always-on power domain and so will
>> never lose context. We found that the get_context_loss_count() was being called
>> for bank-0 during the probe and returning 1 instead of 0 indicating that the
>> context had been lost. This was causing the context restore function to be
>> called at probe time for this bank and because the context had never been saved,
>> was restoring an invalid state. This ultimately resulted in a crash [1].
>>
>> There are multiple bugs here that need to be addressed ...
>>
>> 1. Why the always-on power domain returns a context loss count of 1? This needs
>>    to be fixed in the power domain code. However, the gpio driver should not
>>    assume the loss count is 0 to begin with.
>> 2. The omap gpio driver should never be calling get_context_loss_count for a
>>    gpio bank in a always-on domain. This is pointless and adds unneccessary
>>    overhead.
>> 3. The OMAP gpio driver assumes that the initial power domain context loss count
>>    will be 0 at the time the gpio driver is probed. However, it could be
>>    possible that this is not the case and an invalid context restore could be
>>    performed during the probe. To avoid this otherwise only populated the
>
> The 'To avoid this...' sentence here doesn't read well. Looks like you
> need to:
>
> s/otherwise//

Yes, I meant to have dropped "otherwise" here. Thanks!

> s/populated/populate/

Yes that too! I must have re-worded and screwed it up royally :-(

> ?
>
>>    get_context_loss_count() function pointer after the initial call to
>>    pm_runtime_get() has occurred. This will ensure that the first
>>    pm_runtime_put() initialised the loss count correctly.
>>
>> This patch addresses issues 2 and 3 above.
>> [1] http://marc.info/?l=linux-omap&m=134065775323775&w=2
>>
>> Cc: Grant Likely <grant.likely@secretlab.ca>
>> Cc: Linus Walleij <linus.walleij@stericsson.com>
>> Cc: Kevin Hilman <khilman@ti.com>
>> Cc: Tarun Kanti DebBarma <tarun.kanti@ti.com>
>> Cc: Franky Lin <frankyl@broadcom.com>
>>
>> Reported-by: Franky Lin <frankyl@broadcom.com>
>> Signed-off-by: Jon Hunter <jon-hunter@ti.com>
>
> Thanks for digging inot this bug Jon. The same bug was brought up by
> Neil Brown (Cc'd) in a different thread.
>
> Neil, it looks to me that this fix will address the problems you were
> seeing as well. Care to test, and respond with your ack/tested-by if it
> works for you? Thanks.

Neil let me know your thoughts and if you are ok, I can clean-up the
changelog and re-send.

Cheers
Jon

On Mon, 2 Jul 2012 13:26:38 -0500 Jon Hunter <jon-hunter@ti.com> wrote:
>
> On 07/02/2012 01:07 PM, Kevin Hilman wrote:
> > + Neil Brown
> >
> > Hi Jon,
> >
> > Jon Hunter <jon-hunter@ti.com> writes:
> >
> >> Currently the gpio _runtime_resume/suspend functions are calling the
> >> get_context_loss_count() platform function if the function is populated for
> >> a gpio bank. This function is used to determine if the gpio bank logic state
> >> needs to be restored due to a power transition. This function will be populated
> >> for all banks, but it should only be called for banks that have the
> >> "loses_context" variable set. It is pointless to call this if loses_context is
> >> false as we know the context will never be lost and will not need restoring.
> >>
> >> For all OMAP2+ devices gpio bank-0 is in an always-on power domain and so will
> >> never lose context. We found that the get_context_loss_count() was being called
> >> for bank-0 during the probe and returning 1 instead of 0 indicating that the
> >> context had been lost. This was causing the context restore function to be
> >> called at probe time for this bank and because the context had never been saved,
> >> was restoring an invalid state. This ultimately resulted in a crash [1].
> >>
> >> There are multiple bugs here that need to be addressed ...
> >>
> >> 1. Why the always-on power domain returns a context loss count of 1? This needs
> >>    to be fixed in the power domain code. However, the gpio driver should not
> >>    assume the loss count is 0 to begin with.
> >> 2. The omap gpio driver should never be calling get_context_loss_count for a
> >>    gpio bank in a always-on domain. This is pointless and adds unneccessary
> >>    overhead.
> >> 3. The OMAP gpio driver assumes that the initial power domain context loss count
> >>    will be 0 at the time the gpio driver is probed. However, it could be
> >>    possible that this is not the case and an invalid context restore could be
> >>    performed during the probe. To avoid this otherwise only populated the
> >
> > The 'To avoid this...' sentence here doesn't read well. Looks like you
> > need to:
> >
> > s/otherwise//
>
> Yes, I meant to have dropped "otherwise" here. Thanks!
>
> > s/populated/populate/
>
> Yes that too! I must have re-worded and screwed it up royally :-(
>
> > ?
> >
> >>    get_context_loss_count() function pointer after the initial call to
> >>    pm_runtime_get() has occurred. This will ensure that the first
> >>    pm_runtime_put() initialised the loss count correctly.
> >>
> >> This patch addresses issues 2 and 3 above.
> >> [1] http://marc.info/?l=linux-omap&m=134065775323775&w=2
> >>
> >> Cc: Grant Likely <grant.likely@secretlab.ca>
> >> Cc: Linus Walleij <linus.walleij@stericsson.com>
> >> Cc: Kevin Hilman <khilman@ti.com>
> >> Cc: Tarun Kanti DebBarma <tarun.kanti@ti.com>
> >> Cc: Franky Lin <frankyl@broadcom.com>
> >>
> >> Reported-by: Franky Lin <frankyl@broadcom.com>
> >> Signed-off-by: Jon Hunter <jon-hunter@ti.com>
> >
> > Thanks for digging inot this bug Jon. The same bug was brought up by
> > Neil Brown (Cc'd) in a different thread.
> >
> > Neil, it looks to me that this fix will address the problems you were
> > seeing as well. Care to test, and respond with your ack/tested-by if it
> > works for you? Thanks.
>
> Neil let me know your thoughts and if you are ok, I can clean-up the
> changelog and re-send.

Yes, works for me and looks sensible.

Tested-by: NeilBrown <neilb@suse.de>

Thanks,
NeilBrown

NeilBrown <neilb@suse.de> writes:

> On Mon, 2 Jul 2012 13:26:38 -0500 Jon Hunter <jon-hunter@ti.com> wrote:
>
>>
>> On 07/02/2012 01:07 PM, Kevin Hilman wrote:
>> > + Neil Brown
>> >
>> > Hi Jon,
>> >
>> > Jon Hunter <jon-hunter@ti.com> writes:
>> >
>> >> Currently the gpio _runtime_resume/suspend functions are calling the
>> >> get_context_loss_count() platform function if the function is populated for
>> >> a gpio bank. This function is used to determine if the gpio bank logic state
>> >> needs to be restored due to a power transition. This function will be populated
>> >> for all banks, but it should only be called for banks that have the
>> >> "loses_context" variable set. It is pointless to call this if loses_context is
>> >> false as we know the context will never be lost and will not need restoring.
>> >>
>> >> For all OMAP2+ devices gpio bank-0 is in an always-on power domain and so will
>> >> never lose context. We found that the get_context_loss_count() was being called
>> >> for bank-0 during the probe and returning 1 instead of 0 indicating that the
>> >> context had been lost. This was causing the context restore function to be
>> >> called at probe time for this bank and because the context had never been saved,
>> >> was restoring an invalid state. This ultimately resulted in a crash [1].
>> >>
>> >> There are multiple bugs here that need to be addressed ...
>> >>
>> >> 1. Why the always-on power domain returns a context loss count of 1? This needs
>> >>    to be fixed in the power domain code. However, the gpio driver should not
>> >>    assume the loss count is 0 to begin with.
>> >> 2. The omap gpio driver should never be calling get_context_loss_count for a
>> >>    gpio bank in a always-on domain. This is pointless and adds unneccessary
>> >>    overhead.
>> >> 3. The OMAP gpio driver assumes that the initial power domain context loss count
>> >>    will be 0 at the time the gpio driver is probed. However, it could be
>> >>    possible that this is not the case and an invalid context restore could be
>> >>    performed during the probe. To avoid this otherwise only populated the
>> >
>> > The 'To avoid this...' sentence here doesn't read well. Looks like you
>> > need to:
>> >
>> > s/otherwise//
>>
>> Yes, I meant to have dropped "otherwise" here. Thanks!
>>
>> > s/populated/populate/
>>
>> Yes that too! I must have re-worded and screwed it up royally :-(
>>
>> > ?
>> >
>> >>    get_context_loss_count() function pointer after the initial call to
>> >>    pm_runtime_get() has occurred. This will ensure that the first
>> >>    pm_runtime_put() initialised the loss count correctly.
>> >>
>> >> This patch addresses issues 2 and 3 above.
>> >> [1] http://marc.info/?l=linux-omap&m=134065775323775&w=2
>> >>
>> >> Cc: Grant Likely <grant.likely@secretlab.ca>
>> >> Cc: Linus Walleij <linus.walleij@stericsson.com>
>> >> Cc: Kevin Hilman <khilman@ti.com>
>> >> Cc: Tarun Kanti DebBarma <tarun.kanti@ti.com>
>> >> Cc: Franky Lin <frankyl@broadcom.com>
>> >>
>> >> Reported-by: Franky Lin <frankyl@broadcom.com>
>> >> Signed-off-by: Jon Hunter <jon-hunter@ti.com>
>> >
>> > Thanks for digging inot this bug Jon. The same bug was brought up by
>> > Neil Brown (Cc'd) in a different thread.
>> >
>> > Neil, it looks to me that this fix will address the problems you were
>> > seeing as well. Care to test, and respond with your ack/tested-by if it
>> > works for you? Thanks.
>>
>> Neil let me know your thoughts and if you are ok, I can clean-up the
>> changelog and re-send.
>
> Yes, works for me and looks sensible.
>
> Tested-by: NeilBrown <neilb@suse.de>
>

Great! Thanks for testing.

Jon, please make the minor changelog edits, collect the reviewed-by and
tested-by tags and repost. I'll then queue this up for Grant.

Based on your earlier comments, this only affects v3.5, so no
need to push it into stable, correct?

Kevin

On 07/02/2012 07:05 PM, Kevin Hilman wrote:
> NeilBrown <neilb@suse.de> writes:
>
>> On Mon, 2 Jul 2012 13:26:38 -0500 Jon Hunter <jon-hunter@ti.com> wrote:
>>
>>>
>>> On 07/02/2012 01:07 PM, Kevin Hilman wrote:
>>>> + Neil Brown
>>>>
>>>> Hi Jon,
>>>>
>>>> Jon Hunter <jon-hunter@ti.com> writes:
>>>>
>>>>> Currently the gpio _runtime_resume/suspend functions are calling the
>>>>> get_context_loss_count() platform function if the function is populated for
>>>>> a gpio bank. This function is used to determine if the gpio bank logic state
>>>>> needs to be restored due to a power transition. This function will be populated
>>>>> for all banks, but it should only be called for banks that have the
>>>>> "loses_context" variable set. It is pointless to call this if loses_context is
>>>>> false as we know the context will never be lost and will not need restoring.
>>>>>
>>>>> For all OMAP2+ devices gpio bank-0 is in an always-on power domain and so will
>>>>> never lose context. We found that the get_context_loss_count() was being called
>>>>> for bank-0 during the probe and returning 1 instead of 0 indicating that the
>>>>> context had been lost. This was causing the context restore function to be
>>>>> called at probe time for this bank and because the context had never been saved,
>>>>> was restoring an invalid state. This ultimately resulted in a crash [1].
>>>>>
>>>>> There are multiple bugs here that need to be addressed ...
>>>>>
>>>>> 1. Why the always-on power domain returns a context loss count of 1? This needs
>>>>>    to be fixed in the power domain code. However, the gpio driver should not
>>>>>    assume the loss count is 0 to begin with.
>>>>> 2. The omap gpio driver should never be calling get_context_loss_count for a
>>>>>    gpio bank in a always-on domain. This is pointless and adds unneccessary
>>>>>    overhead.
>>>>> 3. The OMAP gpio driver assumes that the initial power domain context loss count
>>>>>    will be 0 at the time the gpio driver is probed. However, it could be
>>>>>    possible that this is not the case and an invalid context restore could be
>>>>>    performed during the probe. To avoid this otherwise only populated the
>>>>
>>>> The 'To avoid this...' sentence here doesn't read well. Looks like you
>>>> need to:
>>>>
>>>> s/otherwise//
>>>
>>> Yes, I meant to have dropped "otherwise" here. Thanks!
>>>
>>>> s/populated/populate/
>>>
>>> Yes that too! I must have re-worded and screwed it up royally :-(
>>>
>>>> ?
>>>>
>>>>>    get_context_loss_count() function pointer after the initial call to
>>>>>    pm_runtime_get() has occurred. This will ensure that the first
>>>>>    pm_runtime_put() initialised the loss count correctly.
>>>>>
>>>>> This patch addresses issues 2 and 3 above.
>>>>> [1] http://marc.info/?l=linux-omap&m=134065775323775&w=2
>>>>>
>>>>> Cc: Grant Likely <grant.likely@secretlab.ca>
>>>>> Cc: Linus Walleij <linus.walleij@stericsson.com>
>>>>> Cc: Kevin Hilman <khilman@ti.com>
>>>>> Cc: Tarun Kanti DebBarma <tarun.kanti@ti.com>
>>>>> Cc: Franky Lin <frankyl@broadcom.com>
>>>>>
>>>>> Reported-by: Franky Lin <frankyl@broadcom.com>
>>>>> Signed-off-by: Jon Hunter <jon-hunter@ti.com>
>>>>
>>>> Thanks for digging inot this bug Jon. The same bug was brought up by
>>>> Neil Brown (Cc'd) in a different thread.
>>>>
>>>> Neil, it looks to me that this fix will address the problems you were
>>>> seeing as well. Care to test, and respond with your ack/tested-by if it
>>>> works for you? Thanks.
>>>
>>> Neil let me know your thoughts and if you are ok, I can clean-up the
>>> changelog and re-send.
>>
>> Yes, works for me and looks sensible.
>>
>> Tested-by: NeilBrown <neilb@suse.de>
>>
>
> Great! Thanks for testing.
>
> Jon, please make the minor changelog edits, collect the reviewed-by and
> tested-by tags and repost. I'll then queue this up for Grant.

Ok, will do that tomorrow.

> Based on your earlier comments, this only affects v3.5, so no
> need to push it into stable, correct?

As far as I can tell. However, not sure if any of the other fixes should
be back ported.

Cheers
Jon

diff --git a/drivers/gpio/gpio-omap.c b/drivers/gpio/gpio-omap.c
index c4ed172..f13fc9c 100644
--- a/drivers/gpio/gpio-omap.c
+++ b/drivers/gpio/gpio-omap.c
@@ -1081,7 +1081,6 @@ static int __devinit omap_gpio_probe(struct platform_device *pdev)
 	bank->is_mpuio = pdata->is_mpuio;
 	bank->non_wakeup_gpios = pdata->non_wakeup_gpios;
 	bank->loses_context = pdata->loses_context;
-	bank->get_context_loss_count = pdata->get_context_loss_count;
 	bank->regs = pdata->regs;
 #ifdef CONFIG_OF_GPIO
 	bank->chip.of_node = of_node_get(node);
@@ -1135,6 +1134,9 @@ static int __devinit omap_gpio_probe(struct platform_device *pdev)
 	omap_gpio_chip_init(bank);
 	omap_gpio_show_rev(bank);
 
+	if (bank->loses_context)
+		bank->get_context_loss_count = pdata->get_context_loss_count;
+
 	pm_runtime_put(bank->dev);
 
 	list_add_tail(&bank->node, &omap_gpio_list);

Currently the gpio _runtime_resume/suspend functions are calling the
get_context_loss_count() platform function if the function is populated for
a gpio bank. This function is used to determine if the gpio bank logic state
needs to be restored due to a power transition. This function will be populated
for all banks, but it should only be called for banks that have the
"loses_context" variable set. It is pointless to call this if loses_context is
false as we know the context will never be lost and will not need restoring.

For all OMAP2+ devices gpio bank-0 is in an always-on power domain and so will
never lose context. We found that get_context_loss_count() was being called
for bank-0 during the probe and returning 1 instead of 0, indicating that the
context had been lost. This was causing the context restore function to be
called at probe time for this bank and, because the context had never been
saved, was restoring an invalid state. This ultimately resulted in a crash [1].

There are multiple bugs here that need to be addressed ...

1. Why does the always-on power domain return a context loss count of 1? This
   needs to be fixed in the power domain code. However, the gpio driver should
   not assume the loss count is 0 to begin with.
2. The omap gpio driver should never be calling get_context_loss_count for a
   gpio bank in an always-on domain. This is pointless and adds unnecessary
   overhead.
3. The OMAP gpio driver assumes that the initial power domain context loss count
   will be 0 at the time the gpio driver is probed. However, it could be
   possible that this is not the case and an invalid context restore could be
   performed during the probe. To avoid this, only populate the
   get_context_loss_count() function pointer after the initial call to
   pm_runtime_get() has occurred. This will ensure that the first
   pm_runtime_put() initialises the loss count correctly.

This patch addresses issues 2 and 3 above.

[1] http://marc.info/?l=linux-omap&m=134065775323775&w=2

Cc: Grant Likely <grant.likely@secretlab.ca>
Cc: Linus Walleij <linus.walleij@stericsson.com>
Cc: Kevin Hilman <khilman@ti.com>
Cc: Tarun Kanti DebBarma <tarun.kanti@ti.com>
Cc: Franky Lin <frankyl@broadcom.com>

Reported-by: Franky Lin <frankyl@broadcom.com>
Signed-off-by: Jon Hunter <jon-hunter@ti.com>
---
 drivers/gpio/gpio-omap.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
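
For readers who want to see why the ordering matters, here is a small
user-space sketch of the behaviour described in the changelog. It is not the
driver code: struct bank_model, platform_loss_count() and the model_*()
helpers are hypothetical stand-ins, and the stub hook simply returns 1 to
mimic the value reported for bank-0. The resume/suspend helpers only
approximate what the changelog says the runtime PM callbacks do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of one GPIO bank; not the driver's struct. */
struct bank_model {
	bool loses_context;                   /* false for an always-on bank */
	int saved_loss_count;                 /* count seen at the last suspend */
	int (*get_context_loss_count)(void);  /* NULL: never query the platform */
	int restores;                         /* context restores performed */
};

/* Stand-in platform hook: reports 1 before any context was ever saved,
 * mirroring the value described in the report (an assumption here). */
static int platform_loss_count(void)
{
	return 1;
}

/* Resume: restore only if the platform count moved since the last suspend. */
static void model_runtime_resume(struct bank_model *b)
{
	if (b->get_context_loss_count &&
	    b->get_context_loss_count() != b->saved_loss_count)
		b->restores++;
}

/* Suspend: remember the current platform count. */
static void model_runtime_suspend(struct bank_model *b)
{
	if (b->get_context_loss_count)
		b->saved_loss_count = b->get_context_loss_count();
}

/*
 * Probe with the hook assigned either before the first resume (old order)
 * or after it and only for banks that lose context (patched order).
 * Returns the number of restores triggered during probe.
 */
static int model_probe(struct bank_model *b, bool loses_context, bool patched)
{
	assert(b != NULL);

	b->loses_context = loses_context;
	b->saved_loss_count = 0;
	b->get_context_loss_count = NULL;
	b->restores = 0;

	if (!patched)
		b->get_context_loss_count = platform_loss_count;

	model_runtime_resume(b);   /* stands in for pm_runtime_get() */

	if (patched && b->loses_context)
		b->get_context_loss_count = platform_loss_count;

	model_runtime_suspend(b);  /* stands in for pm_runtime_put() */
	return b->restores;
}
```

In this model the old ordering performs a restore during probe for an
always-on bank, while the patched ordering performs none, and a bank that does
lose context has its count recorded by the first suspend, so the next resume
sees no change.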