[v2] ACPI/osl: speedup grace period in acpi_os_map_cleanup

Message ID 20141109105337.4952.36899.stgit@zurg (mailing list archive)
State Accepted, archived

Commit Message

Konstantin Khlebnikov Nov. 9, 2014, 9:53 a.m. UTC
ACPI maintains a cache of ioremap regions to speed up operations and to
allow access to them from irq context, where ioremap() calls aren't allowed.
This code abuses synchronize_rcu() on the unmap path for synchronization
with the fast path in acpi_os_read/write_memory, which uses this cache.

Since v3.10, CPUs are allowed to enter an idle state even if they have RCU
callbacks queued; see commit c0f4dfd4f90f1667d234d21f15153ea09a2eaa66
("rcu: Make RCU_FAST_NO_HZ take advantage of numbered callbacks").
That change caused problems with the nvidia proprietary driver, which calls
acpi_os_map/unmap_generic_address several times during initialization.
Each unmap calls synchronize_rcu() and adds a significant delay. In total,
initialization is slowed by a couple of seconds, which is enough to
trigger a timeout in the hardware: the GPU decides it "fell off the bus".
A widespread workaround is reducing "rcu_idle_gp_delay" from 4 jiffies to 1.

This patch replaces synchronize_rcu() with synchronize_rcu_expedited(),
which is much faster.

Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Reported-and-tested-by: Alexander Monakov <amonakov@gmail.com>
Cc: Tom Boshoven <tomboshoven@gmail.com>
Link: https://devtalk.nvidia.com/default/topic/567297/linux/linux-3-10-driver-crash/
---
 drivers/acpi/osl.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Paul E. McKenney Nov. 9, 2014, 10 p.m. UTC | #1
On Sun, Nov 09, 2014 at 01:53:37PM +0400, Konstantin Khlebnikov wrote:
> ACPI maintains cache of ioremap regions to speed up operations and
> access to them from irq context where ioremap() calls aren't allowed.
> This code abuses synchronize_rcu() on unmap path for synchronization
> with fast-path in acpi_os_read/write_memory which uses this cache.
> 
> Since v3.10 CPUs are allowed to enter idle state even if they have RCU
> callbacks queued, see commit c0f4dfd4f90f1667d234d21f15153ea09a2eaa66
> ("rcu: Make RCU_FAST_NO_HZ take advantage of numbered callbacks").
> That change caused problems with nvidia proprietary driver which calls
> acpi_os_map/unmap_generic_address several times during initialization.
> Each unmap calls synchronize_rcu and adds significant delay. Totally
> initialization is slowed for a couple of seconds and that is enough to
> trigger timeout in hardware, gpu decides to "fell off the bus". Widely
> spread workaround is reducing "rcu_idle_gp_delay" from 4 to 1 jiffy.
> 
> This patch replaces synchronize_rcu() with synchronize_rcu_expedited()
> which is much faster.
> 
> Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
> Reported-and-tested-by: Alexander Monakov <amonakov@gmail.com>
> Cc: Tom Boshoven <tomboshoven@gmail.com>
> Link: https://devtalk.nvidia.com/default/topic/567297/linux/linux-3-10-driver-crash/

Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

> ---
>  drivers/acpi/osl.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
> index 9964f70..217713c 100644
> --- a/drivers/acpi/osl.c
> +++ b/drivers/acpi/osl.c
> @@ -436,7 +436,7 @@ static void acpi_os_drop_map_ref(struct acpi_ioremap *map)
>  static void acpi_os_map_cleanup(struct acpi_ioremap *map)
>  {
>  	if (!map->refcount) {
> -		synchronize_rcu();
> +		synchronize_rcu_expedited();
>  		acpi_unmap(map->phys, map->virt);
>  		kfree(map);
>  	}
> 

joeyli Nov. 14, 2014, 3:52 p.m. UTC | #2
Hi, 

On Sun, Nov 09, 2014 at 01:53:37PM +0400, Konstantin Khlebnikov wrote:
> ACPI maintains cache of ioremap regions to speed up operations and
> access to them from irq context where ioremap() calls aren't allowed.
> This code abuses synchronize_rcu() on unmap path for synchronization
> with fast-path in acpi_os_read/write_memory which uses this cache.
> 
> Since v3.10 CPUs are allowed to enter idle state even if they have RCU
> callbacks queued, see commit c0f4dfd4f90f1667d234d21f15153ea09a2eaa66
> ("rcu: Make RCU_FAST_NO_HZ take advantage of numbered callbacks").
> That change caused problems with nvidia proprietary driver which calls
> acpi_os_map/unmap_generic_address several times during initialization.
> Each unmap calls synchronize_rcu and adds significant delay. Totally
> initialization is slowed for a couple of seconds and that is enough to
> trigger timeout in hardware, gpu decides to "fell off the bus". Widely
> spread workaround is reducing "rcu_idle_gp_delay" from 4 to 1 jiffy.
> 
> This patch replaces synchronize_rcu() with synchronize_rcu_expedited()
> which is much faster.
> 
> Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
> Reported-and-tested-by: Alexander Monakov <amonakov@gmail.com>
> Cc: Tom Boshoven <tomboshoven@gmail.com>
> Link: https://devtalk.nvidia.com/default/topic/567297/linux/linux-3-10-driver-crash/

Please feel free to add Tested-by:

Tested-by: Lee, Chun-Yi <jlee@suse.com>


This patch fixed the performance issue on VMware Workstation 10.0.2 with a
virtual machine that has more than 2 CPUs and 4G of memory:

VMware Workstation 10.0.2
  BIOS DMI: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/31/2013
  vCPU = 8
  vMEM = 4G
  mem.hotplug=TRUE

physical CPUs on host machine: Intel(R) Xeon(R) CPU X5670  @ 2.93GHz  * 24

I tested this patch with the v3.12, v3.17 and v3.18-rc4 mainline kernels. All
of those kernels can reproduce the issue, and all of them got a speedup during
ACPI initialization. I suggest this patch go to the stable kernels.


Thanks a lot!
Joey Lee

> ---
>  drivers/acpi/osl.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
> index 9964f70..217713c 100644
> --- a/drivers/acpi/osl.c
> +++ b/drivers/acpi/osl.c
> @@ -436,7 +436,7 @@ static void acpi_os_drop_map_ref(struct acpi_ioremap *map)
>  static void acpi_os_map_cleanup(struct acpi_ioremap *map)
>  {
>  	if (!map->refcount) {
> -		synchronize_rcu();
> +		synchronize_rcu_expedited();
>  		acpi_unmap(map->phys, map->virt);
>  		kfree(map);
>  	}
> 
Rafael J. Wysocki Nov. 14, 2014, 10:48 p.m. UTC | #3
On Sunday, November 09, 2014 02:00:38 PM Paul E. McKenney wrote:
> On Sun, Nov 09, 2014 at 01:53:37PM +0400, Konstantin Khlebnikov wrote:
> > ACPI maintains cache of ioremap regions to speed up operations and
> > access to them from irq context where ioremap() calls aren't allowed.
> > This code abuses synchronize_rcu() on unmap path for synchronization
> > with fast-path in acpi_os_read/write_memory which uses this cache.
> > 
> > Since v3.10 CPUs are allowed to enter idle state even if they have RCU
> > callbacks queued, see commit c0f4dfd4f90f1667d234d21f15153ea09a2eaa66
> > ("rcu: Make RCU_FAST_NO_HZ take advantage of numbered callbacks").
> > That change caused problems with nvidia proprietary driver which calls
> > acpi_os_map/unmap_generic_address several times during initialization.
> > Each unmap calls synchronize_rcu and adds significant delay. Totally
> > initialization is slowed for a couple of seconds and that is enough to
> > trigger timeout in hardware, gpu decides to "fell off the bus". Widely
> > spread workaround is reducing "rcu_idle_gp_delay" from 4 to 1 jiffy.
> > 
> > This patch replaces synchronize_rcu() with synchronize_rcu_expedited()
> > which is much faster.
> > 
> > Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
> > Reported-and-tested-by: Alexander Monakov <amonakov@gmail.com>
> > Cc: Tom Boshoven <tomboshoven@gmail.com>
> > Link: https://devtalk.nvidia.com/default/topic/567297/linux/linux-3-10-driver-crash/
> 
> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

Patch queued up for 3.19, thanks!

> 
> > ---
> >  drivers/acpi/osl.c |    2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
> > index 9964f70..217713c 100644
> > --- a/drivers/acpi/osl.c
> > +++ b/drivers/acpi/osl.c
> > @@ -436,7 +436,7 @@ static void acpi_os_drop_map_ref(struct acpi_ioremap *map)
> >  static void acpi_os_map_cleanup(struct acpi_ioremap *map)
> >  {
> >  	if (!map->refcount) {
> > -		synchronize_rcu();
> > +		synchronize_rcu_expedited();
> >  		acpi_unmap(map->phys, map->virt);
> >  		kfree(map);
> >  	}
> > 

Patch

diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
index 9964f70..217713c 100644
--- a/drivers/acpi/osl.c
+++ b/drivers/acpi/osl.c
@@ -436,7 +436,7 @@  static void acpi_os_drop_map_ref(struct acpi_ioremap *map)
 static void acpi_os_map_cleanup(struct acpi_ioremap *map)
 {
 	if (!map->refcount) {
-		synchronize_rcu();
+		synchronize_rcu_expedited();
 		acpi_unmap(map->phys, map->virt);
 		kfree(map);
 	}
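
For readers unfamiliar with the pattern the commit message describes, the
ordering on the unmap path can be sketched in plain userspace C. This is a
simplified, single-threaded model and not the code in drivers/acpi/osl.c:
struct mapping, map_get(), map_put(), cache_contains(), wait_for_readers()
and the counters are all hypothetical names chosen for illustration, and
wait_for_readers() only stands in for synchronize_rcu_expedited(). What it
tries to show is the order of steps: the entry is unlinked from the cache
first, then the updater waits for readers, and only then is the entry
released.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace model only; every name below is hypothetical. */
struct mapping {
	struct mapping *next;
	unsigned long phys;      /* key used for cache lookups */
	unsigned long refcount;  /* number of outstanding map_get() calls */
};

static struct mapping *cache;   /* head of the lookup cache */
static int active_readers;      /* models open read-side critical sections */
static int grace_periods;       /* how many times an updater had to wait */

/* Fast path: search the cache inside a modelled read-side section. */
static int cache_contains(unsigned long phys)
{
	struct mapping *m;
	int found = 0;

	active_readers++;               /* stands in for rcu_read_lock() */
	for (m = cache; m; m = m->next) {
		if (m->phys == phys) {
			found = 1;
			break;
		}
	}
	active_readers--;               /* stands in for rcu_read_unlock() */
	return found;
}

/* Map path: reuse a cached entry or create and publish a new one. */
static struct mapping *map_get(unsigned long phys)
{
	struct mapping *m;

	for (m = cache; m; m = m->next) {
		if (m->phys == phys) {
			m->refcount++;
			return m;
		}
	}
	m = calloc(1, sizeof(*m));
	if (!m)
		return NULL;
	m->phys = phys;
	m->refcount = 1;
	m->next = cache;
	cache = m;
	return m;
}

/* Stand-in for synchronize_rcu_expedited(): no reader may still be active. */
static void wait_for_readers(void)
{
	assert(active_readers == 0);
	grace_periods++;
}

/* Unmap path: unlink first, wait for readers, then release the entry. */
static void map_put(struct mapping *m)
{
	struct mapping **pp;

	if (--m->refcount)
		return;                 /* still in use, nothing to clean up */
	for (pp = &cache; *pp; pp = &(*pp)->next) {
		if (*pp == m) {
			*pp = m->next;  /* new lookups can no longer find it */
			break;
		}
	}
	wait_for_readers();             /* the step this patch speeds up */
	free(m);                        /* models acpi_unmap() plus kfree() */
}
```

In this model every final map_put() pays for one wait, which is why a driver
that maps and unmaps several regions during initialization is sensitive to
how long each grace period takes.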