Message ID | 20141109105337.4952.36899.stgit@zurg (mailing list archive) |
---|---|
State | Accepted, archived |
On Sun, Nov 09, 2014 at 01:53:37PM +0400, Konstantin Khlebnikov wrote:
> ACPI maintains cache of ioremap regions to speed up operations and
> access to them from irq context where ioremap() calls aren't allowed.
> This code abuses synchronize_rcu() on unmap path for synchronization
> with fast-path in acpi_os_read/write_memory which uses this cache.
>
> Since v3.10 CPUs are allowed to enter idle state even if they have RCU
> callbacks queued, see commit c0f4dfd4f90f1667d234d21f15153ea09a2eaa66
> ("rcu: Make RCU_FAST_NO_HZ take advantage of numbered callbacks").
> That change caused problems with nvidia proprietary driver which calls
> acpi_os_map/unmap_generic_address several times during initialization.
> Each unmap calls synchronize_rcu and adds significant delay. Totally
> initialization is slowed for a couple of seconds and that is enough to
> trigger timeout in hardware, gpu decides to "fell off the bus". Widely
> spread workaround is reducing "rcu_idle_gp_delay" from 4 to 1 jiffy.
>
> This patch replaces synchronize_rcu() with synchronize_rcu_expedited()
> which is much faster.
>
> Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
> Reported-and-tested-by: Alexander Monakov <amonakov@gmail.com>
> Cc: Tom Boshoven <tomboshoven@gmail.com>
> Link: https://devtalk.nvidia.com/default/topic/567297/linux/linux-3-10-driver-crash/

Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

> ---
>  drivers/acpi/osl.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
> index 9964f70..217713c 100644
> --- a/drivers/acpi/osl.c
> +++ b/drivers/acpi/osl.c
> @@ -436,7 +436,7 @@ static void acpi_os_drop_map_ref(struct acpi_ioremap *map)
>  static void acpi_os_map_cleanup(struct acpi_ioremap *map)
>  {
>  	if (!map->refcount) {
> -		synchronize_rcu();
> +		synchronize_rcu_expedited();
>  		acpi_unmap(map->phys, map->virt);
>  		kfree(map);
>  	}
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi,

On Sun, Nov 09, 2014 at 01:53:37PM +0400, Konstantin Khlebnikov wrote:
> ACPI maintains cache of ioremap regions to speed up operations and
> access to them from irq context where ioremap() calls aren't allowed.
> This code abuses synchronize_rcu() on unmap path for synchronization
> with fast-path in acpi_os_read/write_memory which uses this cache.
>
> Since v3.10 CPUs are allowed to enter idle state even if they have RCU
> callbacks queued, see commit c0f4dfd4f90f1667d234d21f15153ea09a2eaa66
> ("rcu: Make RCU_FAST_NO_HZ take advantage of numbered callbacks").
> That change caused problems with nvidia proprietary driver which calls
> acpi_os_map/unmap_generic_address several times during initialization.
> Each unmap calls synchronize_rcu and adds significant delay. Totally
> initialization is slowed for a couple of seconds and that is enough to
> trigger timeout in hardware, gpu decides to "fell off the bus". Widely
> spread workaround is reducing "rcu_idle_gp_delay" from 4 to 1 jiffy.
>
> This patch replaces synchronize_rcu() with synchronize_rcu_expedited()
> which is much faster.
>
> Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
> Reported-and-tested-by: Alexander Monakov <amonakov@gmail.com>
> Cc: Tom Boshoven <tomboshoven@gmail.com>
> Link: https://devtalk.nvidia.com/default/topic/567297/linux/linux-3-10-driver-crash/

Please feel free to add Tested-by:

Tested-by: Lee, Chun-Yi <jlee@suse.com>

This patch fixed the performance issue on VMware Workstation 10.0.2 with
a virtual machine that has more than 2 CPUs and 4G of memory:

  VMware Workstation 10.0.2
  BIOS DMI: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/31/2013
  vCPU = 8
  vMEM = 4G
  mem.hotplug=TRUE
  physical CPUs on host machine: Intel(R) Xeon(R) CPU X5670 @ 2.93GHz * 24

I tested this patch with the v3.12, v3.17 and v3.18-rc4 mainline kernels.
All of those kernels can reproduce the issue, and all of them got a
speedup during ACPI initialization.

I suggest this patch go to the stable kernels as a fix.

Thanks a lot!
Joey Lee

> ---
>  drivers/acpi/osl.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
> index 9964f70..217713c 100644
> --- a/drivers/acpi/osl.c
> +++ b/drivers/acpi/osl.c
> @@ -436,7 +436,7 @@ static void acpi_os_drop_map_ref(struct acpi_ioremap *map)
>  static void acpi_os_map_cleanup(struct acpi_ioremap *map)
>  {
>  	if (!map->refcount) {
> -		synchronize_rcu();
> +		synchronize_rcu_expedited();
>  		acpi_unmap(map->phys, map->virt);
>  		kfree(map);
>  	}
On Sunday, November 09, 2014 02:00:38 PM Paul E. McKenney wrote:
> On Sun, Nov 09, 2014 at 01:53:37PM +0400, Konstantin Khlebnikov wrote:
> > ACPI maintains cache of ioremap regions to speed up operations and
> > access to them from irq context where ioremap() calls aren't allowed.
> > This code abuses synchronize_rcu() on unmap path for synchronization
> > with fast-path in acpi_os_read/write_memory which uses this cache.
> >
> > Since v3.10 CPUs are allowed to enter idle state even if they have RCU
> > callbacks queued, see commit c0f4dfd4f90f1667d234d21f15153ea09a2eaa66
> > ("rcu: Make RCU_FAST_NO_HZ take advantage of numbered callbacks").
> > That change caused problems with nvidia proprietary driver which calls
> > acpi_os_map/unmap_generic_address several times during initialization.
> > Each unmap calls synchronize_rcu and adds significant delay. Totally
> > initialization is slowed for a couple of seconds and that is enough to
> > trigger timeout in hardware, gpu decides to "fell off the bus". Widely
> > spread workaround is reducing "rcu_idle_gp_delay" from 4 to 1 jiffy.
> >
> > This patch replaces synchronize_rcu() with synchronize_rcu_expedited()
> > which is much faster.
> >
> > Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
> > Reported-and-tested-by: Alexander Monakov <amonakov@gmail.com>
> > Cc: Tom Boshoven <tomboshoven@gmail.com>
> > Link: https://devtalk.nvidia.com/default/topic/567297/linux/linux-3-10-driver-crash/
>
> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

Patch queued up for 3.19, thanks!

>
> > ---
> >  drivers/acpi/osl.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
> > index 9964f70..217713c 100644
> > --- a/drivers/acpi/osl.c
> > +++ b/drivers/acpi/osl.c
> > @@ -436,7 +436,7 @@ static void acpi_os_drop_map_ref(struct acpi_ioremap *map)
> >  static void acpi_os_map_cleanup(struct acpi_ioremap *map)
> >  {
> >  	if (!map->refcount) {
> > -		synchronize_rcu();
> > +		synchronize_rcu_expedited();
> >  		acpi_unmap(map->phys, map->virt);
> >  		kfree(map);
> >  	}
diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
index 9964f70..217713c 100644
--- a/drivers/acpi/osl.c
+++ b/drivers/acpi/osl.c
@@ -436,7 +436,7 @@ static void acpi_os_drop_map_ref(struct acpi_ioremap *map)
 static void acpi_os_map_cleanup(struct acpi_ioremap *map)
 {
 	if (!map->refcount) {
-		synchronize_rcu();
+		synchronize_rcu_expedited();
 		acpi_unmap(map->phys, map->virt);
 		kfree(map);
 	}
ACPI maintains a cache of ioremap regions to speed up operations and to
allow access to them from irq context, where ioremap() calls aren't allowed.
This code abuses synchronize_rcu() on the unmap path for synchronization
with the fast path in acpi_os_read/write_memory, which uses this cache.

Since v3.10, CPUs are allowed to enter the idle state even if they have RCU
callbacks queued; see commit c0f4dfd4f90f1667d234d21f15153ea09a2eaa66
("rcu: Make RCU_FAST_NO_HZ take advantage of numbered callbacks").
That change caused problems with the nvidia proprietary driver, which calls
acpi_os_map/unmap_generic_address several times during initialization.
Each unmap calls synchronize_rcu() and adds a significant delay. In total,
initialization is slowed by a couple of seconds, and that is enough to
trigger a timeout in hardware: the GPU "falls off the bus". A widely
used workaround is reducing "rcu_idle_gp_delay" from 4 to 1 jiffy.

This patch replaces synchronize_rcu() with synchronize_rcu_expedited(),
which is much faster.

Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Reported-and-tested-by: Alexander Monakov <amonakov@gmail.com>
Cc: Tom Boshoven <tomboshoven@gmail.com>
Link: https://devtalk.nvidia.com/default/topic/567297/linux/linux-3-10-driver-crash/
---
 drivers/acpi/osl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
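
For readers less familiar with the pattern the commit message describes, the following is a minimal userspace sketch in C of a read-mostly mapping cache whose unmap path waits for a grace period before tearing the mapping down. It is not the code in drivers/acpi/osl.c: the names ioremap_entry, cache_map(), cache_read_byte() and cache_unmap() are hypothetical, the list is a plain singly linked list, calloc()/free() stand in for mapping and unmapping, and the three RCU functions are no-op stand-ins so that the ordering can be followed in a single thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Userspace stand-ins for the kernel RCU primitives. These are
 * assumptions for this sketch only: they do no real synchronization,
 * the "wait" merely counts how many grace periods were requested.
 */
static int grace_periods_waited;
static void rcu_read_lock(void) { }
static void rcu_read_unlock(void) { }
static void synchronize_rcu_expedited(void) { grace_periods_waited++; }

/* Simplified, hypothetical cache entry (not the kernel's struct). */
struct ioremap_entry {
	struct ioremap_entry *next;
	unsigned long phys;	/* start of the physical range */
	unsigned long size;	/* length of the range in bytes */
	unsigned char *virt;	/* stand-in for the mapped address */
	unsigned long refcount;
};

static struct ioremap_entry *cache_head;

/* Slow path: create an entry; calloc() stands in for the real mapping. */
static struct ioremap_entry *cache_map(unsigned long phys, unsigned long size)
{
	struct ioremap_entry *e = malloc(sizeof(*e));

	if (!e)
		return NULL;
	e->virt = calloc(1, size);
	if (!e->virt) {
		free(e);
		return NULL;
	}
	e->phys = phys;
	e->size = size;
	e->refcount = 1;
	e->next = cache_head;
	cache_head = e;		/* publish the entry to readers */
	return e;
}

/* Fast path: only reads the cache, and uses the mapping inside the
 * read-side critical section. Returns 0 on success, -1 if unmapped. */
static int cache_read_byte(unsigned long phys, unsigned char *val)
{
	int ret = -1;

	rcu_read_lock();
	for (struct ioremap_entry *e = cache_head; e; e = e->next) {
		if (e->phys <= phys && phys < e->phys + e->size) {
			*val = e->virt[phys - e->phys];
			ret = 0;
			break;
		}
	}
	rcu_read_unlock();
	return ret;
}

/* Unmap path: unlink, wait for readers, then tear down. */
static void cache_unmap(struct ioremap_entry *e)
{
	struct ioremap_entry **pp;

	if (--e->refcount)
		return;
	/* 1. Unlink so that new readers can no longer find the entry. */
	for (pp = &cache_head; *pp; pp = &(*pp)->next) {
		if (*pp == e) {
			*pp = e->next;
			break;
		}
	}
	/* 2. Wait for readers that may still be using the mapping. */
	synchronize_rcu_expedited();
	/* 3. Only now release the mapping and the entry itself. */
	free(e->virt);
	free(e);
}
```

In this sketch every cache_unmap() call that drops the last reference pays for one grace-period wait in step 2. That is the cost the commit message refers to when a driver maps and unmaps several regions during initialization, and it is the step the patch makes cheaper.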