Message ID | 20151203172708.GT8644@n2100.arm.linux.org.uk (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Thu, 3 Dec 2015, Russell King - ARM Linux wrote: > On Thu, Dec 03, 2015 at 04:41:18PM +0000, Russell King - ARM Linux wrote: > > On Thu, Dec 03, 2015 at 04:12:06PM +0000, Peter Rosin wrote: > > > * uaccess_with_memcpy.c:__copy_to_user() has a mode in which it copies > > > "non-atomically" (if faulthandler_disabled() returns 0). If a fault > > > happens during __copy_to_user, what prevents some other thread from > > > clobbering DACR? > > > > See the second point above. Moreover, if we sleep in down_read(), > > then __switch_to() reads the current DACR value and saves it in the > > thread information, and will restore that value when resuming the > > thread - even if the thread has been migrated to a different CPU. > > I thought this was correct, but it isn't - that's what my original solution > did, but I think when Will reviewed it, we decided it wasn't necessary - > and it isn't necessary for every single case with the exception of this > one. This is exactly what's going wrong: the down_read() in these paths > calls into the scheduler, which switches away. When we come back, the > DACR value is reset by the other thread to 0x51. > > There's a few ways to solve this: > > 1. Make the thread switching code save and restore the DACR register as > it would do for domains. This imposes an overhead on every single > context switch whether or not we happen to be in this _single_ > troublesome code. (Patch attached - as there's several, I'm attaching > them.) > > 2. Add additional code to the uaccess-with-memcpy stuff to reset the > DACR value prior to using memcpy() or memset(). (Patch attached.) > > 3. Make uaccess-with-memcpy depend on !CPU_SW_DOMAINS_PAN (suggested by > Will) > > 4. Delete the uaccess-with-memcpy code (also suggested by Will.) > > I think the best thing I can do is say... "Discuss amongst yourselves" :) Personally, I'd advocate for #2 or #4. Prior commit 0f64b247e6 I was already leaning towards #4. So if some people are still relying on uaccess-with-memcpy and #2 fixes it then it's all good. I'd suggest surrounding the DACR accesses with #ifdef CONFIG_CPU_SW_DOMAIN_PAN in the final patch. Nicolas
Russell King wrote: > On Thu, Dec 03, 2015 at 04:41:18PM +0000, Russell King - ARM Linux wrote: > > On Thu, Dec 03, 2015 at 04:12:06PM +0000, Peter Rosin wrote: > > > * uaccess_with_memcpy.c:__copy_to_user() has a mode in which it copies > > > "non-atomically" (if faulthandler_disabled() returns 0). If a fault > > > happens during __copy_to_user, what prevents some other thread from > > > clobbering DACR? > > > > See the second point above. Moreover, if we sleep in down_read(), > > then __switch_to() reads the current DACR value and saves it in the > > thread information, and will restore that value when resuming the > > thread - even if the thread has been migrated to a different CPU. > > I thought this was correct, but it isn't - that's what my original solution > did, but I think when Will reviewed it, we decided it wasn't necessary - > and it isn't necessary for every single case with the exception of this > one. This is exactly what's going wrong: the down_read() in these paths > calls into the scheduler, which switches away. When we come back, the > DACR value is reset by the other thread to 0x51. > > There's a few ways to solve this: > > 1. Make the thread switching code save and restore the DACR register as > it would do for domains. This imposes an overhead on every single > context switch whether or not we happen to be in this _single_ > troublesome code. (Patch attached - as there's several, I'm attaching > them.) > > 2. Add additional code to the uaccess-with-memcpy stuff to reset the > DACR value prior to using memcpy() or memset(). (Patch attached.) I took both patches for a quick spin (a dozen boots and one hour uptime after that for each patch) and no incidents. I have not gathered data, but the crash on boot feels like it's quite a bit above 50% when there is a problem so this feels good (I used 5 clean reboots when I bisected and that worked). Reported-by: Peter Rosin <peda@axentia.se> Tested-by: Peter Rosin <peda@axentia.se> (and please don't forget to cc stable) > 3. Make uaccess-with-memcpy depend on !CPU_SW_DOMAINS_PAN (suggested by > Will) > > 4. Delete the uaccess-with-memcpy code (also suggested by Will.) > > I think the best thing I can do is say... "Discuss amongst yourselves" :) I have no personal preference on what should be done, I only had copy_to_user_memcpy enabled since that was what Atmel fed me. Cheers, Peter
On Thu, Dec 03, 2015 at 01:28:12PM -0500, Nicolas Pitre wrote: > On Thu, 3 Dec 2015, Russell King - ARM Linux wrote: > > > On Thu, Dec 03, 2015 at 04:41:18PM +0000, Russell King - ARM Linux wrote: > > > On Thu, Dec 03, 2015 at 04:12:06PM +0000, Peter Rosin wrote: > > > > * uaccess_with_memcpy.c:__copy_to_user() has a mode in which it copies > > > > "non-atomically" (if faulthandler_disabled() returns 0). If a fault > > > > happens during __copy_to_user, what prevents some other thread from > > > > clobbering DACR? > > > > > > See the second point above. Moreover, if we sleep in down_read(), > > > then __switch_to() reads the current DACR value and saves it in the > > > thread information, and will restore that value when resuming the > > > thread - even if the thread has been migrated to a different CPU. > > > > I thought this was correct, but it isn't - that's what my original solution > > did, but I think when Will reviewed it, we decided it wasn't necessary - > > and it isn't necessary for every single case with the exception of this > > one. This is exactly what's going wrong: the down_read() in these paths > > calls into the scheduler, which switches away. When we come back, the > > DACR value is reset by the other thread to 0x51. > > > > There's a few ways to solve this: > > > > 1. Make the thread switching code save and restore the DACR register as > > it would do for domains. This imposes an overhead on every single > > context switch whether or not we happen to be in this _single_ > > troublesome code. (Patch attached - as there's several, I'm attaching > > them.) > > > > 2. Add additional code to the uaccess-with-memcpy stuff to reset the > > DACR value prior to using memcpy() or memset(). (Patch attached.) > > > > 3. Make uaccess-with-memcpy depend on !CPU_SW_DOMAINS_PAN (suggested by > > Will) > > > > 4. Delete the uaccess-with-memcpy code (also suggested by Will.) > > > > I think the best thing I can do is say... "Discuss amongst yourselves" :) > > Personally, I'd advocate for #2 or #4. Prior commit 0f64b247e6 I was > already leaning towards #4. > > So if some people are still relying on uaccess-with-memcpy and #2 fixes > it then it's all good. I'd suggest surrounding the DACR accesses with > #ifdef CONFIG_CPU_SW_DOMAIN_PAN in the final patch. Last time this was discussed, people were unhappy about removing it as they were seeing performance advantages with it enabled. Of course, that was with it being buggy. I think at this point, short of deleting it, I'd opt for (2) so the overhead is attributed to the appropriate place, and not spread across the entire system. That should then be the prelude for another round of performance evaluation to see whether it is still advantageous to have the uaccess-with-memcpy code before making a final decision whether to revert (2) and apply (3) instead, or to go with (4).
diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S index 3ce377f7251f..ae8a3ad763d9 100644 --- a/arch/arm/kernel/entry-armv.S +++ b/arch/arm/kernel/entry-armv.S @@ -782,7 +782,7 @@ ENTRY(__switch_to) THUMB( str lr, [ip], #4 ) ldr r4, [r2, #TI_TP_VALUE] ldr r5, [r2, #TI_TP_VALUE + 4] -#ifdef CONFIG_CPU_USE_DOMAINS +#if defined(CONFIG_CPU_USE_DOMAINS) || defined(CONFIG_CPU_SW_DOMAIN_PAN) mrc p15, 0, r6, c3, c0, 0 @ Get domain register str r6, [r1, #TI_CPU_DOMAIN] @ Save old domain register ldr r6, [r2, #TI_CPU_DOMAIN] @@ -793,7 +793,7 @@ ENTRY(__switch_to) ldr r8, =__stack_chk_guard ldr r7, [r7, #TSK_STACK_CANARY] #endif -#ifdef CONFIG_CPU_USE_DOMAINS +#if defined(CONFIG_CPU_USE_DOMAINS) || defined(CONFIG_CPU_SW_DOMAIN_PAN) mcr p15, 0, r6, c3, c0, 0 @ Set domain register #endif mov r5, r0 diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c index 4adfb46e3ee9..9d80eb20488f 100644 --- a/arch/arm/kernel/process.c +++ b/arch/arm/kernel/process.c @@ -229,7 +229,7 @@ copy_thread(unsigned long clone_flags, unsigned long stack_start, memset(&thread->cpu_context, 0, sizeof(struct cpu_context_save)); -#ifdef CONFIG_CPU_USE_DOMAINS +#if defined(CONFIG_CPU_USE_DOMAINS) || defined(CONFIG_CPU_SW_DOMAIN_PAN) /* * Copy the initial value of the domain access control register * from the current thread: thread->addr_limit will have been