Message ID | 20120712150623.06b2f71e.akpm@linux-foundation.org (mailing list archive) |
---|---|
State | Accepted |
Commit | f19b9f74b7ea3b21ddcee55d852a6488239608a4 |
2012/7/13 Andrew Morton <akpm@linux-foundation.org>:
> On Thu, 12 Jul 2012 20:04:53 +0900
> Akinobu Mita <akinobu.mita@gmail.com> wrote:
>
>> The function dup_task() may fail at the following function calls in
>> the following order.
>>
>> 0) alloc_task_struct_node()
>> 1) alloc_thread_info_node()
>> 2) arch_dup_task_struct()
>>
>> Error by 0) is not a matter, it can just return. But error by 1)
>> requires releasing task_struct allocated by 0) before it returns.
>> Likewise, error by 2) requires releasing task_struct and thread_info
>> allocated by 0) and 1).
>>
>> The existing error handling calls free_task_struct() and
>> free_thread_info() which do not only release task_struct and
>> thread_info, but also call architecture specific
>> arch_release_task_struct() and arch_release_thread_info().
>>
>> The problem is that task_struct and thread_info are not fully
>> initialized yet at this point, but arch_release_task_struct() and
>> arch_release_thread_info() are called with them.
>>
>> For example, x86 defines its own arch_release_task_struct() that
>> releases a task_xstate. If alloc_thread_info_node() fails in
>> dup_task(), arch_release_task_struct() is called with task_struct
>> which is just allocated and filled with garbage in this error handling.
>>
>> This actually happened with tools/testing/fault-injection/failcmd.sh
>>
>> # env FAILCMD_TYPE=fail_page_alloc \
>> ./tools/testing/fault-injection/failcmd.sh --times=100 \
>> --min-order=0 --ignore-gfp-wait=0 \
>> -- make -C tools/testing/selftests/ run_tests
>>
>> In order to fix this issue, make free_{task_struct,thread_info}() not
>> to call arch_release_{task_struct,thread_info}() and call
>> arch_release_{task_struct,thread_info}() implicitly where needed.
>>
>> Default arch_release_task_struct() and arch_release_thread_info() are
>> defined as empty by default. So this change only affects the
>> architectures which implement their own arch_release_task_struct() or
>> arch_release_thread_info() as listed below.
>
> This conflicts with Salman's fix (below) which is in linux-next via
> Ingo's tree.
>
> It appears that we should drop Salman's patch altogether and use yours?

Yes. Salman's patch fixes error handling for x86, sh, and mn10300.
But it doesn't fix for tile. If tile's arch_release_thread_info() is
called after setup_thread_stack(tsk, orig), it may release original
task's step_state. (tsk->step_state will be cleared by tile's
copy_thread() after dup_task_struct()).
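The separation described in the quoted patch description can be sketched in plain C, outside the kernel. This is only an illustration of the idea, not the code from the commit: `struct task`, `struct tinfo`, `dup_task_sketch()`, `release_task_sketch()` and the `fail_at` parameter are hypothetical stand-ins, and `malloc()` replaces the kernel allocators. The point it tries to show is that the error path frees raw memory only, while the arch hook runs solely on a fully set-up object.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for task_struct and thread_info. */
struct task {
	void *xstate;	/* arch-owned state, valid only after step 2 */
	void *stack;
};

struct tinfo {
	int unused;
};

/* Counts arch hook invocations so callers can observe them. */
int arch_release_calls;

/* Arch hook: must only ever see a fully initialized task. */
void arch_release_task(struct task *t)
{
	arch_release_calls++;
	free(t->xstate);
	t->xstate = NULL;
}

/*
 * fail_at selects the failing step (0, 1 or 2, as numbered in the
 * patch description); any other value means every step succeeds.
 */
struct task *dup_task_sketch(const struct task *orig, int fail_at)
{
	struct task *t;
	struct tinfo *ti;

	t = (fail_at == 0) ? NULL : malloc(sizeof(*t));
	if (!t)
		return NULL;		/* step 0 failed: nothing to undo */

	ti = (fail_at == 1) ? NULL : malloc(sizeof(*ti));
	if (!ti)
		goto out_free_task;	/* step 1 failed: undo step 0 */

	if (fail_at == 2)
		goto out_free_ti;	/* step 2 failed: undo steps 1 and 0 */

	*t = *orig;
	t->xstate = malloc(16);
	if (!t->xstate)
		goto out_free_ti;
	t->stack = ti;
	return t;

out_free_ti:
	free(ti);			/* raw free, no arch hook */
out_free_task:
	free(t);			/* raw free, no arch hook */
	return NULL;
}

/* Full teardown of a live task: arch hook first, then raw frees. */
void release_task_sketch(struct task *t)
{
	assert(t != NULL);
	arch_release_task(t);
	free(t->stack);
	free(t);
}
```

A caller would pair a successful `dup_task_sketch()` with `release_task_sketch()`. In the three failure cases the hook is never reached, so it cannot act on the uninitialized contents of a freshly allocated object.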
On 7/13/2012 6:07 AM, Akinobu Mita wrote:
> 2012/7/13 Andrew Morton <akpm@linux-foundation.org>:
>> This conflicts with Salman's fix (below) which is in linux-next via
>> Ingo's tree.
>>
>> It appears that we should drop Salman's patch altogether and use yours?
> Yes. Salman's patch fixes error handling for x86, sh, and mn10300.
> But it doesn't fix for tile. If tile's arch_release_thread_info() is
> called after setup_thread_stack(tsk, orig), it may release original
> task's step_state. (tsk->step_state will be cleared by tile's
> copy_thread() after dup_task_struct()).

Yes, I think Akinobu's patch is better. Thanks.
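The tile hazard mentioned in this exchange is an aliasing problem, and a small user-space sketch may make it easier to see. It assumes that setup_thread_stack() amounts to a structure copy of the parent's thread_info, so the child briefly holds the parent's step_state pointer. All names here (`struct tinfo_sketch`, `setup_copy()`, `clear_inherited()`, `arch_release()`, `shares_step_state()`) are hypothetical and do not come from the tile code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical thread_info with one owned pointer, like step_state. */
struct tinfo_sketch {
	int *step_state;
};

/* Models the structure copy: the child now aliases the parent's pointer. */
void setup_copy(struct tinfo_sketch *child, const struct tinfo_sketch *parent)
{
	*child = *parent;
}

/* Models the later step in which the child drops the inherited pointer. */
void clear_inherited(struct tinfo_sketch *child)
{
	child->step_state = NULL;
}

/* Arch release hook: frees whatever step_state points to. */
void arch_release(struct tinfo_sketch *ti)
{
	free(ti->step_state);
	ti->step_state = NULL;
}

/* True while releasing the child would free the parent's state. */
bool shares_step_state(const struct tinfo_sketch *child,
		       const struct tinfo_sketch *parent)
{
	return child->step_state != NULL &&
	       child->step_state == parent->step_state;
}
```

Between `setup_copy()` and `clear_inherited()`, calling `arch_release()` on the child would free memory the parent still uses. That window is why an error path running inside it should avoid the arch hook altogether.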
diff --git a/kernel/fork.c b/kernel/fork.c
index ab5211b..f00e319 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -304,12 +304,17 @@ static struct task_struct *dup_task_struct(struct task_struct *orig)
 	}
 
 	err = arch_dup_task_struct(tsk, orig);
-	if (err)
-		goto out;
 
+	/*
+	 * We defer looking at err, because we will need this setup
+	 * for the clean up path to work correctly.
+	 */
 	tsk->stack = ti;
-
 	setup_thread_stack(tsk, orig);
+
+	if (err)
+		goto out;
+
 	clear_user_return_notifier(tsk);
 	clear_tsk_need_resched(tsk);
 	stackend = end_of_stack(tsk);