Message ID | 7123b2489cc5d1d5abb7bcf1364ca729cab3e6ca.1406604806.git.luto@amacapital.net (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 07/28, Andy Lutomirski wrote: > > @@ -1449,7 +1449,12 @@ long syscall_trace_enter(struct pt_regs *regs) > { > long ret = 0; > > - user_exit(); > + /* > + * If TIF_NOHZ is set, we are required to call user_exit() before > + * doing anything that could touch RCU. > + */ > + if (test_thread_flag(TIF_NOHZ)) > + user_exit(); Personally I still think this change just adds more confusion, but I leave this to you and Frederic. It is not that "If TIF_NOHZ is set, we are required to call user_exit()", we need to call user_exit() just because we enter the kernel. TIF_NOHZ is just the implementation detail which triggers this slow path. At least it should be correct, unless I am confused even more than I think. Oleg.
On Tue, Jul 29, 2014 at 09:32:32PM +0200, Oleg Nesterov wrote: > On 07/28, Andy Lutomirski wrote: > > > > @@ -1449,7 +1449,12 @@ long syscall_trace_enter(struct pt_regs *regs) > > { > > long ret = 0; > > > > - user_exit(); > > + /* > > + * If TIF_NOHZ is set, we are required to call user_exit() before > > + * doing anything that could touch RCU. > > + */ > > + if (test_thread_flag(TIF_NOHZ)) > > + user_exit(); > > Personally I still think this change just adds more confusion, but I leave > this to you and Frederic. > > It is not that "If TIF_NOHZ is set, we are required to call user_exit()", we > need to call user_exit() just because we enter the kernel. TIF_NOHZ is just > the implementation detail which triggers this slow path. > > At least it should be correct, unless I am confused even more than I think. Agreed, Perhaps the confusion is on the syscall_trace_enter() name which suggests this is only about tracing? syscall_slowpath_enter() could be an alternative. But that's still tracing in a general sense so...
On Wed, Jul 30, 2014 at 9:43 AM, Frederic Weisbecker <fweisbec@gmail.com> wrote: > On Tue, Jul 29, 2014 at 09:32:32PM +0200, Oleg Nesterov wrote: >> On 07/28, Andy Lutomirski wrote: >> > >> > @@ -1449,7 +1449,12 @@ long syscall_trace_enter(struct pt_regs *regs) >> > { >> > long ret = 0; >> > >> > - user_exit(); >> > + /* >> > + * If TIF_NOHZ is set, we are required to call user_exit() before >> > + * doing anything that could touch RCU. >> > + */ >> > + if (test_thread_flag(TIF_NOHZ)) >> > + user_exit(); >> >> Personally I still think this change just adds more confusion, but I leave >> this to you and Frederic. >> >> It is not that "If TIF_NOHZ is set, we are required to call user_exit()", we >> need to call user_exit() just because we enter the kernel. TIF_NOHZ is just >> the implementation detail which triggers this slow path. >> >> At least it should be correct, unless I am confused even more than I think. > > Agreed, Perhaps the confusion is on the syscall_trace_enter() name which suggests > this is only about tracing? syscall_slowpath_enter() could be an alternative. > But that's still tracing in a general sense so... At the end of the day, the syscall slowpath code calls a bunch of functions depending on what TIF_XYZ flags are set. As long as it's structured like "if (TIF_A) do_a(); if (TIF_B) do_b();" or something like that, it's comprehensible. But once random functions with no explicit flag checks or comments start showing up, it gets confusing. If it's indeed all-or-nothing, I could remove the check and add a comment. But please keep in mind that, currently, the slow path is *slow*, and my patches only improve the entry case. So enabling context tracking on every task will hurt. --Andy
On Wed, Jul 30, 2014 at 10:23:34AM -0700, Andy Lutomirski wrote: > On Wed, Jul 30, 2014 at 9:43 AM, Frederic Weisbecker <fweisbec@gmail.com> wrote: > > On Tue, Jul 29, 2014 at 09:32:32PM +0200, Oleg Nesterov wrote: > >> On 07/28, Andy Lutomirski wrote: > >> > > >> > @@ -1449,7 +1449,12 @@ long syscall_trace_enter(struct pt_regs *regs) > >> > { > >> > long ret = 0; > >> > > >> > - user_exit(); > >> > + /* > >> > + * If TIF_NOHZ is set, we are required to call user_exit() before > >> > + * doing anything that could touch RCU. > >> > + */ > >> > + if (test_thread_flag(TIF_NOHZ)) > >> > + user_exit(); > >> > >> Personally I still think this change just adds more confusion, but I leave > >> this to you and Frederic. > >> > >> It is not that "If TIF_NOHZ is set, we are required to call user_exit()", we > >> need to call user_exit() just because we enter the kernel. TIF_NOHZ is just > >> the implementation detail which triggers this slow path. > >> > >> At least it should be correct, unless I am confused even more than I think. > > > > Agreed, Perhaps the confusion is on the syscall_trace_enter() name which suggests > > this is only about tracing? syscall_slowpath_enter() could be an alternative. > > But that's still tracing in a general sense so... > > At the end of the day, the syscall slowpath code calls a bunch of > functions depending on what TIF_XYZ flags are set. As long as it's > structured like "if (TIF_A) do_a(); if (TIF_B) do_b();" or something > like that, it's comprehensible. But once random functions with no > explicit flag checks or comments start showing up, it gets confusing. Yeah that's a point. I don't mind much the TIF_NOHZ test if you like. > > If it's indeed all-or-nothing, I could remove the check and add a > comment. But please keep in mind that, currently, the slow path is > *slow*, and my patches only improve the entry case. So enabling > context tracking on every task will hurt. That's what we do anyway. I haven't found a safe way to enabled context tracking without tracking all CPUs. > > --Andy
On 07/31, Frederic Weisbecker wrote: > > On Wed, Jul 30, 2014 at 10:23:34AM -0700, Andy Lutomirski wrote: > > > > At the end of the day, the syscall slowpath code calls a bunch of > > functions depending on what TIF_XYZ flags are set. As long as it's > > structured like "if (TIF_A) do_a(); if (TIF_B) do_b();" or something > > like that, it's comprehensible. But once random functions with no > > explicit flag checks or comments start showing up, it gets confusing. > > Yeah that's a point. I don't mind much the TIF_NOHZ test if you like. And in my opinion if (work & TIF_XYZ) user_exit(); looks even more confusing. Because, once again, TIF_XYZ is not the reason to call user_exit(). Not to mention this adds a minor performance penalty. > > If it's indeed all-or-nothing, I could remove the check and add a > > comment. But please keep in mind that, currently, the slow path is > > *slow*, and my patches only improve the entry case. So enabling > > context tracking on every task will hurt. > > That's what we do anyway. I haven't found a safe way to enabled context tracking > without tracking all CPUs. And if we change this, then the code above becomes racy. The state of TIF_XYZ can be changed right after the check. OK, it is racy anyway ;) but still this adds more confusion. I feel that TIF_XYZ must die. But yes, yes, I know that it is very simple to say this. And no, so far I do not know how we can improve this all. But again, again, I won't insist. Just another "can't resist" email, please ignore. Oleg.
On Thu, Jul 31, 2014 at 06:42:46PM +0200, Oleg Nesterov wrote: > On 07/31, Frederic Weisbecker wrote: > > > > On Wed, Jul 30, 2014 at 10:23:34AM -0700, Andy Lutomirski wrote: > > > > > > At the end of the day, the syscall slowpath code calls a bunch of > > > functions depending on what TIF_XYZ flags are set. As long as it's > > > structured like "if (TIF_A) do_a(); if (TIF_B) do_b();" or something > > > like that, it's comprehensible. But once random functions with no > > > explicit flag checks or comments start showing up, it gets confusing. > > > > Yeah that's a point. I don't mind much the TIF_NOHZ test if you like. > > And in my opinion > > if (work & TIF_XYZ) > user_exit(); > > looks even more confusing. Because, once again, TIF_XYZ is not the > reason to call user_exit(). > > Not to mention this adds a minor performance penalty. That's a point too! You guys both convinced me! ;-) > > > > If it's indeed all-or-nothing, I could remove the check and add a > > > comment. But please keep in mind that, currently, the slow path is > > > *slow*, and my patches only improve the entry case. So enabling > > > context tracking on every task will hurt. > > > > That's what we do anyway. I haven't found a safe way to enabled context tracking > > without tracking all CPUs. > > And if we change this, then the code above becomes racy. The state of > TIF_XYZ can be changed right after the check. OK, it is racy anyway ;) > but still this adds more confusion. No because all running tasks have this flag set when context tracking is enabled. And context tracking can't be disabled on runtime.
On 07/31, Frederic Weisbecker wrote: > > On Thu, Jul 31, 2014 at 06:42:46PM +0200, Oleg Nesterov wrote: > > On 07/31, Frederic Weisbecker wrote: > > > > > > On Wed, Jul 30, 2014 at 10:23:34AM -0700, Andy Lutomirski wrote: > > > > > > > > At the end of the day, the syscall slowpath code calls a bunch of > > > > functions depending on what TIF_XYZ flags are set. As long as it's > > > > structured like "if (TIF_A) do_a(); if (TIF_B) do_b();" or something > > > > like that, it's comprehensible. But once random functions with no > > > > explicit flag checks or comments start showing up, it gets confusing. > > > > > > Yeah that's a point. I don't mind much the TIF_NOHZ test if you like. > > > > And in my opinion > > > > if (work & TIF_XYZ) > > user_exit(); > > > > looks even more confusing. Because, once again, TIF_XYZ is not the > > reason to call user_exit(). > > > > Not to mention this adds a minor performance penalty. > > That's a point too! You guys both convinced me! ;-) Very nice, now I know that you can agree with 2 opposite opinions at the same time ;) > > > > If it's indeed all-or-nothing, I could remove the check and add a > > > > comment. But please keep in mind that, currently, the slow path is > > > > *slow*, and my patches only improve the entry case. So enabling > > > > context tracking on every task will hurt. > > > > > > That's what we do anyway. I haven't found a safe way to enabled context tracking > > > without tracking all CPUs. > > > > And if we change this, then the code above becomes racy. The state of > > TIF_XYZ can be changed right after the check. OK, it is racy anyway ;) > > but still this adds more confusion. > > No because all running tasks have this flag set when context tracking is > enabled. And context tracking can't be disabled on runtime. Yes, yes, please note that I said "if we change this". Oleg.
On 07/31, Oleg Nesterov wrote: > > On 07/31, Frederic Weisbecker wrote: > > > > On Thu, Jul 31, 2014 at 06:42:46PM +0200, Oleg Nesterov wrote: > > > > > > And if we change this, then the code above becomes racy. The state of > > > TIF_XYZ can be changed right after the check. OK, it is racy anyway ;) > > > but still this adds more confusion. > > > > No because all running tasks have this flag set when context tracking is > > enabled. And context tracking can't be disabled on runtime. > > Yes, yes, please note that I said "if we change this". And "it is racy anyway" connects to the problems we discuss in another thread. Oleg.
On Thu, Jul 31, 2014 at 06:54:23PM +0200, Oleg Nesterov wrote: > On 07/31, Frederic Weisbecker wrote: > > > > On Thu, Jul 31, 2014 at 06:42:46PM +0200, Oleg Nesterov wrote: > > > On 07/31, Frederic Weisbecker wrote: > > > > > > > > On Wed, Jul 30, 2014 at 10:23:34AM -0700, Andy Lutomirski wrote: > > > > > > > > > > At the end of the day, the syscall slowpath code calls a bunch of > > > > > functions depending on what TIF_XYZ flags are set. As long as it's > > > > > structured like "if (TIF_A) do_a(); if (TIF_B) do_b();" or something > > > > > like that, it's comprehensible. But once random functions with no > > > > > explicit flag checks or comments start showing up, it gets confusing. > > > > > > > > Yeah that's a point. I don't mind much the TIF_NOHZ test if you like. > > > > > > And in my opinion > > > > > > if (work & TIF_XYZ) > > > user_exit(); > > > > > > looks even more confusing. Because, once again, TIF_XYZ is not the > > > reason to call user_exit(). > > > > > > Not to mention this adds a minor performance penalty. > > > > That's a point too! You guys both convinced me! ;-) > > Very nice, now I know that you can agree with 2 opposite opinions at > the same time ;) That's what an Acked-by from Schroedinger would look like! > > > > > > If it's indeed all-or-nothing, I could remove the check and add a > > > > > comment. But please keep in mind that, currently, the slow path is > > > > > *slow*, and my patches only improve the entry case. So enabling > > > > > context tracking on every task will hurt. > > > > > > > > That's what we do anyway. I haven't found a safe way to enabled context tracking > > > > without tracking all CPUs. > > > > > > And if we change this, then the code above becomes racy. The state of > > > TIF_XYZ can be changed right after the check. OK, it is racy anyway ;) > > > but still this adds more confusion. > > > > No because all running tasks have this flag set when context tracking is > > enabled. And context tracking can't be disabled on runtime. > > Yes, yes, please note that I said "if we change this". Yeah but the NO_HZ test wouldn't change much the situation I think.
diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c index 39296d2..bbf338a 100644 --- a/arch/x86/kernel/ptrace.c +++ b/arch/x86/kernel/ptrace.c @@ -1449,7 +1449,12 @@ long syscall_trace_enter(struct pt_regs *regs) { long ret = 0; - user_exit(); + /* + * If TIF_NOHZ is set, we are required to call user_exit() before + * doing anything that could touch RCU. + */ + if (test_thread_flag(TIF_NOHZ)) + user_exit(); /* * If we stepped into a sysenter/syscall insn, it trapped in
The RCU context tracking code requires that arch code call user_exit() on any entry into kernel code if TIF_NOHZ is set. This patch adds a check for TIF_NOHZ and a comment to the syscall entry tracing code. The main purpose of this patch is to make the code easier to follow: one can read the body of user_exit and of every function it calls without finding any explanation of why it's called for traced syscalls but not for untraced syscalls. This makes it clear when user_exit() is necessary. Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Andy Lutomirski <luto@amacapital.net> --- arch/x86/kernel/ptrace.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)