Message ID | 20171115213428.22559-19-samitolvanen@google.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Hi Sami, On Wed, Nov 15, 2017 at 01:34:28PM -0800, Sami Tolvanen wrote: > Allow CONFIG_LTO_CLANG to be enabled for the architecture. > > Signed-off-by: Sami Tolvanen <samitolvanen@google.com> > --- > arch/arm64/Kconfig | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig > index 3a70f763e18a..58504327b9f6 100644 > --- a/arch/arm64/Kconfig > +++ b/arch/arm64/Kconfig > @@ -40,6 +40,7 @@ config ARM64 > select ARCH_INLINE_WRITE_UNLOCK_IRQRESTORE if !PREEMPT > select ARCH_USE_CMPXCHG_LOCKREF > select ARCH_USE_QUEUED_RWLOCKS > + select ARCH_SUPPORTS_LTO_CLANG I'll be honest with you: I'm absolutely terrified about enabling this. How much testing has this seen? The main thing that worries me is that this gives the toolchain a lot more freedom to break dependency ordering with RCU, leading to subtle concurrency issues that would actually break on arm64. Specifically, I'm worried about value analysis that could potentially convert an address dependency into a control dependency. Right now, the C standard isn't on our side here and we're relying on the compiler not doing this kind of thing. Can we continue to rely on that in the face of LTO? Will
On Thu, Nov 16, 2017 at 11:58:11AM +0000, Will Deacon wrote: > I'll be honest with you: I'm absolutely terrified about enabling this. That's understandable, I wouldn't want to enable this by default quite yet either. This patch doesn't enable LTO for arm64, just makes it possible to enable the feature. I'm perfectly fine with marking CONFIG_LTO_CLANG experimental if it makes people more comfortable. > How much testing has this seen? I've been running clang LTO kernels for a few months on a Pixel 2 device without any issues. This is on a 4.4 kernel though. > Right now, the C standard isn't on our side here and we're relying on > the compiler not doing this kind of thing. Can we continue to rely on > that in the face of LTO? I'll have to check with our LLVM experts, but I have not run into these issues with current compiler versions. Looking at Andi's old patches, looks like gcc might be more aggressive in reordering things with LTO than clang. Sami
On Thu, Nov 16, 2017 at 08:17:31AM -0800, Sami Tolvanen wrote: > On Thu, Nov 16, 2017 at 11:58:11AM +0000, Will Deacon wrote: > > I'll be honest with you: I'm absolutely terrified about enabling this. > > That's understandable, I wouldn't want to enable this by default > quite yet either. This patch doesn't enable LTO for arm64, just makes > it possible to enable the feature. I'm perfectly fine with marking > CONFIG_LTO_CLANG experimental if it makes people more comfortable. > > > How much testing has this seen? > > I've been running clang LTO kernels for a few months on a Pixel 2 device > without any issues. This is on a 4.4 kernel though. > > > Right now, the C standard isn't on our side here and we're relying on > > the compiler not doing this kind of thing. Can we continue to rely on > > that in the face of LTO? > > I'll have to check with our LLVM experts, but I have not run into these > issues with current compiler versions. Looking at Andi's old patches, > looks like gcc might be more aggressive in reordering things with LTO > than clang. Ideally we'd get the toolchain people to commit to supporting the kernel memory model along side the C11 one. That would help a ton.
On Thu, Nov 16, 2017 at 8:30 AM, Peter Zijlstra <peterz@infradead.org> wrote: > Ideally we'd get the toolchain people to commit to supporting the kernel > memory model along side the C11 one. That would help a ton. Does anyone from the kernel side participate in the C standardization process?
On Thu, Nov 16, 2017 at 08:50:41AM -0800, Nick Desaulniers wrote: > On Thu, Nov 16, 2017 at 8:30 AM, Peter Zijlstra <peterz@infradead.org> wrote: > > > Ideally we'd get the toolchain people to commit to supporting the kernel > > memory model along side the C11 one. That would help a ton. > > Does anyone from the kernel side participate in the C standardization process? Yes, Paul McKenney and Will Deacon. Doesn't mean these two can still be reconciled though. From what I understand C11 (and onwards) are incompatible with the kernel model on a number of subtle points. Not to mention that there's people in the C11 process that strongly argue for stuff that would break every single multi-threaded program written since the 70s, which would include pretty much all OS kernels.
On Thu, Nov 16, 2017 at 8:59 AM, Peter Zijlstra <peterz@infradead.org> wrote: > On Thu, Nov 16, 2017 at 08:50:41AM -0800, Nick Desaulniers wrote: >> On Thu, Nov 16, 2017 at 8:30 AM, Peter Zijlstra <peterz@infradead.org> wrote: >> >> > Ideally we'd get the toolchain people to commit to supporting the kernel >> > memory model along side the C11 one. That would help a ton. >> >> Does anyone from the kernel side participate in the C standardization process? > > Yes, Paul McKenney and Will Deacon. Doesn't mean these two can still be > reconciled though. From what I understand C11 (and onwards) are > incompatible with the kernel model on a number of subtle points. It would be good to have these incompatibilities written down, then for the sake of argument, they can be cited both for discussions on LKML and in the C standardization process. For example, a running list in Documentation/ or something would make it so that anyone could understand and cite current issues with the latest C standard. I don't understand why we'd block patches for enabling experimental features. We've been running this patch-set on actual devices for months and would love to provide them to the community for further testing. If bugs are found, then there's more evidence to bring to the C standards committee. Otherwise we're shutting down feature development for the sake of potential bugs in a C standard we're not even using.
On Thu, Nov 16, 2017 at 09:16:49AM -0800, Nick Desaulniers wrote: > On Thu, Nov 16, 2017 at 8:59 AM, Peter Zijlstra <peterz@infradead.org> wrote: > > On Thu, Nov 16, 2017 at 08:50:41AM -0800, Nick Desaulniers wrote: > >> On Thu, Nov 16, 2017 at 8:30 AM, Peter Zijlstra <peterz@infradead.org> wrote: > >> > >> > Ideally we'd get the toolchain people to commit to supporting the kernel > >> > memory model along side the C11 one. That would help a ton. > >> > >> Does anyone from the kernel side participate in the C standardization process? > > > > Yes, Paul McKenney and Will Deacon. Doesn't mean these two can still be > > reconciled though. From what I understand C11 (and onwards) are > > incompatible with the kernel model on a number of subtle points. > > It would be good to have these incompatibilities written down, then > for the sake of argument, they can be cited both for discussions on > LKML and in the C standardization process. For example, a running > list in Documentation/ or something would make it so that anyone could > understand and cite current issues with the latest C standard. Will should be able to produce this list; I know he's done before, I just can't find it -- my Google-foo isn't strong today. > I don't understand why we'd block patches for enabling experimental > features. We've been running this patch-set on actual devices for > months and would love to provide them to the community for further > testing. If bugs are found, then there's more evidence to bring to > the C standards committee. Otherwise we're shutting down feature > development for the sake of potential bugs in a C standard we're not > even using. So the problem is that its very very hard (and painful) to find these bugs. Getting the tools people to comment on these specific optimizations would really help lots.
On Thu, Nov 16, 2017 at 06:34:17PM +0100, Peter Zijlstra wrote: > On Thu, Nov 16, 2017 at 09:16:49AM -0800, Nick Desaulniers wrote: > > On Thu, Nov 16, 2017 at 8:59 AM, Peter Zijlstra <peterz@infradead.org> wrote: > > > On Thu, Nov 16, 2017 at 08:50:41AM -0800, Nick Desaulniers wrote: > > >> On Thu, Nov 16, 2017 at 8:30 AM, Peter Zijlstra <peterz@infradead.org> wrote: > > >> > > >> > Ideally we'd get the toolchain people to commit to supporting the kernel > > >> > memory model along side the C11 one. That would help a ton. > > >> > > >> Does anyone from the kernel side participate in the C standardization process? > > > > > > Yes, Paul McKenney and Will Deacon. Doesn't mean these two can still be > > > reconciled though. From what I understand C11 (and onwards) are > > > incompatible with the kernel model on a number of subtle points. > > > > It would be good to have these incompatibilities written down, then > > for the sake of argument, they can be cited both for discussions on > > LKML and in the C standardization process. For example, a running > > list in Documentation/ or something would make it so that anyone could > > understand and cite current issues with the latest C standard. > > Will should be able to produce this list; I know he's done before, I > just can't find it -- my Google-foo isn't strong today. Here you go: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0124r4.html > > I don't understand why we'd block patches for enabling experimental > > features. We've been running this patch-set on actual devices for > > months and would love to provide them to the community for further > > testing. If bugs are found, then there's more evidence to bring to > > the C standards committee. Otherwise we're shutting down feature > > development for the sake of potential bugs in a C standard we're not > > even using. > > So the problem is that its very very hard (and painful) to find these > bugs. Getting the tools people to comment on these specific > optimizations would really help lots. It would be good to get something similar to LKMM into KTSAN, for example. There would probably be a few differences due to efficiency concerns, but closer is better than less close. ;-) Thanx, Paul
On Thu, Nov 16, 2017 at 9:48 AM, Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote: > On Thu, Nov 16, 2017 at 06:34:17PM +0100, Peter Zijlstra wrote: >> On Thu, Nov 16, 2017 at 09:16:49AM -0800, Nick Desaulniers wrote: >> > On Thu, Nov 16, 2017 at 8:59 AM, Peter Zijlstra <peterz@infradead.org> wrote: >> > > On Thu, Nov 16, 2017 at 08:50:41AM -0800, Nick Desaulniers wrote: >> > >> On Thu, Nov 16, 2017 at 8:30 AM, Peter Zijlstra <peterz@infradead.org> wrote: >> > >> >> > >> > Ideally we'd get the toolchain people to commit to supporting the kernel >> > >> > memory model along side the C11 one. That would help a ton. >> > >> >> > >> Does anyone from the kernel side participate in the C standardization process? >> > > >> > > Yes, Paul McKenney and Will Deacon. Doesn't mean these two can still be >> > > reconciled though. From what I understand C11 (and onwards) are >> > > incompatible with the kernel model on a number of subtle points. >> > >> > It would be good to have these incompatibilities written down, then >> > for the sake of argument, they can be cited both for discussions on >> > LKML and in the C standardization process. For example, a running >> > list in Documentation/ or something would make it so that anyone could >> > understand and cite current issues with the latest C standard. >> >> Will should be able to produce this list; I know he's done before, I >> just can't find it -- my Google-foo isn't strong today. > > Here you go: > > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0124r4.html Great, thanks! Will take some time to digest, but happy to refer others to this important work in the future. I wonder if we have anything like a case study that shows specifically how a compiler generated a subtle bug due to specifics of the memory model, like a "if you do this, here's the problematic code that will get generated, and why it's problematic due to the memory model." That may be a good way to raise issues with toolchain developers with concrete and actionable examples. >> > I don't understand why we'd block patches for enabling experimental >> > features. We've been running this patch-set on actual devices for >> > months and would love to provide them to the community for further >> > testing. If bugs are found, then there's more evidence to bring to >> > the C standards committee. Otherwise we're shutting down feature >> > development for the sake of potential bugs in a C standard we're not >> > even using. >> >> So the problem is that its very very hard (and painful) to find these >> bugs. Getting the tools people to comment on these specific >> optimizations would really help lots. No doubt; I do not disagree with you. Kernel developers have very important use cases for the language. But the core point I'm trying to make is "do we need to block this patch set until issues with the C standards body in regards to the kernels memory model are resolved?" I would hope the two are orthogonal and that we'd take them and then test them even more extensively than the developer has in order to find out. > It would be good to get something similar to LKMM into KTSAN, for > example. There would probably be a few differences due to efficiency > concerns, but closer is better than less close. ;-) + glider, who may be able to comment on the state of KTSAN.
On Thu, Nov 16, 2017 at 10:16:22AM -0800, Nick Desaulniers wrote: > On Thu, Nov 16, 2017 at 9:48 AM, Paul E. McKenney > <paulmck@linux.vnet.ibm.com> wrote: > > On Thu, Nov 16, 2017 at 06:34:17PM +0100, Peter Zijlstra wrote: > >> On Thu, Nov 16, 2017 at 09:16:49AM -0800, Nick Desaulniers wrote: > >> > On Thu, Nov 16, 2017 at 8:59 AM, Peter Zijlstra <peterz@infradead.org> wrote: > >> > > On Thu, Nov 16, 2017 at 08:50:41AM -0800, Nick Desaulniers wrote: > >> > >> On Thu, Nov 16, 2017 at 8:30 AM, Peter Zijlstra <peterz@infradead.org> wrote: > >> > >> > >> > >> > Ideally we'd get the toolchain people to commit to supporting the kernel > >> > >> > memory model along side the C11 one. That would help a ton. > >> > >> > >> > >> Does anyone from the kernel side participate in the C standardization process? > >> > > > >> > > Yes, Paul McKenney and Will Deacon. Doesn't mean these two can still be > >> > > reconciled though. From what I understand C11 (and onwards) are > >> > > incompatible with the kernel model on a number of subtle points. > >> > > >> > It would be good to have these incompatibilities written down, then > >> > for the sake of argument, they can be cited both for discussions on > >> > LKML and in the C standardization process. For example, a running > >> > list in Documentation/ or something would make it so that anyone could > >> > understand and cite current issues with the latest C standard. > >> > >> Will should be able to produce this list; I know he's done before, I > >> just can't find it -- my Google-foo isn't strong today. > > > > Here you go: > > > > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0124r4.html > > Great, thanks! Will take some time to digest, but happy to refer > others to this important work in the future. Glad you like it! > I wonder if we have anything like a case study that shows specifically > how a compiler generated a subtle bug due to specifics of the memory > model, like a "if you do this, here's the problematic code that will > get generated, and why it's problematic due to the memory model." > That may be a good way to raise issues with toolchain developers with > concrete and actionable examples. Well, the above is an official working paper from the C++ standards committee. The first priority is to fix memory_order_consume. Which is getting a bit more mindshare of late. As Fedor Pikus said at CPPCON: "If you have a large software project, you are probably already using RCU. But you don't know that you are using it, and you are probably doing it wrong." > >> > I don't understand why we'd block patches for enabling experimental > >> > features. We've been running this patch-set on actual devices for > >> > months and would love to provide them to the community for further > >> > testing. If bugs are found, then there's more evidence to bring to > >> > the C standards committee. Otherwise we're shutting down feature > >> > development for the sake of potential bugs in a C standard we're not > >> > even using. > >> > >> So the problem is that its very very hard (and painful) to find these > >> bugs. Getting the tools people to comment on these specific > >> optimizations would really help lots. > > No doubt; I do not disagree with you. Kernel developers have very > important use cases for the language. > > But the core point I'm trying to make is "do we need to block this > patch set until issues with the C standards body in regards to the > kernels memory model are resolved?" I would hope the two are > orthogonal and that we'd take them and then test them even more > extensively than the developer has in order to find out. Given that I have been working on getting the C and C++ standards to correctly handle rcu_dereference() for more than ten years, I recommend -against- waiting on standardization in the strongest possible terms. And if you think that ten years is bad, the Java standards community has been struggling with out-of-thin-air (OOTA) values for almost 20 years. And the C and C++ standards haven't solved OOTA, either. Thanx, Paul > > It would be good to get something similar to LKMM into KTSAN, for > > example. There would probably be a few differences due to efficiency > > concerns, but closer is better than less close. ;-) > > + glider, who may be able to comment on the state of KTSAN. >
On Thu, Nov 16, 2017 at 10:39:51AM -0800, Paul E. McKenney wrote: > On Thu, Nov 16, 2017 at 10:16:22AM -0800, Nick Desaulniers wrote: > > > On Thu, Nov 16, 2017 at 06:34:17PM +0100, Peter Zijlstra wrote: > > >> So the problem is that its very very hard (and painful) to find these > > >> bugs. Getting the tools people to comment on these specific > > >> optimizations would really help lots. > > > > No doubt; I do not disagree with you. Kernel developers have very > > important use cases for the language. > > > > But the core point I'm trying to make is "do we need to block this > > patch set until issues with the C standards body in regards to the > > kernels memory model are resolved?" I would hope the two are > > orthogonal and that we'd take them and then test them even more > > extensively than the developer has in order to find out. > > Given that I have been working on getting the C and C++ standards to > correctly handle rcu_dereference() for more than ten years, I recommend > -against- waiting on standardization in the strongest possible terms. > And if you think that ten years is bad, the Java standards community has > been struggling with out-of-thin-air (OOTA) values for almost 20 years. > And the C and C++ standards haven't solved OOTA, either. The problem is, if we go ahead with this change, the compiler *will* break some address dependencies and something will eventually go wrong. At that point, what do we do? Turn off some random compiler optimisation? Add a random barrier()? We don't necessarily need standardisation, but we at least need guarantees from the compiler implementation that LTO/PGO will respect source level address dependencies. I don't think we have that today. Will
On Thu, Nov 16, 2017 at 06:45:08PM +0000, Will Deacon wrote: > On Thu, Nov 16, 2017 at 10:39:51AM -0800, Paul E. McKenney wrote: > > On Thu, Nov 16, 2017 at 10:16:22AM -0800, Nick Desaulniers wrote: > > > > On Thu, Nov 16, 2017 at 06:34:17PM +0100, Peter Zijlstra wrote: > > > >> So the problem is that its very very hard (and painful) to find these > > > >> bugs. Getting the tools people to comment on these specific > > > >> optimizations would really help lots. > > > > > > No doubt; I do not disagree with you. Kernel developers have very > > > important use cases for the language. > > > > > > But the core point I'm trying to make is "do we need to block this > > > patch set until issues with the C standards body in regards to the > > > kernels memory model are resolved?" I would hope the two are > > > orthogonal and that we'd take them and then test them even more > > > extensively than the developer has in order to find out. > > > > Given that I have been working on getting the C and C++ standards to > > correctly handle rcu_dereference() for more than ten years, I recommend > > -against- waiting on standardization in the strongest possible terms. > > And if you think that ten years is bad, the Java standards community has > > been struggling with out-of-thin-air (OOTA) values for almost 20 years. > > And the C and C++ standards haven't solved OOTA, either. > > The problem is, if we go ahead with this change, the compiler *will* break > some address dependencies and something will eventually go wrong. At that > point, what do we do? Turn off some random compiler optimisation? Add a > random barrier()? > > We don't necessarily need standardisation, but we at least need guarantees > from the compiler implementation that LTO/PGO will respect source level > address dependencies. I don't think we have that today. Ah, if "this patch set" meant "adding LTO", I stand corrected and I apologize for my confusion. I agree that we need LTO/PGO to be housebroken from an LKMM viewpoint before it is enabled. Thanx, Paul
On Thu, Nov 16, 2017 at 11:13:07AM -0800, Paul E. McKenney wrote: > Ah, if "this patch set" meant "adding LTO", I stand corrected and I > apologize for my confusion. Again, I'm not proposing for LTO to be enabled by default. These patches just make it possible to enable it. Are you saying the possibility that a future compiler update breaks something is a blocker even for experimental features? > I agree that we need LTO/PGO to be housebroken from an LKMM viewpoint > before it is enabled. Can you elaborate what's needed from clang before this can move forward? For example, if you have specific test cases in mind, we can always work on including them in the LLVM test suite. Sami
Hi Sami, On Thu, Nov 16, 2017 at 12:17:01PM -0800, Sami Tolvanen wrote: > On Thu, Nov 16, 2017 at 11:13:07AM -0800, Paul E. McKenney wrote: > > Ah, if "this patch set" meant "adding LTO", I stand corrected and I > > apologize for my confusion. > > Again, I'm not proposing for LTO to be enabled by default. These patches > just make it possible to enable it. Are you saying the possibility > that a future compiler update breaks something is a blocker even for > experimental features? My opinion here is that it's a blocker for LTO on arm64, as the symptoms aren't different to building with a compiler that has subtle code generation issues. If you *really* want to support LTO, we could consider giving acquire semantics to READ_ONCE for LTO builds, but that means trading RCU read-side performance for whatever gains are provided by LTO. Please don't confuse this with a reluctance to support clang; I'm keen to see that supported, so let's focus on that for the moment (although I don't see a good reason to support old clang builds with known issues). > > I agree that we need LTO/PGO to be housebroken from an LKMM viewpoint > > before it is enabled. > > Can you elaborate what's needed from clang before this can move > forward? For example, if you have specific test cases in mind, we can > always work on including them in the LLVM test suite. This is a thorny issue, but RCU (specifically rcu_dereference but probably also some READ_ONCEs) relies on being able to utilise syntactic dependency chains to order local accesses to shared variables. Paul has spent a fair amount of time working to define these dependency chains here: https://wg21.link/P0190 Although the current direction of the C++ committee is to prefer that dependencies are explicitly "marked", this is not deemed to be acceptable for the kernel (in other words, everything is always considered "marked"). If a compiler is capable of breaking unmarked dependency chains, then we will run into subtle concurrency issues on arm64 because the hardware is also capable of reordering independent accesses. My understanding is that LTO makes these sort of optimisations far more likely because whole-program analysis can enable value prediction, which can convert address dependencies into control dependcies; the latter not ordering reads on arm64. Will
On Mon, Nov 20, 2017 at 06:05:55PM +0000, Will Deacon wrote: > Please don't confuse this with a reluctance to support clang; I'm keen to > see that supported, As an aside; as long as clang doesn't do asm-goto and asm-flag-output (as examples of features that clang lacks and developers have at times stated they would never support) I can't take them serious as a target compiler.
On Mon, Nov 20, 2017 at 06:05:55PM +0000, Will Deacon wrote: > This is a thorny issue, but RCU (specifically rcu_dereference but probably > also some READ_ONCEs) relies on being able to utilise syntactic dependency > chains to order local accesses to shared variables. Well, we used to have READ_ONCE() and smp_read_barrier_depends(), but we recently munged them together, in the process getting rid of lockless_dereference(). So for sure, READ_ONCE() must be able to do the address dependency thing, otherwise tons of code comes apart.
On Mon, Nov 20, 2017 at 06:05:55PM +0000, Will Deacon wrote: > Although the current direction of the C++ committee is to prefer > that dependencies are explicitly "marked", this is not deemed to be > acceptable for the kernel (in other words, everything is always considered > "marked"). Yeah, that is an attitude not compatible with existing code. Much like the proposal to allow temporary/wide stores on everything not explicitly declared atomic. Such stuff instantly breaks all extant code that does multi-threading with no recourse.
On Mon, Nov 20, 2017 at 08:28:06PM +0100, Peter Zijlstra wrote: > On Mon, Nov 20, 2017 at 06:05:55PM +0000, Will Deacon wrote: > > This is a thorny issue, but RCU (specifically rcu_dereference but probably > > also some READ_ONCEs) relies on being able to utilise syntactic dependency > > chains to order local accesses to shared variables. > > Well, we used to have READ_ONCE() and smp_read_barrier_depends(), but > we recently munged them together, in the process getting rid of > lockless_dereference(). > > So for sure, READ_ONCE() must be able to do the address dependency > thing, otherwise tons of code comes apart. I do hope that they track the volatile accesses produced by READ_ONCE(). Otherwise, it would not be good to apply this to anything touching MMIO. Thanx, Paul
On Mon, Nov 20, 2017 at 08:32:56PM +0100, Peter Zijlstra wrote: > On Mon, Nov 20, 2017 at 06:05:55PM +0000, Will Deacon wrote: > > Although the current direction of the C++ committee is to prefer > > that dependencies are explicitly "marked", this is not deemed to be > > acceptable for the kernel (in other words, everything is always considered > > "marked"). > > Yeah, that is an attitude not compatible with existing code. Much like > the proposal to allow temporary/wide stores on everything not explicitly > declared atomic. Such stuff instantly breaks all extant code that does > multi-threading with no recourse. If someone suggests temporary/wide stores, even on non-atomics, tell them that the standard does not permit them to introduce data races. Thanx, Paul
From: Paul E. McKenney > Sent: 20 November 2017 20:54 > > On Mon, Nov 20, 2017 at 08:32:56PM +0100, Peter Zijlstra wrote: > > On Mon, Nov 20, 2017 at 06:05:55PM +0000, Will Deacon wrote: > > > Although the current direction of the C++ committee is to prefer > > > that dependencies are explicitly "marked", this is not deemed to be > > > acceptable for the kernel (in other words, everything is always considered > > > "marked"). > > > > Yeah, that is an attitude not compatible with existing code. Much like > > the proposal to allow temporary/wide stores on everything not explicitly > > declared atomic. Such stuff instantly breaks all extant code that does > > multi-threading with no recourse. > > If someone suggests temporary/wide stores, even on non-atomics, tell > them that the standard does not permit them to introduce data races. The C standard doesn't say anything about multi-threading. The x86 bis (bit set) family are well known for being problematic because they always do a 32bit wide rmw cycle. David
On Tue, Nov 21, 2017 at 05:23:52PM +0000, David Laight wrote: > From: Paul E. McKenney > > Sent: 20 November 2017 20:54 > > > > On Mon, Nov 20, 2017 at 08:32:56PM +0100, Peter Zijlstra wrote: > > > On Mon, Nov 20, 2017 at 06:05:55PM +0000, Will Deacon wrote: > > > > Although the current direction of the C++ committee is to prefer > > > > that dependencies are explicitly "marked", this is not deemed to be > > > > acceptable for the kernel (in other words, everything is always considered > > > > "marked"). > > > > > > Yeah, that is an attitude not compatible with existing code. Much like > > > the proposal to allow temporary/wide stores on everything not explicitly > > > declared atomic. Such stuff instantly breaks all extant code that does > > > multi-threading with no recourse. > > > > If someone suggests temporary/wide stores, even on non-atomics, tell > > them that the standard does not permit them to introduce data races. > > The C standard doesn't say anything about multi-threading. Actually, recent versions of the C standard really do cover multi-threading, and have for some years. For example, the June 2010 draft has this to say in section 5.1.2.4: Under a hosted implementation, a program can have more than one thread of execution (or thread) running concurrently. Later, in paragraph 25 of this same section: The execution of a program contains a data race if it contains two conflicting actions in different threads, at least one of which is not atomic, and neither happens before the other. Any such data race results in undefined behavior. Because the compiler is not allowed to introduce undefined behavior in a program that does not already contain undefined behavior, the compiler is absolutely forbidden from inventing stores unless it can prove that doing so does not introduce a data race. One (painful and annoying) case in which it can prove this is just before a normal (non-volatile and non-atomic) store. > The x86 bis (bit set) family are well known for being problematic > because they always do a 32bit wide rmw cycle. If the compiler is careful, it can invent atomic read-modify-write cycles to uninvolved variables. Here "is careful" includes ensuring that any read from or write to one of those uninvolved variables acts just as it would in the absence of the atomic read-modify-write cycle. But I did say "store" above, not atomic read-modify-write operation. ;-) Thanx, Paul
On Thu, Nov 16, 2017 at 7:16 PM, Nick Desaulniers <ndesaulniers@google.com> wrote: > On Thu, Nov 16, 2017 at 9:48 AM, Paul E. McKenney > <paulmck@linux.vnet.ibm.com> wrote: >> On Thu, Nov 16, 2017 at 06:34:17PM +0100, Peter Zijlstra wrote: >>> On Thu, Nov 16, 2017 at 09:16:49AM -0800, Nick Desaulniers wrote: >>> > On Thu, Nov 16, 2017 at 8:59 AM, Peter Zijlstra <peterz@infradead.org> wrote: >>> > > On Thu, Nov 16, 2017 at 08:50:41AM -0800, Nick Desaulniers wrote: >>> > >> On Thu, Nov 16, 2017 at 8:30 AM, Peter Zijlstra <peterz@infradead.org> wrote: >>> > >> >>> > >> > Ideally we'd get the toolchain people to commit to supporting the kernel >>> > >> > memory model along side the C11 one. That would help a ton. >>> > >> >>> > >> Does anyone from the kernel side participate in the C standardization process? >>> > > >>> > > Yes, Paul McKenney and Will Deacon. Doesn't mean these two can still be >>> > > reconciled though. From what I understand C11 (and onwards) are >>> > > incompatible with the kernel model on a number of subtle points. >>> > >>> > It would be good to have these incompatibilities written down, then >>> > for the sake of argument, they can be cited both for discussions on >>> > LKML and in the C standardization process. For example, a running >>> > list in Documentation/ or something would make it so that anyone could >>> > understand and cite current issues with the latest C standard. >>> >>> Will should be able to produce this list; I know he's done before, I >>> just can't find it -- my Google-foo isn't strong today. >> >> Here you go: >> >> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0124r4.html > > Great, thanks! Will take some time to digest, but happy to refer > others to this important work in the future. > > I wonder if we have anything like a case study that shows specifically > how a compiler generated a subtle bug due to specifics of the memory > model, like a "if you do this, here's the problematic code that will > get generated, and why it's problematic due to the memory model." > That may be a good way to raise issues with toolchain developers with > concrete and actionable examples. > >>> > I don't understand why we'd block patches for enabling experimental >>> > features. We've been running this patch-set on actual devices for >>> > months and would love to provide them to the community for further >>> > testing. If bugs are found, then there's more evidence to bring to >>> > the C standards committee. Otherwise we're shutting down feature >>> > development for the sake of potential bugs in a C standard we're not >>> > even using. >>> >>> So the problem is that its very very hard (and painful) to find these >>> bugs. Getting the tools people to comment on these specific >>> optimizations would really help lots. > > No doubt; I do not disagree with you. Kernel developers have very > important use cases for the language. > > But the core point I'm trying to make is "do we need to block this > patch set until issues with the C standards body in regards to the > kernels memory model are resolved?" I would hope the two are > orthogonal and that we'd take them and then test them even more > extensively than the developer has in order to find out. > >> It would be good to get something similar to LKMM into KTSAN, for >> example. There would probably be a few differences due to efficiency >> concerns, but closer is better than less close. ;-) > > + glider, who may be able to comment on the state of KTSAN. We haven't touched KTSAN for a while, so it's probably broken right now. It should be possible to revive it, the question is how much code will need to be annotated to prevent the tool from overwhelming the developers with reports. +Dima and Andrey, who should know better.
On Thu, Nov 23, 2017 at 2:42 PM, Alexander Potapenko <glider@google.com> wrote: >>>> > >> > Ideally we'd get the toolchain people to commit to supporting the kernel >>>> > >> > memory model along side the C11 one. That would help a ton. >>>> > >> >>>> > >> Does anyone from the kernel side participate in the C standardization process? >>>> > > >>>> > > Yes, Paul McKenney and Will Deacon. Doesn't mean these two can still be >>>> > > reconciled though. From what I understand C11 (and onwards) are >>>> > > incompatible with the kernel model on a number of subtle points. >>>> > >>>> > It would be good to have these incompatibilities written down, then >>>> > for the sake of argument, they can be cited both for discussions on >>>> > LKML and in the C standardization process. For example, a running >>>> > list in Documentation/ or something would make it so that anyone could >>>> > understand and cite current issues with the latest C standard. >>>> >>>> Will should be able to produce this list; I know he's done before, I >>>> just can't find it -- my Google-foo isn't strong today. >>> >>> Here you go: >>> >>> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0124r4.html >> >> Great, thanks! Will take some time to digest, but happy to refer >> others to this important work in the future. >> >> I wonder if we have anything like a case study that shows specifically >> how a compiler generated a subtle bug due to specifics of the memory >> model, like a "if you do this, here's the problematic code that will >> get generated, and why it's problematic due to the memory model." >> That may be a good way to raise issues with toolchain developers with >> concrete and actionable examples. >> >>>> > I don't understand why we'd block patches for enabling experimental >>>> > features. We've been running this patch-set on actual devices for >>>> > months and would love to provide them to the community for further >>>> > testing. If bugs are found, then there's more evidence to bring to >>>> > the C standards committee. Otherwise we're shutting down feature >>>> > development for the sake of potential bugs in a C standard we're not >>>> > even using. >>>> >>>> So the problem is that its very very hard (and painful) to find these >>>> bugs. Getting the tools people to comment on these specific >>>> optimizations would really help lots. >> >> No doubt; I do not disagree with you. Kernel developers have very >> important use cases for the language. >> >> But the core point I'm trying to make is "do we need to block this >> patch set until issues with the C standards body in regards to the >> kernels memory model are resolved?" I would hope the two are >> orthogonal and that we'd take them and then test them even more >> extensively than the developer has in order to find out. >> >>> It would be good to get something similar to LKMM into KTSAN, for >>> example. There would probably be a few differences due to efficiency >>> concerns, but closer is better than less close. ;-) >> >> + glider, who may be able to comment on the state of KTSAN. > We haven't touched KTSAN for a while, so it's probably broken right now. > It should be possible to revive it, the question is how much code will > need to be annotated to prevent the tool from overwhelming the > developers with reports. > +Dima and Andrey, who should know better. Hi, KTSAN checks acquire/release pairs, and that's very useful. But as far as I understand this thread is about more subtle things and areas of kernel/compiler tension. I afraid this in this context KTSAN is in the same boat as compiler. Because, well, we need to write code that implements precise algorithms. And if control-dependencies are defined as "extreme care is required to avoid control-dependency-destroying compiler optimizations" (that is, code is correct if it does not break against the current set of enabled optimizations in the current compiler, what?) and data-dependencies are defined akin to C11 definition (read -- non-implementable, unicorns); then KTSAN can't help. When/if C provides implementable rules for data-dependencies (_Dependency) and that's implemented in compilers and kernel sticks to this model, then I guess it should be possible to extract that info from compiler and check against it in KTSAN (e.g. 2 synchronization domains, one for dependent accesses and one for everything else). Kernel could as well define own model, and KTSAN could check against it. But that model must be implemented in compilers first anyway. Because (1) doing it just for KTSAN does not look reasonable, (2) until compiler supports that model there is little point in checking (the fact that compiler uses a different model is the major gaping hole and we are aware of it without tooling help). And, yes, I agree that we should not block this LTO patch. All problems are already there and are orthogonal to LTO. Compiler sees enough code already (large TUs, lots of code in headers) and we move code. I also have not seen any special rules wrt rcu and translation units, I have not seen developers doing any additional analysis re TUs, move code to separate files, nor I seen comments says that this code must be in separate TU than that code. From what I see usually it's assumed that things will just work. If anything LTO will be useful to shake out latent bugs that will pop up later.
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index 3a70f763e18a..58504327b9f6 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -40,6 +40,7 @@ config ARM64 select ARCH_INLINE_WRITE_UNLOCK_IRQRESTORE if !PREEMPT select ARCH_USE_CMPXCHG_LOCKREF select ARCH_USE_QUEUED_RWLOCKS + select ARCH_SUPPORTS_LTO_CLANG select ARCH_SUPPORTS_MEMORY_FAILURE select ARCH_SUPPORTS_ATOMIC_RMW select ARCH_SUPPORTS_NUMA_BALANCING
Allow CONFIG_LTO_CLANG to be enabled for the architecture. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> --- arch/arm64/Kconfig | 1 + 1 file changed, 1 insertion(+)