Message ID | 1508319532-24655-2-git-send-email-byungchul.park@lge.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Wed, 2017-10-18 at 18:38 +0900, Byungchul Park wrote: > Sometimes, we want to initialize completions with sparate lockdep maps > to assign lock classes under control. For example, the workqueue code > manages lockdep maps, as it can classify lockdep maps properly. > Provided a function for that purpose. > > Signed-off-by: Byungchul Park <byungchul.park@lge.com> > --- > include/linux/completion.h | 8 ++++++++ > 1 file changed, 8 insertions(+) > > diff --git a/include/linux/completion.h b/include/linux/completion.h > index cae5400..182d56e 100644 > --- a/include/linux/completion.h > +++ b/include/linux/completion.h > @@ -49,6 +49,13 @@ static inline void complete_release_commit(struct completion *x) > lock_commit_crosslock((struct lockdep_map *)&x->map); > } > > +#define init_completion_with_map(x, m) \ > +do { \ > + lockdep_init_map_crosslock((struct lockdep_map *)&(x)->map, \ > + (m)->name, (m)->key, 0); \ > + __init_completion(x); \ > +} while (0) Are there any completion objects for which the cross-release checking is useful? Are there any wait_for_completion() callers that hold a mutex or other locking object? Thanks, Bart.
On Thu, Oct 19, 2017 at 11:24:00PM +0000, Bart Van Assche wrote: > Are there any completion objects for which the cross-release checking is > useful? Are there any wait_for_completion() callers that hold a mutex or > other locking object? Check /proc/lockdep, then you can find all dependencies wrt cross-lock. I named a lock class of wait_for_completion(), a sting starting with "(complete)". For example, in my machine: console_lock -> (complete)&req.done cpu_hotplug_lock.rw_sem -> (complete)&st->done_up cpuhp_state_mutex -> (complete)&st->done_up and so on.
On Thu, 19 Oct 2017, Bart Van Assche wrote: > On Wed, 2017-10-18 at 18:38 +0900, Byungchul Park wrote: > > Sometimes, we want to initialize completions with sparate lockdep maps > > to assign lock classes under control. For example, the workqueue code > > manages lockdep maps, as it can classify lockdep maps properly. > > Provided a function for that purpose. > > > > Signed-off-by: Byungchul Park <byungchul.park@lge.com> > > --- > > include/linux/completion.h | 8 ++++++++ > > 1 file changed, 8 insertions(+) > > > > diff --git a/include/linux/completion.h b/include/linux/completion.h > > index cae5400..182d56e 100644 > > --- a/include/linux/completion.h > > +++ b/include/linux/completion.h > > @@ -49,6 +49,13 @@ static inline void complete_release_commit(struct completion *x) > > lock_commit_crosslock((struct lockdep_map *)&x->map); > > } > > > > +#define init_completion_with_map(x, m) \ > > +do { \ > > + lockdep_init_map_crosslock((struct lockdep_map *)&(x)->map, \ > > + (m)->name, (m)->key, 0); \ > > + __init_completion(x); \ > > +} while (0) > > Are there any completion objects for which the cross-release checking is > useful? All of them by definition. > Are there any wait_for_completion() callers that hold a mutex or > other locking object? Yes, there are also cross completion dependencies. There have been such bugs and I expect more to be unearthed. I really have to ask what your motiviation is to fight the lockdep coverage of synchronization objects tooth and nail? Thanks, tglx
On Fri, 2017-10-20 at 08:34 +0200, Thomas Gleixner wrote: > On Thu, 19 Oct 2017, Bart Van Assche wrote: > > Are there any completion objects for which the cross-release checking is > > useful? > > All of them by definition. Sorry but I'm not sure that's the best possible answer. In my opinion avoiding that completion objects have dependencies on other lock objects, e.g. by avoiding to wait on a completion object while holding a mutex, is a far superior strategy over adding cross-release checking to completion objects. The former strategy namely makes it unnecessary to add cross-release checking to completion objects because that strategy ensures that these completion objects cannot get involved in a deadlock. The latter strategy can lead to false positive deadlock reports by the lockdep code, something none of us wants. A possible alternative strategy could be to enable cross-release checking only for those completion objects for which waiting occurs inside a critical section. > > Are there any wait_for_completion() callers that hold a mutex or > > other locking object? > > Yes, there are also cross completion dependencies. There have been such > bugs and I expect more to be unearthed. > > I really have to ask what your motiviation is to fight the lockdep coverage > of synchronization objects tooth and nail? As explained in another e-mail thread, unlike the lock inversion checking performed by the <= v4.13 lockdep code, cross-release checking is a heuristic that does not have a sound theoretical basis. The lock validator is an important tool for kernel developers. It is important that it produces as few false positives as possible. Since the cross-release checks are enabled automatically when enabling lockdep, I think it is normal that I, as a kernel developer, care that the cross-release checks produce as few false positives as possible. Bart.
On Sat, Oct 21, 2017 at 4:58 AM, Bart Van Assche <Bart.VanAssche@wdc.com> wrote: > Sorry but I'm not sure that's the best possible answer. In my opinion > avoiding that completion objects have dependencies on other lock objects, > e.g. by avoiding to wait on a completion object while holding a mutex, is a > far superior strategy over adding cross-release checking to completion > objects. The former strategy namely makes it unnecessary to add > cross-release checking to completion objects because that strategy ensures > that these completion objects cannot get involved in a deadlock. The latter It's true if we force it. But do you think it's possible? > strategy can lead to false positive deadlock reports by the lockdep code, What do you think false positives come from? It comes from assigning lock classes falsely where we should more care, rather than lockdep code itself. The same is applicable to cross-release. > something none of us wants. > > A possible alternative strategy could be to enable cross-release checking > only for those completion objects for which waiting occurs inside a critical > section. Of course, it already did. Cross-release doesn't consider any waiting outside of critical sections at all, and it should do. > As explained in another e-mail thread, unlike the lock inversion checking > performed by the <= v4.13 lockdep code, cross-release checking is a heuristic > that does not have a sound theoretical basis. The lock validator is an It's not heuristic but based on the same theoretical basis as <=4.13 lockdep. I mean, the key basis is: 1) What causes deadlock 2) What is a dependency 3) Build a dependency when identified > important tool for kernel developers. It is important that it produces as few > false positives as possible. Since the cross-release checks are enabled > automatically when enabling lockdep, I think it is normal that I, as a kernel > developer, care that the cross-release checks produce as few false positives > as possible. No doubt. That's why I proposed these patches.
On Sat, 2017-10-21 at 11:23 +0900, Byungchul Park wrote: > On Sat, Oct 21, 2017 at 4:58 AM, Bart Van Assche <Bart.VanAssche@wdc.com> wrote: > > As explained in another e-mail thread, unlike the lock inversion checking > > performed by the <= v4.13 lockdep code, cross-release checking is a heuristic > > that does not have a sound theoretical basis. The lock validator is an > > It's not heuristic but based on the same theoretical basis as <=4.13 > lockdep. I mean, the key basis is: > > 1) What causes deadlock > 2) What is a dependency > 3) Build a dependency when identified Sorry but I doubt that that statement is correct. The publication [1] contains a proof that an algorithm that is closely related to the traditional lockdep lock inversion detector is able to detect all deadlocks and does not report false positives for programs that only use mutexes as synchronization objects. The comment of the authors of that paper for programs that use mutexes, condition variables and semaphores is as follows: "It is unclear how to extend the lock-graph-based algorithm in Section 3 to efficiently consider the effects of condition variables and semaphores. Therefore, when considering all three synchronization mechanisms, we currently use a naive algorithm that checks each feasible permutation of the trace for deadlock." In other words, if you have found an approach for detecting potential deadlocks for programs that use these three kinds of synchronization objects and that does not report false positives then that's a breakthrough that's worth publishing in a journal or in the proceedings of a scientific conference. Bart. [1] Agarwal, Rahul, and Scott D. Stoller. "Run-time detection of potential deadlocks for programs with locks, semaphores, and condition variables." In Proceedings of the 2006 workshop on Parallel and distributed systems: testing and debugging, pp. 51-60. ACM, 2006. (https://pdfs.semanticscholar.org/9324/fc0b5d5cd5e05d551a3e98757122039946a2.pdf).
On Sun, Oct 22, 2017 at 02:34:56PM +0000, Bart Van Assche wrote: > On Sat, 2017-10-21 at 11:23 +0900, Byungchul Park wrote: > > On Sat, Oct 21, 2017 at 4:58 AM, Bart Van Assche <Bart.VanAssche@wdc.com> wrote: > > > As explained in another e-mail thread, unlike the lock inversion checking > > > performed by the <= v4.13 lockdep code, cross-release checking is a heuristic > > > that does not have a sound theoretical basis. The lock validator is an > > > > It's not heuristic but based on the same theoretical basis as <=4.13 > > lockdep. I mean, the key basis is: > > > > 1) What causes deadlock > > 2) What is a dependency > > 3) Build a dependency when identified > > Sorry but I doubt that that statement is correct. The publication [1] contains IMHO, the paper is talking about totally different things wrt deadlocks by wait_for_event/event, that is, lost events. Furthermore, it doesn't rely on dependencies itself, but just lock ordering 'case by case', which is a subset of the more general concept. > a proof that an algorithm that is closely related to the traditional lockdep > lock inversion detector is able to detect all deadlocks and does not report I can admit this. > false positives for programs that only use mutexes as synchronization objects. I want to ask you. What makes false positives avoidable in the paper? > The comment of the authors of that paper for programs that use mutexes, > condition variables and semaphores is as follows: "It is unclear how to extend > the lock-graph-based algorithm in Section 3 to efficiently consider the effects > of condition variables and semaphores. Therefore, when considering all three > synchronization mechanisms, we currently use a naive algorithm that checks each Right. The paper seems to use a naive algorigm for that cases, not replying on dependencies, which they should. > feasible permutation of the trace for deadlock." In other words, if you have > found an approach for detecting potential deadlocks for programs that use these > three kinds of synchronization objects and that does not report false positives > then that's a breakthrough that's worth publishing in a journal or in the > proceedings of a scientific conference. Please, point out logical problems of cross-release than saying it's impossbile according to the paper. I think you'd better understand how cross-release works *first*. I'll do my best to help you do. > Bart. > > [1] Agarwal, Rahul, and Scott D. Stoller. "Run-time detection of potential > deadlocks for programs with locks, semaphores, and condition variables." In > Proceedings of the 2006 workshop on Parallel and distributed systems: testing > and debugging, pp. 51-60. ACM, 2006. > (https://pdfs.semanticscholar.org/9324/fc0b5d5cd5e05d551a3e98757122039946a2.pdf).
On Mon, 2017-10-23 at 11:08 +0900, Byungchul Park wrote: > On Sun, Oct 22, 2017 at 02:34:56PM +0000, Bart Van Assche wrote: > > On Sat, 2017-10-21 at 11:23 +0900, Byungchul Park wrote: > > > On Sat, Oct 21, 2017 at 4:58 AM, Bart Van Assche <Bart.VanAssche@wdc.com> wrote: > > > > As explained in another e-mail thread, unlike the lock inversion checking > > > > performed by the <= v4.13 lockdep code, cross-release checking is a heuristic > > > > that does not have a sound theoretical basis. The lock validator is an > > > > > > It's not heuristic but based on the same theoretical basis as <=4.13 > > > lockdep. I mean, the key basis is: > > > > > > 1) What causes deadlock > > > 2) What is a dependency > > > 3) Build a dependency when identified > > > > Sorry but I doubt that that statement is correct. The publication [1] contains > > IMHO, the paper is talking about totally different things wrt > deadlocks by wait_for_event/event, that is, lost events. Please reread the paper title. The authors of the paper explain that their algorithm can detect lost events but the most significant contribution of the paper is deadlock detection. > > false positives for programs that only use mutexes as synchronization objects. > > I want to ask you. What makes false positives avoidable in the paper? The algorithm used to detect deadlocks. That algorithm has been explained clearly in the paper. > > The comment of the authors of that paper for programs that use mutexes, > > condition variables and semaphores is as follows: "It is unclear how to extend > > the lock-graph-based algorithm in Section 3 to efficiently consider the effects > > of condition variables and semaphores. Therefore, when considering all three > > synchronization mechanisms, we currently use a naive algorithm that checks each > > feasible permutation of the trace for deadlock." In other words, if you have > > found an approach for detecting potential deadlocks for programs that use these > > three kinds of synchronization objects and that does not report false positives > > then that's a breakthrough that's worth publishing in a journal or in the > > proceedings of a scientific conference. > > Please, point out logical problems of cross-release than saying it's > impossbile according to the paper. Isn't that the same? If it's impossible to use lock-graphs for detecting deadlocks in programs that use mutexes, semaphores and condition variables without triggering false positives that means that every approach that tries to detect deadlocks and that is based on lock graphs, including cross-release, must report false positives for certain programs. Bart.
On Wed, Oct 25, 2017 at 07:07:06AM +0000, Bart Van Assche wrote: > > Please, point out logical problems of cross-release than saying it's > > impossbile according to the paper. > > Isn't that the same? If it's impossible to use lock-graphs for detecting deadlocks > in programs that use mutexes, semaphores and condition variables without triggering > false positives that means that every approach that tries to detect deadlocks and > that is based on lock graphs, including cross-release, must report false positives > for certain programs. Right. That's why I'm currently trying to assign lock classes properly where false positives were reported. You seems to say there is another cause of false positives. If yes, please let me know what it is. If you do with an example, it would be more helpful for me to understand you.
diff --git a/include/linux/completion.h b/include/linux/completion.h index cae5400..182d56e 100644 --- a/include/linux/completion.h +++ b/include/linux/completion.h @@ -49,6 +49,13 @@ static inline void complete_release_commit(struct completion *x) lock_commit_crosslock((struct lockdep_map *)&x->map); } +#define init_completion_with_map(x, m) \ +do { \ + lockdep_init_map_crosslock((struct lockdep_map *)&(x)->map, \ + (m)->name, (m)->key, 0); \ + __init_completion(x); \ +} while (0) + #define init_completion(x) \ do { \ static struct lock_class_key __key; \ @@ -58,6 +65,7 @@ static inline void complete_release_commit(struct completion *x) __init_completion(x); \ } while (0) #else +#define init_completion_with_map(x, m) __init_completion(x) #define init_completion(x) __init_completion(x) static inline void complete_acquire(struct completion *x) {} static inline void complete_release(struct completion *x) {}
Sometimes, we want to initialize completions with sparate lockdep maps to assign lock classes under control. For example, the workqueue code manages lockdep maps, as it can classify lockdep maps properly. Provided a function for that purpose. Signed-off-by: Byungchul Park <byungchul.park@lge.com> --- include/linux/completion.h | 8 ++++++++ 1 file changed, 8 insertions(+)