Message ID: 20170705020635.GD11168@linux-80c1.suse (mailing list archive)
State: New, archived
On Tue, Jul 4, 2017 at 7:06 PM, Davidlohr Bueso <dave@stgolabs.net> wrote:
>
> So here's something that boots and builds a kernel. Any thoughts?

This patch is just nasty crap. Sorry.

It has random whitespace changes that look entirely unrelated to the
actual real meat of the patch, and that actually make whitespace
*worse*.

WHY?

That alone should just mean that this patch needs to be thrown away
and never ever looked at again.

But also, this is fundamentally garbage.

Exactly for the same reasons that the swait interfaces were
fundamentally broken.

It *looks* like it works on regular wait queues, and people can start
using it that way, but it actually doesn't have the right semantics at
all.

The new "lockless" function ONLY works if you don't have a private
wakeup function.

So it completely changes the semantics of "wake_up_all()". It used to
be that "wake_up_all()" did what the name suggested. With this patch,
"wake_up_all()" will cause random kernel oopses or memory corruption
if you use any of the more specialized wakeup functions.

Guys, STOP DOING SHIT LIKE THIS!

The swake interface was incredible crap, with bad semantics that
confused people, and bad performance too. Don't make the normal
wait-queues have similar idiotic problems.

So no, this is not only NAK'ed, the whole approach is pure and utter
shit and this needs to be buried deep and forgotten about so that it
never ever comes back to life.

              Linus
On Fri, 07 Jul 2017, Linus Torvalds wrote:
>On Tue, Jul 4, 2017 at 7:06 PM, Davidlohr Bueso <dave@stgolabs.net> wrote:
>>
>> So here's something that boots and builds a kernel. Any thoughts?
>
>This patch is just nasty crap. Sorry.
>
>It has random whitespace changes that look entirely unrelated to the
>actual real meat of the patch, and that actually make whitespace
>*worse*.

Ok sorry, fwiw those were 80-line fixlets I thought were trivial enough
to just fly by.

>WHY?
>
>That alone should just mean that this patch needs to be thrown away
>and never ever looked at again.
>
>But also, this is fundamentally garbage.
>
>Exactly for the same reasons that the swait interfaces were
>fundamentally broken.
>
>It *looks* like it works on regular wait queues, and people can start
>using it that way, but it actually doesn't have the right semantics at
>all.
>
>The new "lockless" function ONLY works if you don't have a private
>wakeup function.

Oh indeed, this was always my intent. Going back to the patch, when
checking DEFINE_WAIT_FUNC I clearly overlooked the ->func()
implications, breaking all kinds of semantics. With that and the
constraints aforementioned in the patch, I see no sane way of using
wake_qs.

>So no, this is not only NAK'ed, the whole approach is pure and utter
>shit and this needs to be buried deep and forgotten about so that it
>never ever comes back to life.

Given that you seem to agree that the lockless version is possible as
long as we keep semantics, this imho is another point for some form of
simplified waitqueues. But yeah.

Thanks,
Davidlohr
On Fri, Jul 7, 2017 at 3:27 PM, Davidlohr Bueso <dave@stgolabs.net> wrote:
>
> Ok sorry, fwiw those were 80-line fixlets I thought were trivial enough
> to just fly by.

I find them annoying, because it makes it so much harder to see what
the patch actually does. In this case, I think that more than 50% of
the patch was just whitespace changes..

> Oh indeed, this was always my intent. Going back to the patch, when
> checking DEFINE_WAIT_FUNC I clearly overlooked the ->func()
> implications, breaking all kinds of semantics. With that and the
> constraints aforementioned in the patch, I see no sane way of using
> wake_qs.

Well, very few people actually use "wake_up_all()", particularly for
any of the things that use special wake functions. So it probably
works in practice. And then somebody starts using pollfd or something
on one of the things that *do* use wake_up_all() and happens to also
allow polling (or whatever), and you get nasty crashes.

> Given that you seem to agree that the lockless version is possible as
> long as we keep semantics, this imho is another point for some form of
> simplified waitqueues.

We just really haven't had a lot of problems with the waitqueues in my
experience. Many of the historical big problems were about the whole
"exclusive vs non-exclusive" thundering herd problems, which is
actually the most complex thing about them (the callback function adds
a pointer to the wait queue, so makes it bigger, but that is very
seldom a huge issue).

Most of the things that want specific wakeups tend to be some really
low-level stuff (ie semaphores etc - both the sysvipc kind of ones and
the kernel locking kind of ones). They are often doing their very
special own things anyway.

And often the regular waitqueues actually work fine, and the biggest
thing is to use the lock within the waitqueue for the object that is
being waited on too, so that you just avoid the double locking.

So you may have hit the one or two cases where the usual wait-queues
didn't work well, but in *most* cases they work wonderfully.

              Linus
diff --git a/include/linux/wait.h b/include/linux/wait.h
index b289c96151ee..9f6075271e52 100644
--- a/include/linux/wait.h
+++ b/include/linux/wait.h
@@ -12,8 +12,10 @@
 
 typedef struct wait_queue_entry wait_queue_entry_t;
 
-typedef int (*wait_queue_func_t)(struct wait_queue_entry *wq_entry, unsigned mode, int flags, void *key);
-int default_wake_function(struct wait_queue_entry *wq_entry, unsigned mode, int flags, void *key);
+typedef int (*wait_queue_func_t)(struct wait_queue_entry *wq_entry,
+				 unsigned mode, int flags, void *key);
+int default_wake_function(struct wait_queue_entry *wq_entry,
+			  unsigned mode, int flags, void *key);
 
 /* wait_queue_entry::flags */
 #define WQ_FLAG_EXCLUSIVE	0x01
@@ -56,7 +58,8 @@ struct task_struct;
 #define DECLARE_WAIT_QUEUE_HEAD(name) \
 	struct wait_queue_head name = __WAIT_QUEUE_HEAD_INITIALIZER(name)
 
-extern void __init_waitqueue_head(struct wait_queue_head *wq_head, const char *name, struct lock_class_key *);
+extern void __init_waitqueue_head(struct wait_queue_head *wq_head,
+				  const char *name, struct lock_class_key *);
 
 #define init_waitqueue_head(wq_head) \
 	do { \
@@ -158,39 +161,49 @@ static inline void __add_wait_queue(struct wait_queue_head *wq_head, struct wait
  * Used for wake-one threads:
  */
 static inline void
-__add_wait_queue_exclusive(struct wait_queue_head *wq_head, struct wait_queue_entry *wq_entry)
+__add_wait_queue_exclusive(struct wait_queue_head *wq_head,
+			   struct wait_queue_entry *wq_entry)
 {
 	wq_entry->flags |= WQ_FLAG_EXCLUSIVE;
 	__add_wait_queue(wq_head, wq_entry);
 }
 
-static inline void __add_wait_queue_entry_tail(struct wait_queue_head *wq_head, struct wait_queue_entry *wq_entry)
+static inline void
+__add_wait_queue_entry_tail(struct wait_queue_head *wq_head,
+			    struct wait_queue_entry *wq_entry)
 {
 	list_add_tail(&wq_entry->entry, &wq_head->head);
 }
 
 static inline void
-__add_wait_queue_entry_tail_exclusive(struct wait_queue_head *wq_head, struct wait_queue_entry *wq_entry)
+__add_wait_queue_entry_tail_exclusive(struct wait_queue_head *wq_head,
+				      struct wait_queue_entry *wq_entry)
 {
 	wq_entry->flags |= WQ_FLAG_EXCLUSIVE;
 	__add_wait_queue_entry_tail(wq_head, wq_entry);
 }
 
 static inline void
-__remove_wait_queue(struct wait_queue_head *wq_head, struct wait_queue_entry *wq_entry)
+__remove_wait_queue(struct wait_queue_head *wq_head,
+		    struct wait_queue_entry *wq_entry)
 {
 	list_del(&wq_entry->entry);
 }
 
-void __wake_up(struct wait_queue_head *wq_head, unsigned int mode, int nr, void *key);
-void __wake_up_locked_key(struct wait_queue_head *wq_head, unsigned int mode, void *key);
-void __wake_up_sync_key(struct wait_queue_head *wq_head, unsigned int mode, int nr, void *key);
-void __wake_up_locked(struct wait_queue_head *wq_head, unsigned int mode, int nr);
+void __wake_up(struct wait_queue_head *wq_head,
+	       unsigned int mode, int nr, void *key);
+void __wake_up_all_lockless(struct wait_queue_head *wq_head);
+void __wake_up_locked_key(struct wait_queue_head *wq_head,
+			  unsigned int mode, void *key);
+void __wake_up_sync_key(struct wait_queue_head *wq_head,
+			unsigned int mode, int nr, void *key);
+void __wake_up_locked(struct wait_queue_head *wq_head,
+		      unsigned int mode, int nr);
 void __wake_up_sync(struct wait_queue_head *wq_head, unsigned int mode, int nr);
 
 #define wake_up(x)			__wake_up(x, TASK_NORMAL, 1, NULL)
 #define wake_up_nr(x, nr)		__wake_up(x, TASK_NORMAL, nr, NULL)
-#define wake_up_all(x)			__wake_up(x, TASK_NORMAL, 0, NULL)
+#define wake_up_all(x)			__wake_up_all_lockless(x)
 #define wake_up_locked(x)		__wake_up_locked((x), TASK_NORMAL, 1)
 #define wake_up_all_locked(x)		__wake_up_locked((x), TASK_NORMAL, 0)
diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c
index 17f11c6b0a9f..d029e13529ed 100644
--- a/kernel/sched/wait.c
+++ b/kernel/sched/wait.c
@@ -5,6 +5,7 @@
  */
 #include <linux/init.h>
 #include <linux/export.h>
+#include <linux/sched/wake_q.h>
 #include <linux/sched/signal.h>
 #include <linux/sched/debug.h>
 #include <linux/mm.h>
@@ -98,6 +99,41 @@ void __wake_up(struct wait_queue_head *wq_head, unsigned int mode,
 }
 EXPORT_SYMBOL(__wake_up);
 
+/**
+ * __wake_up_all_lockless - wake up all threads blocked on a waitqueue, with
+ * assistance from wake_qs.
+ * @wq_head: the waitqueue
+ *
+ * Using lockless wake_qs allows the wakeups to occur after releasing the
+ * @wq_head->lock, thus reducing hold times. As such, it only makes sense
+ * when there is more than a single wakeup to be performed, otherwise it's
+ * not really worth it. Similarly, this can only be used for TASK_NORMAL
+ * wakeups, such that we avoid queues with mixed states.
+ *
+ * It may be assumed that unlike __wake_up() this function always implies
+ * a memory barrier before changing the task state due to wake_q_add(),
+ * regardless of whether or not the task was actually running.
+ */
+void __wake_up_all_lockless(struct wait_queue_head *wq_head)
+{
+	unsigned long flags;
+	wait_queue_entry_t *curr, *next;
+	DEFINE_WAKE_Q(wake_q);
+
+	spin_lock_irqsave(&wq_head->lock, flags);
+	/*
+	 * Because we are to awake all tasks, we don't care about
+	 * dealing with WQ_FLAG_EXCLUSIVE cases.
+	 */
+	list_for_each_entry_safe(curr, next, &wq_head->head, entry)
+		wake_q_add(&wake_q, curr->private);
+
+	spin_unlock_irqrestore(&wq_head->lock, flags);
+
+	wake_up_q(&wake_q);
+}
+EXPORT_SYMBOL(__wake_up_all_lockless);
+
 /*
  * Same as __wake_up but called with the spinlock in wait_queue_head_t held.
  */