Message ID | 20170926095412.74028-2-colyli@suse.de (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Hi Jens, Could you please take a look on this patch? It will be helpful if we can have it in 4.14, then we can fix a bug introduced in 4.14-rc1. This patch is reported by Michael Lyle, reviewed by Byungchul Park, and finally verified by Michael Lyle after I posted the patch. Many thanks in advance. Coly Li On 2017/9/26 下午5:54, Coly Li wrote: > Commit 09b3efec ("bcache: Don't reinvent the wheel but use existing llist > API") replaces the following while loop by llist_for_each_entry(), > > - > - while (reverse) { > - cl = container_of(reverse, struct closure, list); > - reverse = llist_next(reverse); > - > + llist_for_each_entry(cl, reverse, list) { > closure_set_waiting(cl, 0); > closure_sub(cl, CLOSURE_WAITING + 1); > } > > This modification introduces a potential race by iterating a corrupted > list. Here is how it happens. > > In the above modification, closure_sub() may wake up a process which is > waiting on reverse list. If this process decides to wait again by calling > closure_wait(), its cl->list will be added to another wait list. Then > when llist_for_each_entry() continues to iterate next node, it will travel > on another new wait list which is added in closure_wait(), not the > original reverse list in __closure_wake_up(). It is more probably to > happen on UP machine because the waked up process may preempt the process > which wakes up it. > > Use llist_for_each_entry_safe() will fix the issue, the safe version fetch > next node before waking up a process. Then the copy of next node will make > sure list iteration stays on original reverse list. > > Signed-off-by: Coly Li <colyli@suse.de> > Reported-by: Michael Lyle <mlyle@lyle.org> > Reviewed-by: Byungchul Park <byungchul.park@lge.com> > --- > drivers/md/bcache/closure.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/md/bcache/closure.c b/drivers/md/bcache/closure.c > index 7d5286b05036..1841d0359bac 100644 > --- a/drivers/md/bcache/closure.c > +++ b/drivers/md/bcache/closure.c > @@ -64,7 +64,7 @@ EXPORT_SYMBOL(closure_put); > void __closure_wake_up(struct closure_waitlist *wait_list) > { > struct llist_node *list; > - struct closure *cl; > + struct closure *cl, *t; > struct llist_node *reverse = NULL; > > list = llist_del_all(&wait_list->list); > @@ -73,7 +73,7 @@ void __closure_wake_up(struct closure_waitlist *wait_list) > reverse = llist_reverse_order(list); > > /* Then do the wakeups */ > - llist_for_each_entry(cl, reverse, list) { > + llist_for_each_entry_safe(cl, t, reverse, list) { > closure_set_waiting(cl, 0); > closure_sub(cl, CLOSURE_WAITING + 1); > } >
On 09/27/2017 09:16 PM, Coly Li wrote: > Hi Jens, > > Could you please take a look on this patch? It will be helpful if we can > have it in 4.14, then we can fix a bug introduced in 4.14-rc1. > > This patch is reported by Michael Lyle, reviewed by Byungchul Park, and > finally verified by Michael Lyle after I posted the patch. It looks fine to me, I'll get it queued up. BTW, it's technically a use-after-free bug, not a race condition.
On 09/26/2017 11:54 AM, Coly Li wrote: > Commit 09b3efec ("bcache: Don't reinvent the wheel but use existing llist > API") replaces the following while loop by llist_for_each_entry(), If it's fixing a specific commit, please add a proper Fixes: sha ("title") line to your patch. That way automated scripts can pick it up, if they backport the problematic patch. Also, as mentioned in the other email, this is technically a use-after-free problem, not a race. Can you resend with commit message fixed up?
Jens-- I think it's a race condition-- the individual closures remain valid. It's just that the list element has different meanings-- it's either a list actively being used to wake, or a linkage on one of several lists that is being used to await wake. If a closure goes back to wait very quickly after being woken, it can end up connecting its new wait-list with the being-woken list. Mike On Wed, Sep 27, 2017 at 1:27 PM, Jens Axboe <axboe@kernel.dk> wrote: > On 09/27/2017 09:16 PM, Coly Li wrote: >> Hi Jens, >> >> Could you please take a look on this patch? It will be helpful if we can >> have it in 4.14, then we can fix a bug introduced in 4.14-rc1. >> >> This patch is reported by Michael Lyle, reviewed by Byungchul Park, and >> finally verified by Michael Lyle after I posted the patch. > > It looks fine to me, I'll get it queued up. BTW, it's technically > a use-after-free bug, not a race condition. > > -- > Jens Axboe > > -- > To unsubscribe from this list: send the line "unsubscribe linux-bcache" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html
On 09/27/2017 10:48 PM, Michael Lyle wrote: > Jens-- > > I think it's a race condition-- the individual closures remain valid. > It's just that the list element has different meanings-- it's either a > list actively being used to wake, or a linkage on one of several lists > that is being used to await wake. If a closure goes back to wait very > quickly after being woken, it can end up connecting its new wait-list > with the being-woken list. Reading it a second time, looks like you are right. OK, then let's just queue it up as-is, I'll add the Fixes line this time.
On 2017/9/28 上午4:52, Jens Axboe wrote: > On 09/27/2017 10:48 PM, Michael Lyle wrote: >> Jens-- >> >> I think it's a race condition-- the individual closures remain valid. >> It's just that the list element has different meanings-- it's either a >> list actively being used to wake, or a linkage on one of several lists >> that is being used to await wake. If a closure goes back to wait very >> quickly after being woken, it can end up connecting its new wait-list >> with the being-woken list. > > Reading it a second time, looks like you are right. OK, then let's just > queue it up as-is, I'll add the Fixes line this time. > Hi Jens, Copied, next time for similar situation, Fixes will be there. Thanks for doing this.
diff --git a/drivers/md/bcache/closure.c b/drivers/md/bcache/closure.c index 7d5286b05036..1841d0359bac 100644 --- a/drivers/md/bcache/closure.c +++ b/drivers/md/bcache/closure.c @@ -64,7 +64,7 @@ EXPORT_SYMBOL(closure_put); void __closure_wake_up(struct closure_waitlist *wait_list) { struct llist_node *list; - struct closure *cl; + struct closure *cl, *t; struct llist_node *reverse = NULL; list = llist_del_all(&wait_list->list); @@ -73,7 +73,7 @@ void __closure_wake_up(struct closure_waitlist *wait_list) reverse = llist_reverse_order(list); /* Then do the wakeups */ - llist_for_each_entry(cl, reverse, list) { + llist_for_each_entry_safe(cl, t, reverse, list) { closure_set_waiting(cl, 0); closure_sub(cl, CLOSURE_WAITING + 1); }