diff mbox

[04/12] bcache: Don't reinvent the wheel but use existing llist API

Message ID 20170906062602.50497-5-colyli@suse.de (mailing list archive)
State New, archived
Headers show

Commit Message

Coly Li Sept. 6, 2017, 6:25 a.m. UTC
From: Byungchul Park <byungchul.park@lge.com>

Although llist provides proper APIs, they are not used. Make them used.

Signed-off-by: Byungchul Park <byungchul.park@lge.com>
Acked-by: Coly Li <colyli@suse.de>
---
 drivers/md/bcache/closure.c | 15 ++-------------
 1 file changed, 2 insertions(+), 13 deletions(-)

Comments

Michael Lyle Sept. 26, 2017, 4:38 a.m. UTC | #1
I believe this introduces a critical bug.

cl->list is used to link together the llists for both things waiting,
and for things that are being woken.

If a closure that is woken decides to wait again, it will corrupt the
llist that __closure_wake_up is using.

The previous iteration structure gets the next element of the list
before waking and is therefore safe.

Mike
Byungchul Park Sept. 26, 2017, 6:39 a.m. UTC | #2
> -----Original Message-----

> From: Michael Lyle [mailto:mlyle@lyle.org]

> Sent: Tuesday, September 26, 2017 1:38 PM

> To: Coly Li

> Cc: linux-bcache@vger.kernel.org; linux-block@vger.kernel.org;

> axboe@kernel.dk; Eric Wheeler; Byungchul Park; Kent Overstreet

> Subject: Re: [PATCH 04/12] bcache: Don't reinvent the wheel but use existing

> llist API

> 

> I believe this introduces a critical bug.

> 

> cl->list is used to link together the llists for both things waiting,

> and for things that are being woken.

> 

> If a closure that is woken decides to wait again, it will corrupt the

> llist that __closure_wake_up is using.

> 

> The previous iteration structure gets the next element of the list

> before waking and is therefore safe.


Do you mean we have to use llist_for_each_entry_safe() instead of non-safe version?
Is it ok if we use it instead?

> Mike
Coly Li Sept. 26, 2017, 7:08 a.m. UTC | #3
On 2017/9/26 下午12:38, Michael Lyle wrote:
> I believe this introduces a critical bug.
> 
> cl->list is used to link together the llists for both things waiting,
> and for things that are being woken.
> 
> If a closure that is woken decides to wait again, it will corrupt the
> llist that __closure_wake_up is using.
> 
> The previous iteration structure gets the next element of the list
> before waking and is therefore safe.
>

Hi Mike,

Good catch! I see llist_del_all() but forget cl->list can be modified in
closure_wait(). Yes there is potential chance to mislead
llist_for_each_entry() to iterate wrong list.
llist_for_each_entry_safe() should be used here. I will send a fix to
Jens, hope to catch up 4.14 still.

Thanks!
Coly Li Sept. 26, 2017, 7:09 a.m. UTC | #4
On 2017/9/26 下午2:39, 박병철/선임연구원/SW
Platform(연)AOT팀(byungchul.park@lge.com) wrote:
>> -----Original Message-----
>> From: Michael Lyle [mailto:mlyle@lyle.org]
>> Sent: Tuesday, September 26, 2017 1:38 PM
>> To: Coly Li
>> Cc: linux-bcache@vger.kernel.org; linux-block@vger.kernel.org;
>> axboe@kernel.dk; Eric Wheeler; Byungchul Park; Kent Overstreet
>> Subject: Re: [PATCH 04/12] bcache: Don't reinvent the wheel but use existing
>> llist API
>>
>> I believe this introduces a critical bug.
>>
>> cl->list is used to link together the llists for both things waiting,
>> and for things that are being woken.
>>
>> If a closure that is woken decides to wait again, it will corrupt the
>> llist that __closure_wake_up is using.
>>
>> The previous iteration structure gets the next element of the list
>> before waking and is therefore safe.
> 
> Do you mean we have to use llist_for_each_entry_safe() instead of non-safe version?
> Is it ok if we use it instead?

Yes, we should use llist_for_each_entry_safe(), there is a quite
implicit race here.
Byungchul Park Sept. 26, 2017, 7:15 a.m. UTC | #5
> -----Original Message-----

> From: Coly Li [mailto:i@coly.li]

> Sent: Tuesday, September 26, 2017 4:09 PM

> To: 박병철/선임연구원/SW Platform(연)AOT팀(byungchul.park@lge.com);

> Michael Lyle; Coly Li

> Cc: linux-bcache@vger.kernel.org; linux-block@vger.kernel.org;

> axboe@kernel.dk; Eric Wheeler; Kent Overstreet; kernel-team@lge.com

> Subject: Re: [PATCH 04/12] bcache: Don't reinvent the wheel but use existing

> llist API

> 

> On 2017/9/26 下午2:39, 박병철/선임연구원/SW

> Platform(연)AOT팀(byungchul.park@lge.com) wrote:

> >> -----Original Message-----

> >> From: Michael Lyle [mailto:mlyle@lyle.org]

> >> Sent: Tuesday, September 26, 2017 1:38 PM

> >> To: Coly Li

> >> Cc: linux-bcache@vger.kernel.org; linux-block@vger.kernel.org;

> >> axboe@kernel.dk; Eric Wheeler; Byungchul Park; Kent Overstreet

> >> Subject: Re: [PATCH 04/12] bcache: Don't reinvent the wheel but use existing

> >> llist API

> >>

> >> I believe this introduces a critical bug.

> >>

> >> cl->list is used to link together the llists for both things waiting,

> >> and for things that are being woken.

> >>

> >> If a closure that is woken decides to wait again, it will corrupt the

> >> llist that __closure_wake_up is using.

> >>

> >> The previous iteration structure gets the next element of the list

> >> before waking and is therefore safe.

> >

> > Do you mean we have to use llist_for_each_entry_safe() instead of non-safe

> version?

> > Is it ok if we use it instead?

> 

> Yes, we should use llist_for_each_entry_safe(), there is a quite

> implicit race here.


Hi coly,

As you know, my first patch used the safe version, but you suggested to replace
It with non-safe one. :(

I will change it so it does the same as the original patch did. :)

Thank you very much,
Byungchul

> --

> Coly Li
Byungchul Park Sept. 26, 2017, 7:16 a.m. UTC | #6
> -----Original Message-----

> From: Coly Li [mailto:i@coly.li]

> Sent: Tuesday, September 26, 2017 4:09 PM

> To: Michael Lyle; Coly Li

> Cc: linux-bcache@vger.kernel.org; linux-block@vger.kernel.org;

> axboe@kernel.dk; Eric Wheeler; Byungchul Park; Kent Overstreet

> Subject: Re: [PATCH 04/12] bcache: Don't reinvent the wheel but use existing

> llist API

> 

> On 2017/9/26 下午12:38, Michael Lyle wrote:

> > I believe this introduces a critical bug.

> >

> > cl->list is used to link together the llists for both things waiting,

> > and for things that are being woken.

> >

> > If a closure that is woken decides to wait again, it will corrupt the

> > llist that __closure_wake_up is using.

> >

> > The previous iteration structure gets the next element of the list

> > before waking and is therefore safe.

> >

> 

> Hi Mike,

> 

> Good catch! I see llist_del_all() but forget cl->list can be modified in

> closure_wait(). Yes there is potential chance to mislead

> llist_for_each_entry() to iterate wrong list.

> llist_for_each_entry_safe() should be used here. I will send a fix to

> Jens, hope to catch up 4.14 still.


I see. You have a plan to do it. Please fix it.

Thank you.

> Thanks!

> --

> Coly Li
Coly Li Sept. 26, 2017, 7:22 a.m. UTC | #7
On 2017/9/26 下午3:15, 박병철/선임연구원/SW
Platform(연)AOT팀(byungchul.park@lge.com) wrote:
>> -----Original Message-----
>> From: Coly Li [mailto:i@coly.li]
>> Sent: Tuesday, September 26, 2017 4:09 PM
>> To: 박병철/선임연구원/SW Platform(연)AOT팀(byungchul.park@lge.com);
>> Michael Lyle; Coly Li
>> Cc: linux-bcache@vger.kernel.org; linux-block@vger.kernel.org;
>> axboe@kernel.dk; Eric Wheeler; Kent Overstreet; kernel-team@lge.com
>> Subject: Re: [PATCH 04/12] bcache: Don't reinvent the wheel but use existing
>> llist API
>>
>> On 2017/9/26 下午2:39, 박병철/선임연구원/SW
>> Platform(연)AOT팀(byungchul.park@lge.com) wrote:
>>>> -----Original Message-----
>>>> From: Michael Lyle [mailto:mlyle@lyle.org]
>>>> Sent: Tuesday, September 26, 2017 1:38 PM
>>>> To: Coly Li
>>>> Cc: linux-bcache@vger.kernel.org; linux-block@vger.kernel.org;
>>>> axboe@kernel.dk; Eric Wheeler; Byungchul Park; Kent Overstreet
>>>> Subject: Re: [PATCH 04/12] bcache: Don't reinvent the wheel but use existing
>>>> llist API
>>>>
>>>> I believe this introduces a critical bug.
>>>>
>>>> cl->list is used to link together the llists for both things waiting,
>>>> and for things that are being woken.
>>>>
>>>> If a closure that is woken decides to wait again, it will corrupt the
>>>> llist that __closure_wake_up is using.
>>>>
>>>> The previous iteration structure gets the next element of the list
>>>> before waking and is therefore safe.
>>>
>>> Do you mean we have to use llist_for_each_entry_safe() instead of non-safe
>> version?
>>> Is it ok if we use it instead?
>>
>> Yes, we should use llist_for_each_entry_safe(), there is a quite
>> implicit race here.
> 
> Hi coly,
> 
> As you know, my first patch used the safe version, but you suggested to replace
> It with non-safe one. :(

No doubt, it's my fault. I didn't find the implicit potential race.

> I will change it so it does the same as the original patch did. :)

Yes, please. Because the original patch is merged into Linux upstream,
post a fix to replace llist_for_each_entry() by
llist_for_each_entry_safe(). We still have chance to catch up 4.14.

Thank you !
Coly Li Sept. 26, 2017, 7:24 a.m. UTC | #8
On 2017/9/26 下午3:16, 박병철/선임연구원/SW
Platform(연)AOT팀(byungchul.park@lge.com) wrote:
>> -----Original Message-----
>> From: Coly Li [mailto:i@coly.li]
>> Sent: Tuesday, September 26, 2017 4:09 PM
>> To: Michael Lyle; Coly Li
>> Cc: linux-bcache@vger.kernel.org; linux-block@vger.kernel.org;
>> axboe@kernel.dk; Eric Wheeler; Byungchul Park; Kent Overstreet
>> Subject: Re: [PATCH 04/12] bcache: Don't reinvent the wheel but use existing
>> llist API
>>
>> On 2017/9/26 下午12:38, Michael Lyle wrote:
>>> I believe this introduces a critical bug.
>>>
>>> cl->list is used to link together the llists for both things waiting,
>>> and for things that are being woken.
>>>
>>> If a closure that is woken decides to wait again, it will corrupt the
>>> llist that __closure_wake_up is using.
>>>
>>> The previous iteration structure gets the next element of the list
>>> before waking and is therefore safe.
>>>
>>
>> Hi Mike,
>>
>> Good catch! I see llist_del_all() but forget cl->list can be modified in
>> closure_wait(). Yes there is potential chance to mislead
>> llist_for_each_entry() to iterate wrong list.
>> llist_for_each_entry_safe() should be used here. I will send a fix to
>> Jens, hope to catch up 4.14 still.
> 
> I see. You have a plan to do it. Please fix it.

Oh, I just see this email, when I replied your last one.
Sure, let me fix my fault.
Coly Li Sept. 26, 2017, 7:46 a.m. UTC | #9
On 2017/9/26 下午12:38, Michael Lyle wrote:
> I believe this introduces a critical bug.
> 
> cl->list is used to link together the llists for both things waiting,
> and for things that are being woken.
> 
> If a closure that is woken decides to wait again, it will corrupt the
> llist that __closure_wake_up is using.
> 
> The previous iteration structure gets the next element of the list
> before waking and is therefore safe.


Hi Michael and Byungchul,

This is my fault to suggest Byungchul to change his correct patch into
wrong one. But it's good to learn such an implicit race behind bcache
code. I just post a patch to explain how this race may happen and
corrupt the reverse list iteration. Could you please to review the fix ?

And thanks to Michael again to catch this bug.
Michael Lyle Sept. 26, 2017, 7:55 p.m. UTC | #10
Thanks everyone--

Yes, this looks good to me and works in testing.

Mike

On Tue, Sep 26, 2017 at 12:46 AM, Coly Li <i@coly.li> wrote:
> On 2017/9/26 下午12:38, Michael Lyle wrote:
>> I believe this introduces a critical bug.
>>
>> cl->list is used to link together the llists for both things waiting,
>> and for things that are being woken.
>>
>> If a closure that is woken decides to wait again, it will corrupt the
>> llist that __closure_wake_up is using.
>>
>> The previous iteration structure gets the next element of the list
>> before waking and is therefore safe.
>
>
> Hi Michael and Byungchul,
>
> This is my fault to suggest Byungchul to change his correct patch into
> wrong one. But it's good to learn such an implicit race behind bcache
> code. I just post a patch to explain how this race may happen and
> corrupt the reverse list iteration. Could you please to review the fix ?
>
> And thanks to Michael again to catch this bug.
>
> --
> Coly Li
diff mbox

Patch

diff --git a/drivers/md/bcache/closure.c b/drivers/md/bcache/closure.c
index 864e673aec39..7d5286b05036 100644
--- a/drivers/md/bcache/closure.c
+++ b/drivers/md/bcache/closure.c
@@ -70,21 +70,10 @@  void __closure_wake_up(struct closure_waitlist *wait_list)
 	list = llist_del_all(&wait_list->list);
 
 	/* We first reverse the list to preserve FIFO ordering and fairness */
-
-	while (list) {
-		struct llist_node *t = list;
-		list = llist_next(list);
-
-		t->next = reverse;
-		reverse = t;
-	}
+	reverse = llist_reverse_order(list);
 
 	/* Then do the wakeups */
-
-	while (reverse) {
-		cl = container_of(reverse, struct closure, list);
-		reverse = llist_next(reverse);
-
+	llist_for_each_entry(cl, reverse, list) {
 		closure_set_waiting(cl, 0);
 		closure_sub(cl, CLOSURE_WAITING + 1);
 	}