Message ID | 20200605141629.15347-1-jack@suse.cz (mailing list archive) |
---|---|
State | New, archived |
Series | [1/3] bfq: Avoid false bfq queue merging |
> On 5 Jun 2020, at 16:16, Jan Kara <jack@suse.cz> wrote:
>
> bfq_setup_cooperator() uses bfqd->in_serv_last_pos to detect whether it
> makes sense to merge the current bfq queue with the in-service queue.
> However, if the in-service queue is freshly scheduled and didn't dispatch
> any requests yet, bfqd->in_serv_last_pos is stale and contains the value
> from the previously scheduled bfq queue, which can thus result in a bogus
> decision that the two queues should be merged.

Good catch!

> This bug can be observed for example with the following fio jobfile:
>
> [global]
> direct=0
> ioengine=sync
> invalidate=1
> size=1g
> rw=read
>
> [reader]
> numjobs=4
> directory=/mnt
>
> where the 4 processes will end up in one shared bfq queue although they
> do IO to physically very distant files (for some reason I was able to
> observe this only with the slice_idle=1ms setting).
>
> Fix the problem by invalidating bfqd->in_serv_last_pos when switching
> the in-service queue.

Apart from the nonexistent problem that even 0 is a valid LBA :)

Acked-by: Paolo Valente <paolo.valente@linaro.org>
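For readers not familiar with bfq's merge heuristic: the decision being acked above hinges on a sector-proximity test against bfqd->in_serv_last_pos. The following is a minimal, self-contained C sketch of that idea; the helper name and threshold are modelled loosely on bfq's bfq_rq_close_to_sector()/BFQQ_CLOSE_THR and are illustrative assumptions, not the kernel code. It shows how a position left over from the previously scheduled queue can make a request from an unrelated file look "close".

```c
/*
 * Simplified sketch of the proximity test bfq_setup_cooperator() relies on.
 * Names and the threshold value are assumptions modelled on bfq, not the
 * kernel implementation.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef uint64_t sector_t;

#define BFQQ_CLOSE_THR ((sector_t)(8 * 1024))	/* assumed closeness threshold */

/* True if a request position is "close" to the last dispatched position. */
static bool close_to_sector(sector_t rq_pos, sector_t last_pos)
{
	return (rq_pos > last_pos ? rq_pos - last_pos
				  : last_pos - rq_pos) <= BFQQ_CLOSE_THR;
}

int main(void)
{
	/*
	 * in_serv_last_pos still holds the position reached by the PREVIOUS
	 * in-service queue, because the fresh queue has not dispatched yet.
	 */
	sector_t stale_last_pos = 1000000;	/* left over from another queue */
	sector_t new_rq_pos = 1000100;		/* request from an unrelated file */

	/* The stale value makes the two queues look mergeable. */
	if (close_to_sector(new_rq_pos, stale_last_pos))
		printf("bogus merge: request looks close to a stale position\n");
	return 0;
}
```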
On Thu 11-06-20 09:13:07, Paolo Valente wrote:
>
>
> > On 5 Jun 2020, at 16:16, Jan Kara <jack@suse.cz> wrote:
> >
> > bfq_setup_cooperator() uses bfqd->in_serv_last_pos to detect whether it
> > makes sense to merge the current bfq queue with the in-service queue.
> > However, if the in-service queue is freshly scheduled and didn't dispatch
> > any requests yet, bfqd->in_serv_last_pos is stale and contains the value
> > from the previously scheduled bfq queue, which can thus result in a bogus
> > decision that the two queues should be merged.
>
> Good catch!
>
> [...]
>
> > Fix the problem by invalidating bfqd->in_serv_last_pos when switching
> > the in-service queue.
>
> Apart from the nonexistent problem that even 0 is a valid LBA :)

Yes, I was also thinking about that and decided 0 is "good enough" :). But
I can just as well switch to (sector_t)-1 if you think it would be better.

> Acked-by: Paolo Valente <paolo.valente@linaro.org>

Thanks!

Honza
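To make the 0-versus-(sector_t)-1 trade-off concrete, here is a small sketch reusing the same assumed helper and threshold as the previous example (again illustrative, not kernel code): with 0 as the sentinel, only a request within the threshold of LBA 0 could still spuriously pass the proximity test right after a queue switch, while (sector_t)-1 can never look close to any realistic request position.

```c
/* Comparison of the two invalidation sentinels discussed above; helper and
 * threshold are the same assumed ones as in the earlier sketch. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef uint64_t sector_t;
#define BFQQ_CLOSE_THR ((sector_t)(8 * 1024))

static bool close_to_sector(sector_t rq_pos, sector_t last_pos)
{
	return (rq_pos > last_pos ? rq_pos - last_pos
				  : last_pos - rq_pos) <= BFQQ_CLOSE_THR;
}

int main(void)
{
	sector_t rq_near_start = 512;	/* a request near LBA 0 */

	/* Sentinel 0 (what the patch uses): only sectors within the
	 * threshold of LBA 0 can still spuriously match. */
	printf("vs 0:            %d\n", close_to_sector(rq_near_start, 0));

	/* Sentinel (sector_t)-1: no realistic position can match. */
	printf("vs (sector_t)-1: %d\n",
	       close_to_sector(rq_near_start, (sector_t)-1));
	return 0;
}
```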
> On 11 Jun 2020, at 10:31, Jan Kara <jack@suse.cz> wrote:
>
> On Thu 11-06-20 09:13:07, Paolo Valente wrote:
>>
>> [...]
>>
>> Apart from the nonexistent problem that even 0 is a valid LBA :)
>
> Yes, I was also thinking about that and decided 0 is "good enough" :). But
> I can just as well switch to (sector_t)-1 if you think it would be better.
>

0 is ok :)

Thanks,
Paolo

>> Acked-by: Paolo Valente <paolo.valente@linaro.org>
>
> Thanks!
>
> Honza
>
> --
> Jan Kara <jack@suse.com>
> SUSE Labs, CR
> On 11 Jun 2020, at 16:12, Paolo Valente <paolo.valente@linaro.org> wrote:
>
>> On 11 Jun 2020, at 10:31, Jan Kara <jack@suse.cz> wrote:
>>
>> On Thu 11-06-20 09:13:07, Paolo Valente wrote:
>>>
>>> [...]
>>>
>>> Apart from the nonexistent problem that even 0 is a valid LBA :)
>>
>> Yes, I was also thinking about that and decided 0 is "good enough" :). But
>> I can just as well switch to (sector_t)-1 if you think it would be better.
>>
>
> 0 is ok :)
>

Hi Jan,
I've finally tested this patch of yours. No regression.

Once again:

Acked-by: Paolo Valente <paolo.valente@linaro.org>

Thanks,
Paolo

> Thanks,
> Paolo
>
>>> Acked-by: Paolo Valente <paolo.valente@linaro.org>
>>
>> Thanks!
>>
>> Honza
>>
>> --
>> Jan Kara <jack@suse.com>
>> SUSE Labs, CR
```diff
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 3d411716d7ee..50017275915f 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -2937,6 +2937,7 @@ static void __bfq_set_in_service_queue(struct bfq_data *bfqd,
 	}
 
 	bfqd->in_service_queue = bfqq;
+	bfqd->in_serv_last_pos = 0;
 }
 
 /*
```
bfq_setup_cooperator() uses bfqd->in_serv_last_pos to detect whether it
makes sense to merge the current bfq queue with the in-service queue.
However, if the in-service queue is freshly scheduled and didn't dispatch
any requests yet, bfqd->in_serv_last_pos is stale and contains the value
from the previously scheduled bfq queue, which can thus result in a bogus
decision that the two queues should be merged. This bug can be observed
for example with the following fio jobfile:

[global]
direct=0
ioengine=sync
invalidate=1
size=1g
rw=read

[reader]
numjobs=4
directory=/mnt

where the 4 processes will end up in one shared bfq queue although they
do IO to physically very distant files (for some reason I was able to
observe this only with the slice_idle=1ms setting).

Fix the problem by invalidating bfqd->in_serv_last_pos when switching the
in-service queue.

Fixes: 058fdecc6de7 ("block, bfq: fix in-service-queue check for queue merging")
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
---
 block/bfq-iosched.c | 1 +
 1 file changed, 1 insertion(+)
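As a recap of the lifecycle the one-line reset targets, below is a simplified, hypothetical C model. It is not the kernel implementation; it only assumes, as the commit message implies, that in_serv_last_pos is refreshed when the in-service queue dispatches a request, and all struct/function names here are illustrative. It shows why the value is meaningless for a freshly scheduled queue until its first dispatch, and where the reset from the patch slots in.

```c
/* Hypothetical model of the in_serv_last_pos lifecycle (illustrative only). */
#include <stddef.h>
#include <stdint.h>

typedef uint64_t sector_t;

struct bfq_queue { int dummy; };

struct bfq_data {
	struct bfq_queue *in_service_queue;
	sector_t in_serv_last_pos;
};

/* Called when a new queue becomes the in-service queue. */
static void set_in_service_queue(struct bfq_data *bfqd, struct bfq_queue *bfqq)
{
	bfqd->in_service_queue = bfqq;
	/* The fix: forget the position left over from the previous queue. */
	bfqd->in_serv_last_pos = 0;
}

/* Called when a request from the in-service queue is dispatched. */
static void on_dispatch(struct bfq_data *bfqd, sector_t rq_pos, sector_t rq_sectors)
{
	/* Only from this point does the value describe the current queue. */
	bfqd->in_serv_last_pos = rq_pos + rq_sectors;
}

int main(void)
{
	struct bfq_queue q1, q2;
	struct bfq_data bfqd = { NULL, 0 };

	set_in_service_queue(&bfqd, &q1);
	on_dispatch(&bfqd, 1000000, 8);		/* q1 works far into the disk */

	set_in_service_queue(&bfqd, &q2);	/* without the reset, 1000008 would
						 * still be used for q2's first
						 * merge decision */
	return bfqd.in_serv_last_pos == 0 ? 0 : 1;
}
```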