[0/2,RFC] bfq: Waker logic tweaks for dbench performance

Message ID 20200409170915.30570-1-jack@suse.cz (mailing list archive)
Series bfq: Waker logic tweaks for dbench performance

Message

Jan Kara April 9, 2020, 5:09 p.m. UTC
Hello,

I was investigating why dbench performance (even for a single dbench client) with
BFQ is significantly worse than it used to be with CFQ. The culprit is the idling
logic in BFQ. The dbench workload is very fsync(2) heavy and in practice the time
to complete fsync(2) calls is what determines the overall performance. For
filesystems with a journal, such as xfs or ext4, it is common that fsync(2)
involves writing data from the process running fsync(2) - dbench in this case -
and then waiting for the journalling machinery to flush out metadata from a
separate process (the jbd2 process in the ext4 case, a worker thread in the xfs
case). CFQ's heuristic was able to determine that it isn't worth idling waiting
for either dbench or jbd2 IO. BFQ's heuristic cannot determine this, so the jbd2
process is often blocked waiting for the idle timer of the dbench queue to
expire.
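
To illustrate the workload, here is a minimal user-space sketch of the
write-then-fsync pattern described above (the file name, chunk size and
iteration count are arbitrary; this is just an illustration, not dbench
itself and not part of the series):

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
	char buf[4096];
	int fd = open("testfile", O_WRONLY | O_CREAT | O_TRUNC, 0644);

	if (fd < 0) {
		perror("open");
		return 1;
	}
	memset(buf, 'a', sizeof(buf));
	for (int i = 0; i < 1000; i++) {
		if (write(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf)) {
			perror("write");
			break;
		}
		/*
		 * fsync(2) completes only after the journal commit performed
		 * by the separate jbd2 process / worker thread finishes, so
		 * its latency depends on how quickly the IO scheduler
		 * dispatches that other process' IO.
		 */
		if (fsync(fd)) {
			perror("fsync");
			break;
		}
	}
	close(fd);
	return 0;
}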

The first patch in the series is an obvious bugfix but is not enough to improve
performance. The second patch does improve dbench performance from ~80 MB/s
to ~200 MB/s on my test machine, but I'm aware that it is probably way too
aggressive and that a different solution is needed. So I wrote that patch mainly
to see the results and spark some discussion :). Any idea how to improve the
waker logic so that dbench performance doesn't drop so dramatically?

								Honza

Comments

Paolo Valente April 17, 2020, 7:47 a.m. UTC | #1
Hi Jan,
I'm glad you're addressing these BFQ issues.  Sorry for the delay, but
I happen to be working on similar issues for other sorts of
corner-case workloads, so I wanted to consolidate my changes before
replying.

Probably the best first step for me to check your proposal is to
merge it with my current changes and test the outcome.  In this
respect, my problem is that, after our last improvements for dbench,
we cannot reproduce regressions any longer.  So we would need your
support to test both issues, i.e., to test throughput with dbench (on
your side/machines) and possible other regressions from your and my
changes (on our side/machines).

Would it be ok for you to participate in this little collaboration?
If it is, then I'll contact you privately to kick this off.

Thanks,
Paolo

Jan Kara April 17, 2020, 9:18 a.m. UTC | #2
Hi Paolo!

On Fri 17-04-20 09:47:07, Paolo Valente wrote:
> I'm glad you're addressing these BFQ issues.  Sorry for the delay, but
> I happen to be working on similar issues for other sorts of
> corner-case workloads, so I wanted to consolidate my changes before
> replying.
> 
> Probably the best first step for me to check your proposal is to
> merge it with my current changes and test the outcome.  In this
> respect, my problem is that, after our last improvements for dbench,
> we cannot reproduce regressions any longer.  So we would need your
> support to test both issues, i.e., to test throughput with dbench (on
> your side/machines) and possible other regressions from your and my
> changes (on our side/machines).
> 
> Would it be ok for you to participate in this little collaboration?
> If it is, then I'll contact you privately to kick this off.

Sure, I can test whatever patches you send me on our machines.

								Honza
