[0/5] writeback: Fix bandwidth estimates

Message ID 20210705161610.19406-1-jack@suse.cz (mailing list archive)

Jan Kara July 5, 2021, 4:23 p.m. UTC
Hello,

this patch series fixes the estimate of writeback throughput when the device
is not fully busy doing writeback. Michael Stapelberg has reported that such
a workload (e.g. generated by linking) tends to push the estimated throughput
down to 0, and as a result writeback on the device is practically stalled.

The first three patches fix the reported issue, the remaining two patches
are unrelated cleanups of problems I've noticed when reading the code.

Michael, can you try whether these patches fix the problems you've observed?
I cannot trigger the "bandwidth going down" spiral with them anymore.

								Honza

Michael Stapelberg July 9, 2021, 1:19 p.m. UTC | #1
Thanks for sending this patch series!

I have used the mmap.c reproducer as before, with the following parameters:
* mkdir /tmp/mnt
* fusermount -u /tmp/mnt; /root/fuse-2.9.9/example/fusexmp_fh -f /tmp/mnt
* dd if=/dev/urandom of=/tmp/was bs=1M count=99
* while :; do grep ^Bdi /sys/kernel/debug/bdi/0:44/stats; sleep 0.1; done
* while :; do time WORKAROUND=1 ~/mmap /tmp/was /tmp/mnt/tmp/stapelberg.1; sleep 5; done
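
For anyone repeating the monitoring step above, the following is a minimal
sketch of a helper that prints only the numeric BdiWriteBandwidth value
instead of every Bdi* line. The function name bdi_write_bandwidth is
hypothetical, and the sketch assumes the stats file contains a line of the
form "BdiWriteBandwidth: <number> kBps"; the exact layout may differ between
kernel versions.

```shell
# Print the numeric BdiWriteBandwidth value (kBps) found in bdi stats text
# read from stdin. Prints nothing if no such line is present.
# Assumption: the line looks like "BdiWriteBandwidth:   16000 kBps".
bdi_write_bandwidth() {
    awk '$1 == "BdiWriteBandwidth:" { print $2; exit }'
}

# Possible usage, with the same bdi path as in the list above (needs root
# and a mounted debugfs):
#   while :; do
#       bdi_write_bandwidth < /sys/kernel/debug/bdi/0:44/stats
#       sleep 0.1
#   done
```

Reading from stdin keeps the helper independent of the device path, so it can
also be run against a saved copy of the stats file.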

Previously, after a few iterations, the BdiWriteBandwidth measure
would gradually approach 0.

With your patch series applied, the BdiWriteBandwidth is updated much
more quickly, and converges to ≈16000 kBps.
When I start copying more quickly, the bandwidth measure rises quickly.

As far as I understand, this should fix the problem (provided 16000
kBps is an okay value).
Certainly, I don’t see the downward spiral either with your patches :)


Jan Kara July 12, 2021, 4:27 p.m. UTC | #2
On Fri 09-07-21 15:19:17, Michael Stapelberg wrote:
> Thanks for sending this patch series!
> 
> I have used the mmap.c reproducer as before, with the following parameters:
> * mkdir /tmp/mnt
> * fusermount -u /tmp/mnt; /root/fuse-2.9.9/example/fusexmp_fh -f /tmp/mnt
> * dd if=/dev/urandom of=/tmp/was bs=1M count=99
> * while :; do grep ^Bdi /sys/kernel/debug/bdi/0:44/stats; sleep 0.1; done
> * while :; do time WORKAROUND=1 ~/mmap /tmp/was /tmp/mnt/tmp/stapelberg.1; sleep 5; done
> 
> Previously, after a few iterations, the BdiWriteBandwidth measure
> would gradually approach 0.
> 
> With your patch series applied, the BdiWriteBandwidth is updated much
> more quickly, and converges to ≈16000 kBps.
> When I start copying more quickly, the bandwidth measure rises quickly.
> 
> As far as I understand, this should fix the problem (provided 16000
> kBps is an okay value).
> Certainly, I don’t see the downward spiral either with your patches :)

Thanks for testing! Can I add your Tested-by tag?

								Honza
Michael Stapelberg July 13, 2021, 8:15 a.m. UTC | #3
Yes! Thank you.
