Message ID | 87wpazh3rl.fsf@notabene.neil.brown.name (mailing list archive) |
---|---|
State | New, archived |
On Wed, Apr 5, 2017 at 12:33 PM, NeilBrown <neilb@suse.com> wrote:
>
> When a filesystem is mounted from a loop device, writes are
> throttled by balance_dirty_pages() twice: once when writing
> to the filesystem and once when loop_handle_cmd() writes
> to the backing file. This double-throttling can trigger
> positive feedback loops that create significant delays. The
> throttling at the lower level is seen by the upper level as
> a slow device, so it throttles extra hard.
>
> The PF_LESS_THROTTLE flag was created to handle exactly this
> circumstance, though with an NFS filesystem mounted from a
> local NFS server. It reduces the throttling on the lower
> layer so that it can proceed largely unthrottled.
>
> To demonstrate this, create a filesystem on a loop device
> and write (e.g. with dd) several large files which combine
> to consume significantly more than the limit set by
> /proc/sys/vm/dirty_ratio or dirty_bytes. Measure the total
> time taken.
>
> When I do this directly on a device (no loop device) the
> total time for several runs (mkfs, mount, write 200 files,
> umount) is fairly stable: 28-35 seconds.
> When I do this over a loop device the times are much worse
> and less stable: 52-460 seconds, half below 100 seconds and
> half above.
> When I apply this patch, the times become stable again,
> though not as fast as the no-loop-back case: 53-72 seconds.
>
> There may be room for further improvement as the total overhead still
> seems too high, but this is a big improvement.
>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Acked-by: Michal Hocko <mhocko@suse.com>
> Signed-off-by: NeilBrown <neilb@suse.com>

Reviewed-by: Ming Lei <tom.leiming@gmail.com>

> ---
>
> I moved where the flag is set, thanks to a suggestion from
> Ming Lei.
> I've preserved the *-by: tags I was offered despite the code
> being different, as the concept is identical.
>
> Thanks,
> NeilBrown
>
>
>  drivers/block/loop.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> index 0ecb6461ed81..44b3506fd086 100644
> --- a/drivers/block/loop.c
> +++ b/drivers/block/loop.c
> @@ -852,6 +852,7 @@ static int loop_prepare_queue(struct loop_device *lo)
>  	if (IS_ERR(lo->worker_task))
>  		return -ENOMEM;
>  	set_user_nice(lo->worker_task, MIN_NICE);
> +	lo->worker_task->flags |= PF_LESS_THROTTLE;
>  	return 0;
>  }
>
> --
> 2.12.2
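For context on why the flag helps: balance_dirty_pages() computes per-task dirty-page thresholds, and PF_LESS_THROTTLE raises them for the flagged task, so the loop worker is throttled later and less aggressively than the writers above it. A condensed sketch of that check, paraphrased from domain_dirty_limits() in mm/page-writeback.c of this era (names and structure simplified, not a verbatim excerpt):

```c
#include <linux/sched.h>	/* struct task_struct, PF_LESS_THROTTLE */

/*
 * Condensed sketch of the boost applied in domain_dirty_limits():
 * a task carrying PF_LESS_THROTTLE (like a realtime task) gets
 * roughly 25% more dirty-page headroom before balance_dirty_pages()
 * throttles it, letting the loop worker keep draining dirty pages
 * to the backing file while ordinary writers are already blocked.
 */
static void dirty_limits_sketch(struct task_struct *tsk,
				unsigned long *thresh,
				unsigned long *bg_thresh)
{
	if (tsk->flags & PF_LESS_THROTTLE || rt_task(tsk)) {
		*bg_thresh += *bg_thresh / 4;
		*thresh += *thresh / 4;
	}
}
```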
On Wed 05-04-17 14:33:50, NeilBrown wrote:
> When a filesystem is mounted from a loop device, writes are
> throttled by balance_dirty_pages() twice: once when writing
> to the filesystem and once when loop_handle_cmd() writes
> to the backing file. This double-throttling can trigger
> positive feedback loops that create significant delays. The
> throttling at the lower level is seen by the upper level as
> a slow device, so it throttles extra hard.
>
> The PF_LESS_THROTTLE flag was created to handle exactly this
> circumstance, though with an NFS filesystem mounted from a
> local NFS server. It reduces the throttling on the lower
> layer so that it can proceed largely unthrottled.
>
> To demonstrate this, create a filesystem on a loop device
> and write (e.g. with dd) several large files which combine
> to consume significantly more than the limit set by
> /proc/sys/vm/dirty_ratio or dirty_bytes. Measure the total
> time taken.
>
> When I do this directly on a device (no loop device) the
> total time for several runs (mkfs, mount, write 200 files,
> umount) is fairly stable: 28-35 seconds.
> When I do this over a loop device the times are much worse
> and less stable: 52-460 seconds, half below 100 seconds and
> half above.
> When I apply this patch, the times become stable again,
> though not as fast as the no-loop-back case: 53-72 seconds.
>
> There may be room for further improvement as the total overhead still
> seems too high, but this is a big improvement.
>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Acked-by: Michal Hocko <mhocko@suse.com>
> Signed-off-by: NeilBrown <neilb@suse.com>
> ---
>
> I moved where the flag is set, thanks to a suggestion from
> Ming Lei.
> I've preserved the *-by: tags I was offered despite the code
> being different, as the concept is identical.
>
> Thanks,
> NeilBrown
>
>
>  drivers/block/loop.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> index 0ecb6461ed81..44b3506fd086 100644
> --- a/drivers/block/loop.c
> +++ b/drivers/block/loop.c
> @@ -852,6 +852,7 @@ static int loop_prepare_queue(struct loop_device *lo)
>  	if (IS_ERR(lo->worker_task))
>  		return -ENOMEM;
>  	set_user_nice(lo->worker_task, MIN_NICE);
> +	lo->worker_task->flags |= PF_LESS_THROTTLE;
>  	return 0;

As mentioned elsewhere, PF flags should be updated only on the current
task, otherwise there is a potential rmw race. Is this safe? The code
runs concurrently with the worker thread.
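To make the concern concrete: task->flags is a plain word modified with ordinary, non-atomic read-modify-write sequences, which is only safe when each task writes its own flags. A hypothetical interleaving (not code from the patch; PF_MEMALLOC_NOIO stands in for any flag the worker might set on itself) showing how a remote update can lose a concurrent update made by the worker:

```c
/*
 * Hypothetical interleaving illustrating the lost-update hazard when
 * one task modifies another task's ->flags.  Neither "|=" below is
 * atomic; each is a separate load, OR, and store.
 *
 *   loop_prepare_queue()                   worker thread
 *   (writes lo->worker_task->flags)        (writes its own current->flags)
 *   -------------------------------        -------------------------------
 *   old = task->flags;
 *                                          old2 = current->flags;
 *                                          current->flags = old2 | PF_MEMALLOC_NOIO;
 *   task->flags = old | PF_LESS_THROTTLE;
 *                                          ... PF_MEMALLOC_NOIO has now been lost ...
 */
```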
```diff
diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 0ecb6461ed81..44b3506fd086 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -852,6 +852,7 @@ static int loop_prepare_queue(struct loop_device *lo)
 	if (IS_ERR(lo->worker_task))
 		return -ENOMEM;
 	set_user_nice(lo->worker_task, MIN_NICE);
+	lo->worker_task->flags |= PF_LESS_THROTTLE;
 	return 0;
 }
```
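One race-free variant of the same idea is to set the flag from inside the worker thread itself, so ->flags is only ever modified by current. A sketch of that approach against the loop.c of this era (the wrapper name is illustrative):

```c
/*
 * Sketch: set PF_LESS_THROTTLE from the worker thread itself, so the
 * flag update is a current-task-only operation and the rmw race noted
 * above cannot occur.  The wrapper runs as the kthread's entry point
 * before handing control to the normal kthread worker loop.
 */
static int loop_kthread_worker_fn(void *worker_ptr)
{
	current->flags |= PF_LESS_THROTTLE;
	return kthread_worker_fn(worker_ptr);
}

static int loop_prepare_queue(struct loop_device *lo)
{
	kthread_init_worker(&lo->worker);
	lo->worker_task = kthread_run(loop_kthread_worker_fn,
			&lo->worker, "loop%d", lo->lo_number);
	if (IS_ERR(lo->worker_task))
		return -ENOMEM;
	set_user_nice(lo->worker_task, MIN_NICE);
	return 0;
}
```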