Message ID | 56fe58ac90ed99e5c8dd90a9d4f7bcbb730fd4ee.1454669823.git.berto@igalia.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Am 05.02.2016 um 11:59 hat Alberto Garcia geschrieben: > This patch adds support for burst periods to the throttling code. > With this feature the user can keep performing bursts as defined by > the LeakyBucket.max rate for a configurable period of time. > > Signed-off-by: Alberto Garcia <berto@igalia.com> > --- > include/qemu/throttle.h | 41 +++++++++++++++++++++++---- > util/throttle.c | 73 ++++++++++++++++++++++++++++++++++++++++--------- > 2 files changed, 96 insertions(+), 18 deletions(-) > > diff --git a/include/qemu/throttle.h b/include/qemu/throttle.h > index 8ec8951..63df690 100644 > --- a/include/qemu/throttle.h > +++ b/include/qemu/throttle.h > @@ -2,7 +2,7 @@ > * QEMU throttling infrastructure > * > * Copyright (C) Nodalink, EURL. 2013-2014 > - * Copyright (C) Igalia, S.L. 2015 > + * Copyright (C) Igalia, S.L. 2015-2016 > * > * Authors: > * Benoît Canet <benoit.canet@nodalink.com> > @@ -42,16 +42,47 @@ typedef enum { > } BucketType; > > /* > - * The max parameter of the leaky bucket throttling algorithm can be used to > - * allow the guest to do bursts. > - * The max value is a pool of I/O that the guest can use without being throttled > - * at all. Throttling is triggered once this pool is empty. > + * This module implements I/O limits using the leaky bucket > + * algorithm. The code is independent of the I/O units, but it is > + * currently used for bytes per second and operations per second. > + * > + * Three parameters can be set by the user: > + * > + * - avg: the desired I/O limits in units per second. > + * - max: the limit during bursts, also in units per second. > + * - burst_length: the maximum length of the burst period, in seconds. > + * > + * Here's how it works: > + * > + * - The bucket level (number of performed I/O units) is kept in > + * bkt.level and leaks at a rate of bkt.avg units per second. > + * > + * - The size of the bucket is bkt.max * bkt.burst_length. Once the > + * bucket is full no more I/O is performed until the bucket leaks > + * again. This is what makes the I/O rate bkt.avg. > + * > + * - The bkt.avg rate does not apply until the bucket is full, > + * allowing the user to do bursts until then. The I/O limit during > + * bursts is bkt.max. To enforce this limit we keep an additional > + * bucket in bkt.burst_length that leaks at a rate of bkt.max units > + * per second. > + * > + * - Because of all of the above, the user can perform I/O at a > + * maximum of bkt.max units per second for at most bkt.burst_length > + * seconds in a row. After that the bucket will be full and the I/O > + * rate will go down to bkt.avg. > + * > + * - Since the bucket always leaks at a rate of bkt.avg, this also > + * determines how much the user needs to wait before being able to > + * do bursts again. > */ Good summary, thanks! > typedef struct LeakyBucket { > double avg; /* average goal in units per second */ > double max; /* leaky bucket max burst in units */ > double level; /* bucket level in units */ > + double burst_level; /* bucket level in units (for computing bursts) */ > + unsigned burst_length; /* max length of the burst period, in seconds */ > } LeakyBucket; > > /* The following structure is used to configure a ThrottleState > diff --git a/util/throttle.c b/util/throttle.c > index 6a01cee..371c769 100644 > --- a/util/throttle.c > +++ b/util/throttle.c > @@ -41,6 +41,14 @@ void throttle_leak_bucket(LeakyBucket *bkt, int64_t delta_ns) > > /* make the bucket leak */ > bkt->level = MAX(bkt->level - leak, 0); > + > + /* if we allow bursts for more than one second we also need to > + * keep track of bkt->burst_level so the bkt->max goal per second > + * is attained */ > + if (bkt->burst_length > 1) { > + leak = (bkt->max * (double) delta_ns) / NANOSECONDS_PER_SECOND; > + bkt->burst_level = MAX(bkt->burst_level - leak, 0); > + } > } > > /* Calculate the time delta since last leak and make proportionals leaks > @@ -91,13 +99,24 @@ int64_t throttle_compute_wait(LeakyBucket *bkt) > return 0; > } > > - extra = bkt->level - bkt->max; > + /* If the bucket is full then we have to wait */ > + extra = bkt->level - bkt->max * bkt->burst_length; > + if (extra > 0) { > + return throttle_do_compute_wait(bkt->avg, extra); > + } > > - if (extra <= 0) { > - return 0; > + /* If the bucket is not full yet we have to make sure that we > + * fulfill the goal of bkt->max units per second. */ > + if (bkt->burst_length > 1) { > + /* We use 1/10 of the max value to smooth the throttling. > + * See throttle_fix_bucket() for more details. */ > + extra = bkt->burst_level - bkt->max / 10; I don't understand the connection between throttle_fix_bucket() and this. throttle_fix_bucket() seems to set a default rate for bursts, which kind of makes sense to me (but what's the point when this is lower than the average rate?) Here we work on a bkt->max that is either supplied by the user and should therefore be respected, or the default in throttle_fix_bucket() has already been applied. What this line does is letting the request wait for more than would be strictly necessary, or in other words, for the last second before the bucket runs full, you only allow a tenth of the actual maximum rate. I understand that having any burst at all helps, so the default that throttle_fix_bucket() sets used to make sense. I'm not so sure that it still makes sense with its max < avg setting (max used to be additional units on top of avg, now it's measured on its own). For the divison by 10 here, however, I'm still puzzled. What am I missing? > + if (extra > 0) { > + return throttle_do_compute_wait(bkt->max, extra); > + } > } > > - return throttle_do_compute_wait(bkt->avg, extra); > + return 0; > } Kevin
On Tue 16 Feb 2016 11:45:32 AM CET, Kevin Wolf <kwolf@redhat.com> wrote: >> + /* If the bucket is not full yet we have to make sure that we >> + * fulfill the goal of bkt->max units per second. */ >> + if (bkt->burst_length > 1) { >> + /* We use 1/10 of the max value to smooth the throttling. >> + * See throttle_fix_bucket() for more details. */ >> + extra = bkt->burst_level - bkt->max / 10; > > I don't understand the connection between throttle_fix_bucket() and > this. > > throttle_fix_bucket() seems to set a default rate for bursts, which > kind of makes sense to me (but what's the point when this is lower > than the average rate?) The point is to smoothen the throttling: if you request a max rate of 2000 operations per second, what you actually get is a rate of 200 operations per tenth of a second. With the current code, if you set bkg->avg to 1000 but not bkt->max you'll get exactly this. I'm just applying the same logic to bursts. Berto
diff --git a/include/qemu/throttle.h b/include/qemu/throttle.h index 8ec8951..63df690 100644 --- a/include/qemu/throttle.h +++ b/include/qemu/throttle.h @@ -2,7 +2,7 @@ * QEMU throttling infrastructure * * Copyright (C) Nodalink, EURL. 2013-2014 - * Copyright (C) Igalia, S.L. 2015 + * Copyright (C) Igalia, S.L. 2015-2016 * * Authors: * Benoît Canet <benoit.canet@nodalink.com> @@ -42,16 +42,47 @@ typedef enum { } BucketType; /* - * The max parameter of the leaky bucket throttling algorithm can be used to - * allow the guest to do bursts. - * The max value is a pool of I/O that the guest can use without being throttled - * at all. Throttling is triggered once this pool is empty. + * This module implements I/O limits using the leaky bucket + * algorithm. The code is independent of the I/O units, but it is + * currently used for bytes per second and operations per second. + * + * Three parameters can be set by the user: + * + * - avg: the desired I/O limits in units per second. + * - max: the limit during bursts, also in units per second. + * - burst_length: the maximum length of the burst period, in seconds. + * + * Here's how it works: + * + * - The bucket level (number of performed I/O units) is kept in + * bkt.level and leaks at a rate of bkt.avg units per second. + * + * - The size of the bucket is bkt.max * bkt.burst_length. Once the + * bucket is full no more I/O is performed until the bucket leaks + * again. This is what makes the I/O rate bkt.avg. + * + * - The bkt.avg rate does not apply until the bucket is full, + * allowing the user to do bursts until then. The I/O limit during + * bursts is bkt.max. To enforce this limit we keep an additional + * bucket in bkt.burst_length that leaks at a rate of bkt.max units + * per second. + * + * - Because of all of the above, the user can perform I/O at a + * maximum of bkt.max units per second for at most bkt.burst_length + * seconds in a row. After that the bucket will be full and the I/O + * rate will go down to bkt.avg. + * + * - Since the bucket always leaks at a rate of bkt.avg, this also + * determines how much the user needs to wait before being able to + * do bursts again. */ typedef struct LeakyBucket { double avg; /* average goal in units per second */ double max; /* leaky bucket max burst in units */ double level; /* bucket level in units */ + double burst_level; /* bucket level in units (for computing bursts) */ + unsigned burst_length; /* max length of the burst period, in seconds */ } LeakyBucket; /* The following structure is used to configure a ThrottleState diff --git a/util/throttle.c b/util/throttle.c index 6a01cee..371c769 100644 --- a/util/throttle.c +++ b/util/throttle.c @@ -41,6 +41,14 @@ void throttle_leak_bucket(LeakyBucket *bkt, int64_t delta_ns) /* make the bucket leak */ bkt->level = MAX(bkt->level - leak, 0); + + /* if we allow bursts for more than one second we also need to + * keep track of bkt->burst_level so the bkt->max goal per second + * is attained */ + if (bkt->burst_length > 1) { + leak = (bkt->max * (double) delta_ns) / NANOSECONDS_PER_SECOND; + bkt->burst_level = MAX(bkt->burst_level - leak, 0); + } } /* Calculate the time delta since last leak and make proportionals leaks @@ -91,13 +99,24 @@ int64_t throttle_compute_wait(LeakyBucket *bkt) return 0; } - extra = bkt->level - bkt->max; + /* If the bucket is full then we have to wait */ + extra = bkt->level - bkt->max * bkt->burst_length; + if (extra > 0) { + return throttle_do_compute_wait(bkt->avg, extra); + } - if (extra <= 0) { - return 0; + /* If the bucket is not full yet we have to make sure that we + * fulfill the goal of bkt->max units per second. */ + if (bkt->burst_length > 1) { + /* We use 1/10 of the max value to smooth the throttling. + * See throttle_fix_bucket() for more details. */ + extra = bkt->burst_level - bkt->max / 10; + if (extra > 0) { + return throttle_do_compute_wait(bkt->max, extra); + } } - return throttle_do_compute_wait(bkt->avg, extra); + return 0; } /* This function compute the time that must be waited while this IO @@ -177,7 +196,11 @@ void throttle_timers_attach_aio_context(ThrottleTimers *tt, */ void throttle_config_init(ThrottleConfig *cfg) { + unsigned i; memset(cfg, 0, sizeof(*cfg)); + for (i = 0; i < BUCKETS_COUNT; i++) { + cfg->buckets[i].burst_length = 1; + } } /* To be called first on the ThrottleState */ @@ -301,6 +324,16 @@ bool throttle_is_valid(ThrottleConfig *cfg, Error **errp) return false; } + if (!cfg->buckets[i].burst_length) { + error_setg(errp, "the burst length cannot be 0"); + return false; + } + + if (cfg->buckets[i].burst_length > 1 && !cfg->buckets[i].max) { + error_setg(errp, "burst length set without burst rate"); + return false; + } + if (cfg->buckets[i].max && !cfg->buckets[i].avg) { error_setg(errp, "bps_max/iops_max require corresponding" " bps/iops values"); @@ -317,7 +350,7 @@ static void throttle_fix_bucket(LeakyBucket *bkt) double min; /* zero bucket level */ - bkt->level = 0; + bkt->level = bkt->burst_level = 0; /* The following is done to cope with the Linux CFQ block scheduler * which regroup reads and writes by block of 100ms in the guest. @@ -420,22 +453,36 @@ bool throttle_schedule_timer(ThrottleState *ts, */ void throttle_account(ThrottleState *ts, bool is_write, uint64_t size) { + const BucketType bucket_types_size[2][2] = { + { THROTTLE_BPS_TOTAL, THROTTLE_BPS_READ }, + { THROTTLE_BPS_TOTAL, THROTTLE_BPS_WRITE } + }; + const BucketType bucket_types_units[2][2] = { + { THROTTLE_OPS_TOTAL, THROTTLE_OPS_READ }, + { THROTTLE_OPS_TOTAL, THROTTLE_OPS_WRITE } + }; double units = 1.0; + unsigned i; /* if cfg.op_size is defined and smaller than size we compute unit count */ if (ts->cfg.op_size && size > ts->cfg.op_size) { units = (double) size / ts->cfg.op_size; } - ts->cfg.buckets[THROTTLE_BPS_TOTAL].level += size; - ts->cfg.buckets[THROTTLE_OPS_TOTAL].level += units; + for (i = 0; i < 2; i++) { + LeakyBucket *bkt; - if (is_write) { - ts->cfg.buckets[THROTTLE_BPS_WRITE].level += size; - ts->cfg.buckets[THROTTLE_OPS_WRITE].level += units; - } else { - ts->cfg.buckets[THROTTLE_BPS_READ].level += size; - ts->cfg.buckets[THROTTLE_OPS_READ].level += units; + bkt = &ts->cfg.buckets[bucket_types_size[is_write][i]]; + bkt->level += size; + if (bkt->burst_length > 1) { + bkt->burst_level += size; + } + + bkt = &ts->cfg.buckets[bucket_types_units[is_write][i]]; + bkt->level += units; + if (bkt->burst_length > 1) { + bkt->burst_level += units; + } } }
This patch adds support for burst periods to the throttling code. With this feature the user can keep performing bursts as defined by the LeakyBucket.max rate for a configurable period of time. Signed-off-by: Alberto Garcia <berto@igalia.com> --- include/qemu/throttle.h | 41 +++++++++++++++++++++++---- util/throttle.c | 73 ++++++++++++++++++++++++++++++++++++++++--------- 2 files changed, 96 insertions(+), 18 deletions(-)