Message ID | 1504603957-5389-16-git-send-email-yi.y.sun@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
On Tue, Sep 05, 2017 at 05:32:37PM +0800, Yi Sun wrote:
> This patch adds MBA description in related documents.
>
> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
> Acked-by: Wei Liu <wei.liu2@citrix.com>
> ---
> v2:
> - state the value type shown by 'psr-mba-show'. For linear mode,
> it shows decimal value. For non-linear mode, it shows hexadecimal
> value.
> (suggested by Chao Peng)
> ---
> docs/man/xl.pod.1.in | 34 +++++++++++++++++++++++++
> docs/misc/xl-psr.markdown | 63 +++++++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 97 insertions(+)
>
> diff --git a/docs/man/xl.pod.1.in b/docs/man/xl.pod.1.in
> index 16c8306..e644b19 100644
> --- a/docs/man/xl.pod.1.in
> +++ b/docs/man/xl.pod.1.in
> @@ -1798,6 +1798,40 @@ processed.
>
> =back
>
> +=head2 Memory Bandwidth Allocation
> +
> +Intel Skylake and later server platforms offer capabilities to configure and
> +make use of the Memory Bandwidth Allocation (MBA) mechanisms, which provides
> +OS/VMMs the ability to slow misbehaving apps/VMs or create advanced closed-loop

I don't get the 'closed-loop' thing again, but that might just be me
since I'm not a native speaker.

> +control system via exposing control over a credit-based throttling mechanism.
> +In the Xen implementation, MBA is used to control memory bandwidth on VM basis.
> +To enforce bandwidth on a specific domain, just set throttling value (THRTL)
> +for the domain.
> +
> +=over 4
> +
> +=item B<psr-mba-set> [I<OPTIONS>] I<domain-id> I<thrtl>
> +
> +Set throttling value (THRTL) for a domain. For how to specify I<thrtl>
> +please refer to L<http://xenbits.xen.org/docs/unstable/misc/xl-psr.html>.
> +
> +B<OPTIONS>
> +
> +=over 4
> +
> +=item B<-s SOCKET>, B<--socket=SOCKET>
> +
> +Specify the socket to process, otherwise all sockets are processed.
> +
> +=back
> +
> +=item B<psr-mba-show> [I<domain-id>]
> +
> +Show MBA settings for a certain domain or all domains. For linear mode, it
> +shows the decimal value. For non-linear mode, it shows hexadecimal value.
> +
> +=back
> +
> =head1 IGNORED FOR COMPATIBILITY WITH XM
>
> xl is mostly command-line compatible with the old xm utility used with
> diff --git a/docs/misc/xl-psr.markdown b/docs/misc/xl-psr.markdown
> index 04dd957..39fc801 100644
> --- a/docs/misc/xl-psr.markdown
> +++ b/docs/misc/xl-psr.markdown
> @@ -186,6 +186,69 @@ Setting data CBM for a domain:
> Setting the same code and data CBM for a domain:
> `xl psr-cat-set <domid> <cbm>`
>
> +## Memory Bandwidth Allocation (MBA)
> +
> +Memory Bandwidth Allocation (MBA) is a new feature available on Intel
> +Skylake and later server platforms that allows an OS or Hypervisor/VMM to
> +slow misbehaving apps/VMs or create advanced closed-loop control system via
> +exposing control over a credit-based throttling mechanism. To enforce bandwidth
> +on a specific domain, just set throttling value (THRTL) into Class of Service
> +(COS). MBA provides two THRTL mode. One is linear mode and the other is
> +non-linear mode.
> +
> +In the linear mode the input precision is defined as 100-(THRTL_MAX). Values
> +not an even multiple of the precision (e.g., 12%) will be rounded down (e.g.,
> +to 10% delay applied).
                ^ s/applied/by the hardware/

Thanks, Roger.
On 17-09-19 12:37:24, Roger Pau Monné wrote:
> On Tue, Sep 05, 2017 at 05:32:37PM +0800, Yi Sun wrote:
> > +Intel Skylake and later server platforms offer capabilities to configure and
> > +make use of the Memory Bandwidth Allocation (MBA) mechanisms, which provides
> > +OS/VMMs the ability to slow misbehaving apps/VMs or create advanced closed-loop
>
> I don't get the 'closed-loop' thing again, but that might just be me
> since I'm not a native speaker.
>
Will modify this to be same as feature doc.

[...]

> > +In the linear mode the input precision is defined as 100-(THRTL_MAX). Values
> > +not an even multiple of the precision (e.g., 12%) will be rounded down (e.g.,
> > +to 10% delay applied).
> ^ s/applied/by the hardware/
>
Thanks!

> Thanks, Roger.
On Tue, 2017-09-19 at 12:37 +0100, Roger Pau Monné wrote:
> On Tue, Sep 05, 2017 at 05:32:37PM +0800, Yi Sun wrote:
> >
> > --- a/docs/man/xl.pod.1.in
> > +++ b/docs/man/xl.pod.1.in
> > @@ -1798,6 +1798,40 @@ processed.
> >
> > =back
> >
> > +=head2 Memory Bandwidth Allocation
> > +
> > +Intel Skylake and later server platforms offer capabilities to configure and
> > +make use of the Memory Bandwidth Allocation (MBA) mechanisms, which provides
> > +OS/VMMs the ability to slow misbehaving apps/VMs or create advanced closed-loop
>
> I don't get the 'closed-loop' thing again, but that might just be me
> since I'm not a native speaker.
>
> > +control system via exposing control over a credit-based throttling mechanism.
>
It goes together with 'control system'. In fact, 'closed-loop control
system' is a concept from control theory (or system automation, or
system theory... I've heard it called all of these).

It's when you want to control a system, or a process, and you do it by
enclosing it in a "loop" in such a way that the (n+1)-th input to the
process is influenced by the n-th output of the process itself. It's
also called a 'feedback loop' or 'feedback-based control system'.

Basically, you usually read/measure/sense the n-th output of the
process, you compare it with some 'desired' value, and you use, as the
process' (n+1)-th input, some indication of how different the measured
value was from the desired value.

http://www.electronics-tutorials.ws/systems/closed-loop-system.html

Alternatively, you have 'open-loop control systems', where there is no
sensing of the output, and no feedback mechanism that would correct the
input according to how things are actually going (i.e., some would say
there is no control!).

http://www.electronics-tutorials.ws/systems/open-loop-system.html

*I guess* what this means, in this context, is that, with both MBA and
MBM, you can build a piece of software that, given a desired memory
bandwidth usage for a certain domain, sets MBA accordingly, then
monitors what the domain is actually getting, and uses the difference
between that and the desired value to drive the new value to be set,
using MBA again. Like, if it's getting less, give it _some_ more; if
it's getting more, give it _some_ less (where both the _some_-s are
coefficients).

Ideally, after initial spikes and fluctuations (which depend on the
coefficients, and on which one can do math, still using control theory
concepts), happening, e.g., when the workload inside the VM changes,
the bandwidth utilization will settle at the desired point.

All that being said, I'd say that either more details are given (or a
link is put here, pointing to a whitepaper or in general a place where
a full description of the solution can be found), or it's probably
better to drop the 'closed-loop' reference, and explain how MBA can be
useful in another way.

Regards,
Dario
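
The feedback loop described in the message above could be sketched roughly as follows. This is only an illustration under stated assumptions, not something taken from the patch or from Xen: `control_step`, `run_loop` and `ToyDomain` are hypothetical names, the gain is an arbitrary coefficient, and the toy model simply assumes that bandwidth falls linearly as THRTL rises. In a real setup the two callbacks would have to wrap an MBM reading and an MBA setting such as `xl psr-mba-set`.

```python
from typing import Callable


def control_step(desired_bw: float, measured_bw: float,
                 thrtl: int, gain: float, thrtl_max: int) -> int:
    """Return the next THRTL from one proportional feedback step."""
    # Positive error: the domain uses more bandwidth than desired.
    error = measured_bw - desired_bw
    # THRTL is a delay, so a positive error calls for more throttling.
    candidate = thrtl + gain * error
    # Clamp to the range 0..thrtl_max.
    return int(max(0, min(thrtl_max, round(candidate))))


def run_loop(desired_bw: float,
             read_bw: Callable[[], float],
             set_thrtl: Callable[[int], None],
             gain: float = 0.05, thrtl_max: int = 90,
             steps: int = 20) -> int:
    """Measure, compare with the target, adjust; repeat `steps` times."""
    thrtl = 0  # start with no throttling
    set_thrtl(thrtl)
    for _ in range(steps):
        thrtl = control_step(desired_bw, read_bw(), thrtl, gain, thrtl_max)
        set_thrtl(thrtl)
    return thrtl


class ToyDomain:
    """Hypothetical stand-in for a domain (assumed linear response)."""

    def __init__(self, peak_bw: float = 1000.0) -> None:
        self.peak_bw = peak_bw
        self.thrtl = 0

    def set_thrtl(self, thrtl: int) -> None:
        self.thrtl = thrtl

    def read_bw(self) -> float:
        # Assumption: a THRTL of N removes N percent of the peak bandwidth.
        return self.peak_bw * (100 - self.thrtl) / 100


domain = ToyDomain()
final = run_loop(500.0, domain.read_bw, domain.set_thrtl)
print(final)
```

With this toy model the value moves towards the THRTL at which the measured bandwidth matches the target. A larger gain reacts faster but can overshoot and oscillate, which is where the control theory math mentioned above comes in.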
diff --git a/docs/man/xl.pod.1.in b/docs/man/xl.pod.1.in
index 16c8306..e644b19 100644
--- a/docs/man/xl.pod.1.in
+++ b/docs/man/xl.pod.1.in
@@ -1798,6 +1798,40 @@ processed.
 
 =back
 
+=head2 Memory Bandwidth Allocation
+
+Intel Skylake and later server platforms offer capabilities to configure and
+make use of the Memory Bandwidth Allocation (MBA) mechanisms, which provides
+OS/VMMs the ability to slow misbehaving apps/VMs or create advanced closed-loop
+control system via exposing control over a credit-based throttling mechanism.
+In the Xen implementation, MBA is used to control memory bandwidth on VM basis.
+To enforce bandwidth on a specific domain, just set throttling value (THRTL)
+for the domain.
+
+=over 4
+
+=item B<psr-mba-set> [I<OPTIONS>] I<domain-id> I<thrtl>
+
+Set throttling value (THRTL) for a domain. For how to specify I<thrtl>
+please refer to L<http://xenbits.xen.org/docs/unstable/misc/xl-psr.html>.
+
+B<OPTIONS>
+
+=over 4
+
+=item B<-s SOCKET>, B<--socket=SOCKET>
+
+Specify the socket to process, otherwise all sockets are processed.
+
+=back
+
+=item B<psr-mba-show> [I<domain-id>]
+
+Show MBA settings for a certain domain or all domains. For linear mode, it
+shows the decimal value. For non-linear mode, it shows hexadecimal value.
+
+=back
+
 =head1 IGNORED FOR COMPATIBILITY WITH XM
 
 xl is mostly command-line compatible with the old xm utility used with
diff --git a/docs/misc/xl-psr.markdown b/docs/misc/xl-psr.markdown
index 04dd957..39fc801 100644
--- a/docs/misc/xl-psr.markdown
+++ b/docs/misc/xl-psr.markdown
@@ -186,6 +186,69 @@ Setting data CBM for a domain:
 Setting the same code and data CBM for a domain:
 `xl psr-cat-set <domid> <cbm>`
 
+## Memory Bandwidth Allocation (MBA)
+
+Memory Bandwidth Allocation (MBA) is a new feature available on Intel
+Skylake and later server platforms that allows an OS or Hypervisor/VMM to
+slow misbehaving apps/VMs or create advanced closed-loop control system via
+exposing control over a credit-based throttling mechanism. To enforce bandwidth
+on a specific domain, just set throttling value (THRTL) into Class of Service
+(COS). MBA provides two THRTL mode. One is linear mode and the other is
+non-linear mode.
+
+In the linear mode the input precision is defined as 100-(THRTL_MAX). Values
+not an even multiple of the precision (e.g., 12%) will be rounded down (e.g.,
+to 10% delay applied).
+
+If linear values are not supported then input delay values are powers-of-two
+from zero to the THRTL_MAX value from CPUID. In this case any values not a power
+of two will be rounded down the next nearest power of two.
+
+For example, assuming a system with 2 domains:
+
+ * A THRTL of 0x0 for every domain means each domain can access the whole cache
+   without any delay. This is the default.
+
+ * Linear mode: Giving one domain a THRTL of 0xC and the other domain's 0 means
+   that the first domain gets 10% delay to access the cache and the other one
+   without any delay.
+
+ * Non-linear mode: Giving one domain a THRTL of 0xC and the other domain's 0
+   means that the first domain gets 8% delay to access the cache and the other
+   one without any delay.
+
+For more detailed information please refer to Intel SDM chapter
+"Introduction to Memory Bandwidth Allocation".
+
+In Xen's implementation, THRTL can be configured with libxl/xl interfaces but
+COS is maintained in hypervisor only. The cache partition granularity is per
+domain, each domain has COS=0 assigned by default, the corresponding THRTL is
+0, which means all the cache resource can be accessed without delay.
+
+### xl interfaces
+
+System MBA information such as maximum COS and maximum THRTL can be obtained by:
+
+`xl psr-hwinfo --mba`
+
+The simplest way to change a domain's THRTL from its default is running:
+
+`xl psr-mba-set [OPTIONS] <domid> <thrtl>`
+
+In a multi-socket system, the same thrtl will be set on each socket by default.
+Per socket thrtl can be specified with the `--socket SOCKET` option.
+
+Setting the THRTL may not be successful if insufficient COS is available. In
+such case unused COS(es) may be freed by setting THRTL of all related domains to
+its default value(0).
+
+Per domain THRTL settings can be shown by:
+
+`xl psr-mba-show [OPTIONS] <domid>`
+
+For linear mode, it shows the decimal value. For non-linear mode, it shows
+hexadecimal value.
+
 ## Reference
 
 [1] Intel SDM
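
The two rounding rules in the xl-psr.markdown hunk above (linear and non-linear THRTL modes) can be checked with a small sketch. The function names are hypothetical and the THRTL_MAX values used in the example calls (90 and 64) are assumptions chosen to reproduce the 0xC examples from the patch text; real values come from CPUID, and the actual rounding is done by the hardware, not by software like this.

```python
def round_linear(thrtl: int, thrtl_max: int) -> int:
    """Round a requested delay down to a multiple of the precision.

    The precision is 100 - THRTL_MAX, as stated in the patch text.
    """
    if not 0 < thrtl_max < 100:
        raise ValueError("thrtl_max must be between 1 and 99")
    if not 0 <= thrtl <= thrtl_max:
        raise ValueError("thrtl out of range")
    precision = 100 - thrtl_max
    return thrtl - thrtl % precision


def round_nonlinear(thrtl: int, thrtl_max: int) -> int:
    """Round a requested delay down to a power of two (zero stays zero)."""
    if not 0 <= thrtl <= thrtl_max:
        raise ValueError("thrtl out of range")
    if thrtl == 0:
        return 0
    # The highest set bit gives the largest power of two <= thrtl.
    return 1 << (thrtl.bit_length() - 1)


# The 0xC example from the patch, under the assumed THRTL_MAX values.
print(round_linear(0xC, 90))     # 10
print(round_nonlinear(0xC, 64))  # 8
```

Both results match the patch's examples: a THRTL of 0xC becomes a 10% delay in linear mode and an 8% delay in non-linear mode.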