[RESEND,v5,01/24] docs: create L2 Cache Allocation Technology (CAT) feature document

Message ID 1484805686-7249-2-git-send-email-yi.y.sun@linux.intel.com (mailing list archive)
State New, archived
Commit Message

Yi Sun Jan. 19, 2017, 6:01 a.m. UTC
This patch creates the L2 CAT feature document in docs/features/.
It describes the details of L2 CAT.

Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
---
 docs/features/intel_psr_l2_cat.pandoc | 347 ++++++++++++++++++++++++++++++++++
 1 file changed, 347 insertions(+)
 create mode 100644 docs/features/intel_psr_l2_cat.pandoc

Comments

Tian, Kevin Jan. 20, 2017, 9:39 a.m. UTC | #1
> From: Yi Sun

> Sent: Thursday, January 19, 2017 2:01 PM

> 

> This patch creates L2 CAT feature document in doc/features/.

> It describes details of L2 CAT.


A good write-up, but some improvements are still required. :-)

> 

> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>

> ---

>  docs/features/intel_psr_l2_cat.pandoc | 347 ++++++++++++++++++++++++++++++++++

>  1 file changed, 347 insertions(+)

>  create mode 100644 docs/features/intel_psr_l2_cat.pandoc

> 

> diff --git a/docs/features/intel_psr_l2_cat.pandoc b/docs/features/intel_psr_l2_cat.pandoc

> new file mode 100644

> index 0000000..77bd61f

> --- /dev/null

> +++ b/docs/features/intel_psr_l2_cat.pandoc

> @@ -0,0 +1,347 @@

> +% Intel L2 Cache Allocation Technology (L2 CAT) Feature

> +% Revision 1.0

> +

> +\clearpage

> +

> +# Basics

> +

> +---------------- ----------------------------------------------------

> +         Status: **Tech Preview**

> +

> +Architecture(s): Intel x86

> +

> +   Component(s): Hypervisor, toolstack

> +

> +       Hardware: Atom codename Goldmont and beyond CPUs

> +---------------- ----------------------------------------------------

> +

> +# Overview

> +

> +L2 CAT allows an OS or Hypervisor/VMM to control allocation of a

> +CPU's shared L2 cache based on application priority or Class of Service

> +(COS). Each CLOS is configured using capacity bitmasks (CBM) which

> +represent cache capacity and indicate the degree of overlap and

> +isolation between classes. Once L2 CAT is configured, the processor

> +allows access to portions of L2 cache according to the established

> +class of service.


I would suggest making this doc cover all CAT features; otherwise some
content looks incomplete when you talk about adding new options or new
ranges on the assumption that the reader understands the existing stuff.
Yes, I do notice you provide a lot of background about L3 CAT/CDP. Let's
just make it complete.

Also, in a permanent design document 'new' is not preferred, since it will
become 'existing' once it's checked in.

> +

> +## Terminology

> +

> +* CAT         Cache Allocation Technology

> +* CBM         Capacity BitMasks

> +* CDP         Code and Data Prioritization

> +* COS/CLOS    Class of Service

> +* MSRs        Machine Specific Registers

> +* PSR         Intel Platform Shared Resource

> +* VMM         Virtual Machine Monitor

> +

> +# User details

> +

> +* Feature Enabling:

> +

> +  Add "psr=cat" to boot line parameter to enable all supported level CAT

> +  features.

> +

> +* xl interfaces:

> +

> +  1. `psr-cat-show [OPTIONS] domain-id`:


This needs an introduction of 'domain-id'. Is it a Xen domain ID, or a
new hardware-defined domain ID (e.g. one socket per domain)?

> +

> +     Show domain L2 or L3 CAT CBM.


Also, please introduce how CBM actually works.

> +

> +     New option `-l` is added.

> +     `-l2`: Show cbm for L2 cache.

> +     `-l3`: Show cbm for L3 cache.

> +

> +     If neither `-l2` nor `-l3` is given, show both of them. If any one

> +     is not supported, will print error info.

> +

> +  2. `psr-cat-cbm-set [OPTIONS] domain-id cbm`:

> +

> +     Set domain L2 or L3 CBM.


To be consistent with 'show': 'CBM' or 'CAT CBM'?

I understand those xl interfaces are there already. If there is a chance
to change them, I'd recommend removing 'cbm' and using just psr-cat-set,
to be consistent with 'show'.

> +

> +     New option `-l` is added.

> +     `-l2`: Specify cbm for L2 cache.

> +     `-l3`: Specify cbm for L3 cache.

> +

> +     If neither `-l2` nor `-l3` is given, level 3 is the default option.

> +

> +  3. `psr-hwinfo [OPTIONS]`:

> +

> +     Show L2 & L3 CAT HW informations on every socket.


informations -> information

And which HW information? Please elaborate.

And where is the interface for setting the COS of a VCPU?

> +

> +# Technical details

> +

> +L2 CAT is a member of Intel PSR features and part of CAT, it shares

> +some base PSR infrastructure in Xen.


Then adding some background on PSR would be useful, if nothing exists yet.

> +

> +## Hardware perspective

> +

> +L2 CAT defines a new range MSRs to assign different L2 cache access

> +patterns which are known as CBMs, each CBM is associated with a COS.

> +

> +```

> +

> +                        +----------------------------+----------------+

> +   IA32_PQR_ASSOC       | MSR (per socket)           |    Address     |

> + +----+---+-------+     +----------------------------+----------------+

> + |    |COS|       |     | IA32_L2_QOS_MASK_0         |     0xD10      |

> + +----+---+-------+     +----------------------------+----------------+

> +        └-------------> | ...                        |  ...           |

> +                        +----------------------------+----------------+

> +                        | IA32_L2_QOS_MASK_n         | 0xD10+n (n<64) |

> +                        +----------------------------+----------------+

> +```

> +

> +When context switch happens, the COS of VCPU is written to per-thread

> +MSR `IA32_PQR_ASSOC`, and then hardware enforces L2 cache allocation

> +according to the corresponding CBM.

> +

> +## The relationship between L2 CAT and L3 CAT/CDP

> +

> +L2 CAT is independent of L3 CAT/CDP, which means L2 CAT would be enabled

> +while L3 CAT/CDP is disabled, or L2 CAT and L3 CAT/CDP are all enabled.

> +

> +L2 CAT uses a new range CBMs from 0xD10 ~ 0xD10+n (n<64), following by

> +the L3 CAT/CDP CBMs, and supports setting different L2 cache accessing

> +patterns from L3 cache. 


"L2 CAT supports ..." or "L3 CAT supports..." for the last sentence? 

> Like L3 CAT/CDP requirement, the bits of CBM of

> +L2 CAT must be continuous too.

> +

> +N.B. L2 CAT and L3 CAT/CDP share the same COS field in the same

> +associate register `IA32_PQR_ASSOC`, which means one COS associates to a

> +pair of L2 CBM and L3 CBM.

> +

> +Besides, the max COS of L2 CAT may be different from L3 CAT/CDP (or

> +other PSR features in future). In some cases, a VM is permitted to have a

> +COS that is beyond one (or more) of PSR features but within the others.

> +For instance, let's assume the max COS of L2 CAT is 8 but the max COS of

> +L3 CAT is 16, when a VM is assigned 9 as COS, the L3 CBM associated to

> +COS 9 would be enforced, but for L2 CAT, the behavior is fully open (no

> +limit) since COS 9 is beyond the max COS (8) of L2 CAT.


Does user space need to know about this difference, since L2 CAT may not
be effective for this vCPU?

> +

> +## Design Overview

> +

> +* Core COS/CBM association

> +

> +  When enforcing L2 CAT, all cores of domains have the same default

> +  COS (COS0) which associated to the fully open CBM (all ones bitmask)


Please define 'open' when introducing CBM.

> +  to access all L2 cache. The default COS is used only in hypervisor

> +  and is transparent to tool stack and user.

> +

> +  System administrator can change PSR allocation policy at runtime by

> +  tool stack. Since L2 CAT share COS with L3 CAT/CDP, a COS corresponds

> +  to a 2-tuple, like [L2 CBM, L3 CBM] with only-CAT enabled, when CDP

> +  is enabled, one COS corresponds to a 3-tuple, like [L2 CBM,

> +  L3 Code_CBM, L3 Data_CBM]. If neither L3 CAT nor L3 CDP is enabled,

> +  things would be easier, one COS corresponds to one L2 CBM.

> +

> +* VCPU schedule

> +

> +  This part reuses L3 CAT COS infrastructure.


Please elaborate then.

> +

> +* Multi-sockets

> +

> +  Different sockets may have different L2 CAT capability (e.g. max COS)

> +  although it is consistent on the same socket. So the capability of

> +  per-socket L2 CAT is specified.


The VCPU scheduling design is important here. E.g. when migrating a VCPU
from socket 1 to socket 2 (socket 1 has a max COS of 16 while socket 2
has 8), will you prevent the migration if the VCPU has COS 9?

> +

> +## Implementation Description

> +

> +* Hypervisor interfaces:

> +

> +  1. Boot line parameter "psr=cat" now will enable L2 CAT and L3

> +     CAT if hardware supported.

> +

> +  2. SYSCTL:

> +          - XEN_SYSCTL_PSR_CAT_get_l2_info: Get L2 CAT information.

> +

> +  3. DOMCTL:

> +          - XEN_DOMCTL_PSR_CAT_OP_GET_L2_CBM: Get L2 CBM for a domain.

> +          - XEN_DOMCTL_PSR_CAT_OP_SET_L2_CBM: Set L2 CBM for a domain.

> +

> +* xl interfaces:

> +

> +  1. psr-cat-show -l2 domain-id

> +          Show L2 cbm for a domain.

> +          => XEN_SYSCTL_PSR_CAT_get_l2_info /

> +             XEN_DOMCTL_PSR_CAT_OP_GET_L2_CBM

> +

> +  2. psr-mba-set -l2 domain-id cbm

> +          Set L2 cbm for a domain.

> +          => XEN_DOMCTL_PSR_CAT_OP_SET_L2_CBM

> +

> +  3. psr-hwinfo

> +          Show PSR HW information, including L2 CAT

> +          => XEN_SYSCTL_PSR_CAT_get_l2_info


Still no interface for setting COS.

> +

> +* Key data structure:

> +

> +   1. Feature HW info

> +

> +      ```

> +      struct psr_cat_hw_info {

> +          unsigned int cbm_len;

> +          unsigned int cos_max;

> +      };

> +      ```

> +

> +      - Member `cbm_len`

> +

> +        `cbm_len` is one of the hardware info of CAT.

> +

> +      - Member `cos_max`

> +

> +        `cos_max` is one of the hardware info of CAT.

> +

> +   2. Feature list node

> +

> +      ```

> +      struct feat_node {

> +          enum psr_feat_type feature;

> +          struct feat_ops ops;

> +          struct psr_cat_hw_info info;

> +          uint64_t cos_reg_val[MAX_COS_REG_NUM];

> +          struct list_head list;

> +      };

> +      ```

> +

> +      When a PSR enforcement feature is enabled, it will be added into a

> +      feature list. The head of the list is created in psr initialization.

> +

> +      - Member `feature`

> +

> +        `feature` is an integer number, to indicate which feature the list entry

> +        corresponds to.

> +

> +      - Member `ops`

> +

> +        `ops` maintains a callback function list of the feature. It will be introduced

> +        in details later.

> +

> +      - Member `info`

> +

> +        `info` maintains the feature HW information which can be got through

> +        psr_hwinfo command.

> +

> +      - Member `cos_reg_val`

> +

> +        `cos_reg_val` is an array to maintain the value set in all COS registers of

> +        the feature.

> +

> +   3. Per-socket PSR features information structure

> +

> +      ```

> +      struct psr_cat_socket_info {

> +          unsigned int feat_mask;

> +          unsigned int nr_feat;

> +          struct list_head feat_list;

> +          unsigned int cos_ref[MAX_COS_REG_NUM];

> +          spinlock_t ref_lock;

> +      };

> +      ```

> +

> +      We collect all PSR allocation features information of a socket in

> +      this `struct psr_cat_socket_info`.

> +

> +      - Member `feat_mask`

> +

> +        `feat_mask` is a bitmap, to indicate which feature is enabled on

> +        current socket. We define `feat_mask` bitmap as:

> +

> +        bit 0~1: L3 CAT status, [01] stands for L3 CAT only and [10]

> +                 stands for L3 CDP is enalbed.

> +

> +        bit 2: L2 CAT status.

> +

> +      - Member `cos_ref`

> +

> +        `cos_ref` is an array which maintains the reference of one COS.

> +        If the COS is used by one domain, the reference will increase one.

> +        If a domain releases the COS, the reference will decrease one. The

> +        array is indexed by COS.

> +

> +   4. Feature operation functions structure

> +

> +      ```

> +      struct feat_ops {

> +          void (*init_feature)(unsigned int eax, unsigned int ebx,

> +                               unsigned int ecx, unsigned int edx,

> +                               struct feat_node *feat,

> +                               struct psr_cat_socket_info *info);

> +          int (*get_feat_info)(const struct feat_node *feat, enum cbm_type type,

> +                               uint32_t dat[], uint32_t array_len);

> +          int (*get_val)(const struct feat_node *feat, unsigned int cos,

> +                         enum cbm_type type, uint64_t *val);

> +          unsigned int (*get_max_cos_max)(const struct feat_node *feat);

> +          unsigned int (*get_cos_num)(const struct feat_node *feat);

> +          int (*get_old_val)(uint64_t val[],

> +                             const struct feat_node *feat,

> +                             unsigned int old_cos);

> +          int (*set_new_val)(uint64_t val[],

> +                             const struct feat_node *feat,

> +                             unsigned int old_cos,

> +                             enum cbm_type type,

> +                             uint64_t m);

> +          int (*compare_val)(const uint64_t val[], const struct feat_node *feat,

> +                             unsigned int cos, bool *found);

> +          unsigned int (*get_cos_max_from_type)(const struct feat_node *feat,

> +                                                enum cbm_type type);

> +          unsigned int (*exceeds_cos_max)(const uint64_t val[],

> +                                          const struct feat_node *feat,

> +                                          unsigned int cos);

> +          int (*write_msr)(unsigned int cos, const uint64_t val[],

> +                           struct feat_node *feat);

> +      };

> +      ```

> +

> +      We abstract above callback functions to encapsulate the feature specific

> +      behaviors into them. Then, it is easy to add a new feature. We just need:

> +          1) Implement such ops and callback functions for every feature.

> +          2) Register the ops into `struct feat_node`.

> +          3) Add the feature into feature list during CPU initialization.

> +

> +# Limitations

> +

> +L2 CAT can only work on HW which enables it(check by CPUID). So far, there

> +is no HW enables both L2 CAT and L3 CAT/CDP. But SW implementation has considered

> +such scenario to enable both L2 CAT and L3 CAT/CDP.

> +

> +# Testing

> +

> +L2 CAT uses same xl interfaces as L3 CAT/CDP. So, we can execute these

> +commands to verify L2 CAT and L3 CAT/CDP on different HWs support them.

> +

> +For example:

> +    root@:~$ xl psr-hwinfo --cat

> +    Cache Allocation Technology (CAT): L2

> +    Socket ID       : 0

> +    Maximum COS     : 3

> +    CBM length      : 8

> +    Default CBM     : 0xff

> +

> +    root@:~$ xl psr-cat-cbm-set -l2 1 0x7f

> +

> +    root@:~$ xl psr-cat-show -l2 1

> +    Socket ID       : 0

> +    Default CBM     : 0xff

> +       ID                     NAME             CBM

> +        1                 ubuntu14            0x7f

> +

> +# Areas for improvement

> +

> +N/A

> +

> +# Known issues

> +

> +N/A

> +

> +# References

> +

> +"INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) ALLOCATION FEATURES" [Intel® 64 and IA-32 Architectures Software Developer Manuals, vol3](http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html)

> +

> +# History

> +

> +------------------------------------------------------------------------

> +Date       Revision Version  Notes

> +---------- -------- -------- -------------------------------------------

> +2016-08-12 1.0      Xen 4.9  Design document written

> +---------- -------- -------- -------------------------------------------

> --

> 1.9.1

> 

> 

> _______________________________________________

> Xen-devel mailing list

> Xen-devel@lists.xen.org

> https://lists.xen.org/xen-devel
Yi Sun Jan. 22, 2017, 2:15 a.m. UTC | #2
On 17-01-20 09:39:41, Tian, Kevin wrote:
> > From: Yi Sun
> > Sent: Thursday, January 19, 2017 2:01 PM
> > 
> > This patch creates L2 CAT feature document in doc/features/.
> > It describes details of L2 CAT.
> 
> A good write-up, but still some improvements required. :-)
> 
Thanks a lot for the review and the good suggestions!

> > 
> > Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
> > ---
> >  docs/features/intel_psr_l2_cat.pandoc | 347 ++++++++++++++++++++++++++++++++++
> >  1 file changed, 347 insertions(+)
> >  create mode 100644 docs/features/intel_psr_l2_cat.pandoc
> > 
> > diff --git a/docs/features/intel_psr_l2_cat.pandoc b/docs/features/intel_psr_l2_cat.pandoc
> > new file mode 100644
> > index 0000000..77bd61f
> > --- /dev/null
> > +++ b/docs/features/intel_psr_l2_cat.pandoc
> > @@ -0,0 +1,347 @@
> > +% Intel L2 Cache Allocation Technology (L2 CAT) Feature
> > +% Revision 1.0
> > +
> > +\clearpage
> > +
> > +# Basics
> > +
> > +---------------- ----------------------------------------------------
> > +         Status: **Tech Preview**
> > +
> > +Architecture(s): Intel x86
> > +
> > +   Component(s): Hypervisor, toolstack
> > +
> > +       Hardware: Atom codename Goldmont and beyond CPUs
> > +---------------- ----------------------------------------------------
> > +
> > +# Overview
> > +
> > +L2 CAT allows an OS or Hypervisor/VMM to control allocation of a
> > +CPU's shared L2 cache based on application priority or Class of Service
> > +(COS). Each CLOS is configured using capacity bitmasks (CBM) which
> > +represent cache capacity and indicate the degree of overlap and
> > +isolation between classes. Once L2 CAT is configured, the processor
> > +allows access to portions of L2 cache according to the established
> > +class of service.
> 
> I would suggest make this doc for all CAT features, otherwise some
> content looks incomplete when you say adding new options or new
> ranges with assumption that reader understands the existing stuff.
> Yes I do notice you provide many background about L3 CAT/CDP. Let's
> just make it complete.
> 
> Also as a permanent design, 'new' is not preferred since it will become
> 'existing' once it's checked in.
> 
We have discussed this before. I proposed to add L3 CAT/CDP feature documents
later. But considering readers may not have background knowledge, I will try
to change this feature document to cover all CAT/CDP features.

> > +
> > +     New option `-l` is added.
> > +     `-l2`: Specify cbm for L2 cache.
> > +     `-l3`: Specify cbm for L3 cache.
> > +
> > +     If neither `-l2` nor `-l3` is given, level 3 is the default option.
> > +
> > +  3. `psr-hwinfo [OPTIONS]`:
> > +
> > +     Show L2 & L3 CAT HW informations on every socket.
> 
> informations -> information
> 
> and which HW information? please elaborate.
> 
> and where is the interface of setting COS for VCPU?
> 
What do you mean by 'the interface of setting COS for VCPU'? The user can
already set the CBM for a domain through 'psr-cat-cbm-set'. The hypervisor
then finds a matching COS ID or picks an available COS ID for the domain. The
user does not need to know which COS ID is in use.
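
The lookup described above can be sketched roughly as follows. This is an
illustration only, not the code from this series: `struct socket_cos`,
`pick_cos()` and `MAX_COS` are hypothetical names, and the policy (reuse a COS
whose CBM already matches, otherwise claim one whose reference count is zero,
keeping COS 0 as the default) is an assumption based on the `cos_ref`
description in the document.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_COS 16  /* hypothetical bound, standing in for MAX_COS_REG_NUM */

/* Hypothetical per-socket bookkeeping, loosely modelled on cos_ref. */
struct socket_cos {
    uint64_t cbm[MAX_COS];      /* CBM currently held by each COS ID */
    unsigned int ref[MAX_COS];  /* number of domains using each COS ID */
    unsigned int cos_max;       /* highest usable COS ID on this socket */
};

/*
 * Return a COS ID for the requested CBM and take a reference on it.
 * Returns -1 when no COS matches and none is free.
 */
int pick_cos(struct socket_cos *s, uint64_t cbm)
{
    unsigned int cos;

    assert(s->cos_max < MAX_COS);

    /* First pass: reuse a COS that already holds the requested CBM. */
    for (cos = 0; cos <= s->cos_max; cos++) {
        if (s->cbm[cos] == cbm && (cos == 0 || s->ref[cos] > 0)) {
            s->ref[cos]++;
            return (int)cos;
        }
    }

    /* Second pass: claim an unused COS. COS 0 stays the default. */
    for (cos = 1; cos <= s->cos_max; cos++) {
        if (s->ref[cos] == 0) {
            s->cbm[cos] = cbm;
            s->ref[cos] = 1;
            return (int)cos;
        }
    }

    return -1;
}
```

With this shape the tool stack only ever passes a CBM, and the COS ID stays an
internal detail of the hypervisor.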

> > +When context switch happens, the COS of VCPU is written to per-thread
> > +MSR `IA32_PQR_ASSOC`, and then hardware enforces L2 cache allocation
> > +according to the corresponding CBM.
> > +
> > +## The relationship between L2 CAT and L3 CAT/CDP
> > +
> > +L2 CAT is independent of L3 CAT/CDP, which means L2 CAT would be enabled
> > +while L3 CAT/CDP is disabled, or L2 CAT and L3 CAT/CDP are all enabled.
> > +
> > +L2 CAT uses a new range CBMs from 0xD10 ~ 0xD10+n (n<64), following by
> > +the L3 CAT/CDP CBMs, and supports setting different L2 cache accessing
> > +patterns from L3 cache. 
> 
> "L2 CAT supports ..." or "L3 CAT supports..." for the last sentence? 
> 
It is 'L2 CAT'. Sorry for the confusion.

> > Like L3 CAT/CDP requirement, the bits of CBM of
> > +L2 CAT must be continuous too.
> > +
> > +N.B. L2 CAT and L3 CAT/CDP share the same COS field in the same
> > +associate register `IA32_PQR_ASSOC`, which means one COS associates to a
> > +pair of L2 CBM and L3 CBM.
> > +
> > +Besides, the max COS of L2 CAT may be different from L3 CAT/CDP (or
> > +other PSR features in future). In some cases, a VM is permitted to have a
> > +COS that is beyond one (or more) of PSR features but within the others.
> > +For instance, let's assume the max COS of L2 CAT is 8 but the max COS of
> > +L3 CAT is 16, when a VM is assigned 9 as COS, the L3 CBM associated to
> > +COS 9 would be enforced, but for L2 CAT, the behavior is fully open (no
> > +limit) since COS 9 is beyond the max COS (8) of L2 CAT.
> 
> Does user space need to know such difference since L2 CAT may not 
> be effective for this vCPU?
> 
In fact, L2 CAT will use the default value in such a case. This is done by HW.
User space can find out the max COS of each feature through 'psr-hwinfo'.
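
As a small model of that behaviour (a sketch of what is described in this
thread, not a statement of the hardware specification): `struct feat_model`,
`default_cbm()` and `effective_cbm()` are hypothetical names, and the default
is assumed to be the all-ones mask of `cbm_len` bits, as the document says for
COS0.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical view of one allocation feature (L2 CAT or L3 CAT). */
struct feat_model {
    unsigned int cos_max;   /* highest COS ID the feature implements */
    unsigned int cbm_len;   /* number of valid bits in a CBM */
    const uint64_t *cbm;    /* cbm[0..cos_max], one mask per COS */
};

/* Fully open mask: the low cbm_len bits all set. */
uint64_t default_cbm(unsigned int cbm_len)
{
    assert(cbm_len >= 1 && cbm_len <= 64);
    return cbm_len == 64 ? ~0ULL : (1ULL << cbm_len) - 1;
}

/* CBM that applies to `cos`; a COS beyond cos_max gets the default. */
uint64_t effective_cbm(const struct feat_model *f, unsigned int cos)
{
    if (cos > f->cos_max)
        return default_cbm(f->cbm_len);
    return f->cbm[cos];
}
```

For the example in the document (max COS 8 for L2 CAT, 16 for L3 CAT), COS 9
yields its own L3 CBM but the fully open mask for L2.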

> > +
> > +* Multi-sockets
> > +
> > +  Different sockets may have different L2 CAT capability (e.g. max COS)
> > +  although it is consistent on the same socket. So the capability of
> > +  per-socket L2 CAT is specified.
> 
> VCPU schedule design is important here. e.g. when migrating a VCPU 
> from socket 1 to socket 2 (socket 1 has max COS as 16 while socket
> 2 as 8), will you prevent migration if VCPU has a COS (9)?
>
'psr-cat-cbm-set' sets the CBM for one domain per socket. On each socket, we
maintain a COS array for all domains. A domain uses one COS at a time on a
given socket, and that COS holds the CBM that takes effect. So, when a VCPU of
the domain is migrated from socket 1 to socket 2, it follows the configuration
on socket 2. This mechanism has been in place since L3 CAT, and L2 CAT follows
it.

E.g. the user sets the domain 1 CBM on socket 1 to 0x7f, which uses COS 9, but
sets the domain 1 CBM on socket 2 to 0x3f, which uses COS 7. When a VCPU of
this domain is migrated from socket 1 to socket 2, the COS ID used will be 7,
which means 0x3f will be the effective CBM for this domain.
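
To make the example concrete, here is a minimal sketch of the per-socket
lookup. The type and function names (`struct domain_psr`, `struct socket_psr`,
`cbm_on_socket()`) and the array sizes are hypothetical and chosen only for
this illustration; sockets 1 and 2 from the example are indices 0 and 1 here.

```c
#include <assert.h>
#include <stdint.h>

#define NR_SOCKETS 2   /* hypothetical, for the example only */
#define NR_COS     16  /* hypothetical, for the example only */

/* A domain records one COS ID per socket. */
struct domain_psr {
    unsigned int cos[NR_SOCKETS];
};

/* Each socket holds its own COS -> CBM table. */
struct socket_psr {
    uint64_t cbm[NR_COS];
};

/* CBM that applies to the domain while a VCPU runs on `socket`. */
uint64_t cbm_on_socket(const struct domain_psr *d,
                       const struct socket_psr sockets[],
                       unsigned int socket)
{
    assert(socket < NR_SOCKETS);
    assert(d->cos[socket] < NR_COS);
    return sockets[socket].cbm[d->cos[socket]];
}
```

Nothing has to be checked or blocked at migration time in this model: the
destination socket's COS is simply looked up when the VCPU is switched in.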

Thanks,
Sun Yi
Konrad Rzeszutek Wilk Jan. 30, 2017, 6:10 p.m. UTC | #3
On Thu, Jan 19, 2017 at 02:01:03PM +0800, Yi Sun wrote:
> This patch creates L2 CAT feature document in doc/features/.
> It describes details of L2 CAT.

Perhaps also mention the title of the section in the Intel SDM to look
into as well?
Perhaps:
See "Intel Resource Director Technology Monitoring Features"
in the Intel SDM.

> 
> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
> ---
>  docs/features/intel_psr_l2_cat.pandoc | 347 ++++++++++++++++++++++++++++++++++
>  1 file changed, 347 insertions(+)
>  create mode 100644 docs/features/intel_psr_l2_cat.pandoc
> 
> diff --git a/docs/features/intel_psr_l2_cat.pandoc b/docs/features/intel_psr_l2_cat.pandoc
> new file mode 100644
> index 0000000..77bd61f
> --- /dev/null
> +++ b/docs/features/intel_psr_l2_cat.pandoc
> @@ -0,0 +1,347 @@
> +% Intel L2 Cache Allocation Technology (L2 CAT) Feature
> +% Revision 1.0
> +
> +\clearpage
> +
> +# Basics
> +
> +---------------- ----------------------------------------------------
> +         Status: **Tech Preview**
> +
> +Architecture(s): Intel x86
> +
> +   Component(s): Hypervisor, toolstack
> +
> +       Hardware: Atom codename Goldmont and beyond CPUs
> +---------------- ----------------------------------------------------
> +
> +# Overview
> +
> +L2 CAT allows an OS or Hypervisor/VMM to control allocation of a
> +CPU's shared L2 cache based on application priority or Class of Service
> +(COS). Each CLOS is configured using capacity bitmasks (CBM) which
> +represent cache capacity and indicate the degree of overlap and

indicates
> +isolation between classes. Once L2 CAT is configured, the processor
> +allows access to portions of L2 cache according to the established
> +class of service.
> +
> +## Terminology
> +
> +* CAT         Cache Allocation Technology
> +* CBM         Capacity BitMasks
> +* CDP         Code and Data Prioritization
> +* COS/CLOS    Class of Service
> +* MSRs        Machine Specific Registers
> +* PSR         Intel Platform Shared Resource
> +* VMM         Virtual Machine Monitor
> +
> +# User details
> +
> +* Feature Enabling:
> +
> +  Add "psr=cat" to boot line parameter to enable all supported level CAT
> +  features.
> +
> +* xl interfaces:
> +
> +  1. `psr-cat-show [OPTIONS] domain-id`:
> +
> +     Show domain L2 or L3 CAT CBM.
> +
> +     New option `-l` is added.
> +     `-l2`: Show cbm for L2 cache.
> +     `-l3`: Show cbm for L3 cache.
> +
> +     If neither `-l2` nor `-l3` is given, show both of them. If any one
> +     is not supported, will print error info.
> +
> +  2. `psr-cat-cbm-set [OPTIONS] domain-id cbm`:
> +
> +     Set domain L2 or L3 CBM.
> +
> +     New option `-l` is added.
> +     `-l2`: Specify cbm for L2 cache.
> +     `-l3`: Specify cbm for L3 cache.
> +
> +     If neither `-l2` nor `-l3` is given, level 3 is the default option.
> +
> +  3. `psr-hwinfo [OPTIONS]`:
> +
> +     Show L2 & L3 CAT HW informations on every socket.
> +
> +# Technical details
> +
> +L2 CAT is a member of Intel PSR features and part of CAT, it shares
> +some base PSR infrastructure in Xen.
> +
> +## Hardware perspective
> +
> +L2 CAT defines a new range MSRs to assign different L2 cache access
                             ^- 'of'

> +patterns which are known as CBMs, each CBM is associated with a COS.
> +
> +```
> +
> +                        +----------------------------+----------------+
> +   IA32_PQR_ASSOC       | MSR (per socket)           |    Address     |
> + +----+---+-------+     +----------------------------+----------------+
> + |    |COS|       |     | IA32_L2_QOS_MASK_0         |     0xD10      |
> + +----+---+-------+     +----------------------------+----------------+
> +        └-------------> | ...                        |  ...           |
> +                        +----------------------------+----------------+
> +                        | IA32_L2_QOS_MASK_n         | 0xD10+n (n<64) |
> +                        +----------------------------+----------------+
> +```
> +
> +When context switch happens, the COS of VCPU is written to per-thread
> +MSR `IA32_PQR_ASSOC`, and then hardware enforces L2 cache allocation
> +according to the corresponding CBM.
> +
> +## The relationship between L2 CAT and L3 CAT/CDP
> +
> +L2 CAT is independent of L3 CAT/CDP, which means L2 CAT would be enabled
> +while L3 CAT/CDP is disabled, or L2 CAT and L3 CAT/CDP are all enabled.

s/all/both/

> +
> +L2 CAT uses a new range CBMs from 0xD10 ~ 0xD10+n (n<64), following by

s/by//
> +the L3 CAT/CDP CBMs, and supports setting different L2 cache accessing
> +patterns from L3 cache. Like L3 CAT/CDP requirement, the bits of CBM of
> +L2 CAT must be continuous too.
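
The contiguity requirement quoted above can be checked with a few bit
operations. A minimal sketch follows; `cbm_is_valid()` is a hypothetical
helper, not a function from this series, and rejecting an all-zero mask is an
assumption that the quoted text does not state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if `cbm` fits in `cbm_len` bits and its set bits form
 * one contiguous run. An empty mask is rejected (assumption).
 */
bool cbm_is_valid(uint64_t cbm, unsigned int cbm_len)
{
    assert(cbm_len >= 1 && cbm_len <= 64);

    uint64_t full = cbm_len == 64 ? ~0ULL : (1ULL << cbm_len) - 1;

    if (cbm == 0 || (cbm & ~full))
        return false;

    /*
     * Adding the lowest set bit to a contiguous run carries through
     * the whole run and clears it, so no bit of the sum overlaps the
     * original mask. Any gap leaves an overlapping bit behind.
     */
    uint64_t lowest = cbm & (~cbm + 1);
    return ((cbm + lowest) & cbm) == 0;
}
```

For instance, 0x7f and 0x3c pass with a CBM length of 8, while 0x5 fails
because of the gap at bit 1.
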
> +
> +N.B. L2 CAT and L3 CAT/CDP share the same COS field in the same
> +associate register `IA32_PQR_ASSOC`, which means one COS associates to a

s/associates to a/is associated with a/

> +pair of L2 CBM and L3 CBM.
> +
> +Besides, the max COS of L2 CAT may be different from L3 CAT/CDP (or
> +other PSR features in future). In some cases, a VM is permitted to have a
> +COS that is beyond one (or more) of PSR features but within the others.
> +For instance, let's assume the max COS of L2 CAT is 8 but the max COS of
> +L3 CAT is 16, when a VM is assigned 9 as COS, the L3 CBM associated to
> +COS 9 would be enforced, but for L2 CAT, the behavior is fully open (no
> +limit) since COS 9 is beyond the max COS (8) of L2 CAT.


Thank you for mentioning the above!

> +
> +## Design Overview
> +
> +* Core COS/CBM association
> +
> +  When enforcing L2 CAT, all cores of domains have the same default
> +  COS (COS0) which associated to the fully open CBM (all ones bitmask)
                     ^-is
> +  to access all L2 cache. The default COS is used only in hypervisor
> +  and is transparent to tool stack and user.
> +
> +  System administrator can change PSR allocation policy at runtime by
> +  tool stack. Since L2 CAT share COS with L3 CAT/CDP, a COS corresponds
> +  to a 2-tuple, like [L2 CBM, L3 CBM] with only-CAT enabled, when CDP
> +  is enabled, one COS corresponds to a 3-tuple, like [L2 CBM,
> +  L3 Code_CBM, L3 Data_CBM]. If neither L3 CAT nor L3 CDP is enabled,
> +  things would be easier, one COS corresponds to one L2 CBM.
> +
> +* VCPU schedule
> +
> +  This part reuses L3 CAT COS infrastructure.
> +
> +* Multi-sockets
> +
> +  Different sockets may have different L2 CAT capability (e.g. max COS)
> +  although it is consistent on the same socket. So the capability of
> +  per-socket L2 CAT is specified.
> +
> +## Implementation Description
> +
> +* Hypervisor interfaces:
> +
> +  1. Boot line parameter "psr=cat" now will enable L2 CAT and L3
> +     CAT if hardware supported.
> +
> +  2. SYSCTL:
> +          - XEN_SYSCTL_PSR_CAT_get_l2_info: Get L2 CAT information.
> +
> +  3. DOMCTL:
> +          - XEN_DOMCTL_PSR_CAT_OP_GET_L2_CBM: Get L2 CBM for a domain.
> +          - XEN_DOMCTL_PSR_CAT_OP_SET_L2_CBM: Set L2 CBM for a domain.
> +
> +* xl interfaces:
> +
> +  1. psr-cat-show -l2 domain-id
> +          Show L2 cbm for a domain.
> +          => XEN_SYSCTL_PSR_CAT_get_l2_info /
> +             XEN_DOMCTL_PSR_CAT_OP_GET_L2_CBM
> +
> +  2. psr-mba-set -l2 domain-id cbm
> +          Set L2 cbm for a domain.
> +          => XEN_DOMCTL_PSR_CAT_OP_SET_L2_CBM
> +
> +  3. psr-hwinfo
> +          Show PSR HW information, including L2 CAT
> +          => XEN_SYSCTL_PSR_CAT_get_l2_info
> +
> +* Key data structure:
> +
> +   1. Feature HW info
> +
> +      ```
> +      struct psr_cat_hw_info {
> +          unsigned int cbm_len;
> +          unsigned int cos_max;
> +      };
> +      ```
> +
> +      - Member `cbm_len`
> +
> +        `cbm_len` is one of the hardware info of CAT.

Could you expand? Is it the max number of bits you can set? 

> +
> +      - Member `cos_max`
> +
> +        `cos_max` is one of the hardware info of CAT.
> +
> +   2. Feature list node
> +
> +      ```
> +      struct feat_node {
> +          enum psr_feat_type feature;
> +          struct feat_ops ops;
> +          struct psr_cat_hw_info info;
> +          uint64_t cos_reg_val[MAX_COS_REG_NUM];
> +          struct list_head list;
> +      };
> +      ```
> +
> +      When a PSR enforcement feature is enabled, it will be added into a
> +      feature list. The head of the list is created in psr initialization.
> +
> +      - Member `feature`
> +
> +        `feature` is an integer number, to indicate which feature the list entry
> +        corresponds to.
> +
> +      - Member `ops`
> +
> +        `ops` maintains a callback function list of the feature. It will be introduced
> +        in details later.

??? Like by each patch to this document?

> +
> +      - Member `info`
> +
> +        `info` maintains the feature HW information which can be got through

s/which can be got through/which are provided to/ 

Perhaps?
> +        psr_hwinfo command.
> +
> +      - Member `cos_reg_val`
> +
> +        `cos_reg_val` is an array to maintain the value set in all COS registers of
> +        the feature.
> +
> +   3. Per-socket PSR features information structure
> +
> +      ```
> +      struct psr_cat_socket_info {
> +          unsigned int feat_mask;
> +          unsigned int nr_feat;
> +          struct list_head feat_list;
> +          unsigned int cos_ref[MAX_COS_REG_NUM];
> +          spinlock_t ref_lock;
> +      };
> +      ```
> +
> +      We collect all PSR allocation features information of a socket in
> +      this `struct psr_cat_socket_info`.
> +
> +      - Member `feat_mask`
> +
> +        `feat_mask` is a bitmap, to indicate which feature is enabled on
> +        current socket. We define `feat_mask` bitmap as:
> +
> +        bit 0~1: L3 CAT status, [01] stands for L3 CAT only and [10]
> +                 stands for L3 CDP is enalbed.

enabled
> +
> +        bit 2: L2 CAT status.

And the 'nr_feat' is 3 ?

> +
> +      - Member `cos_ref`
> +
> +        `cos_ref` is an array which maintains the reference of one COS.

Is it safe to assume that this maps to 'cos_reg_val' ? If so you may
want to mention that.

> +        If the COS is used by one domain, the reference will increase one.

s/increase one/increase by one/

> +        If a domain releases the COS, the reference will decrease one. The

s/decrease one/decrease by one/
> +        array is indexed by COS.
> +
> +   4. Feature operation functions structure
> +
> +      ```
> +      struct feat_ops {
> +          void (*init_feature)(unsigned int eax, unsigned int ebx,
> +                               unsigned int ecx, unsigned int edx,
> +                               struct feat_node *feat,
> +                               struct psr_cat_socket_info *info);
> +          int (*get_feat_info)(const struct feat_node *feat, enum cbm_type type,
> +                               uint32_t dat[], uint32_t array_len);
> +          int (*get_val)(const struct feat_node *feat, unsigned int cos,
> +                         enum cbm_type type, uint64_t *val);
> +          unsigned int (*get_max_cos_max)(const struct feat_node *feat);
> +          unsigned int (*get_cos_num)(const struct feat_node *feat);
> +          int (*get_old_val)(uint64_t val[],
> +                             const struct feat_node *feat,
> +                             unsigned int old_cos);
> +          int (*set_new_val)(uint64_t val[],
> +                             const struct feat_node *feat,
> +                             unsigned int old_cos,
> +                             enum cbm_type type,
> +                             uint64_t m);
> +          int (*compare_val)(const uint64_t val[], const struct feat_node *feat,
> +                             unsigned int cos, bool *found);
> +          unsigned int (*get_cos_max_from_type)(const struct feat_node *feat,
> +                                                enum cbm_type type);
> +          unsigned int (*exceeds_cos_max)(const uint64_t val[],
> +                                          const struct feat_node *feat,
> +                                          unsigned int cos);
> +          int (*write_msr)(unsigned int cos, const uint64_t val[],
> +                           struct feat_node *feat);
> +      };
> +      ```
> +
> +      We abstract above callback functions to encapsulate the feature specific
> +      behaviors into them. Then, it is easy to add a new feature. We just need:
> +          1) Implement such ops and callback functions for every feature.
> +          2) Register the ops into `struct feat_node`.
> +          3) Add the feature into feature list during CPU initialization.
> +
> +# Limitations
> +
> +L2 CAT can only work on HW which enables it(check by CPUID). So far, there
> +is no HW enables both L2 CAT and L3 CAT/CDP. But SW implementation has considered

s/no HW/no HW which/
> +such scenario to enable both L2 CAT and L3 CAT/CDP.
> +
> +# Testing
> +
> +L2 CAT uses same xl interfaces as L3 CAT/CDP. So, we can execute these
> +commands to verify L2 CAT and L3 CAT/CDP on different HWs support them.
> +
> +For example:
> +    root@:~$ xl psr-hwinfo --cat
> +    Cache Allocation Technology (CAT): L2
> +    Socket ID       : 0
> +    Maximum COS     : 3
> +    CBM length      : 8
> +    Default CBM     : 0xff
> +
> +    root@:~$ xl psr-cat-cbm-set -l2 1 0x7f
> +
> +    root@:~$ xl psr-cat-show -l2 1
> +    Socket ID       : 0
> +    Default CBM     : 0xff
> +       ID                     NAME             CBM
> +        1                 ubuntu14            0x7f
> +
> +# Areas for improvement
> +
> +N/A
> +
> +# Known issues
> +
> +N/A
> +
> +# References
> +
> +"INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) ALLOCATION FEATURES" [Intel® 64 and IA-32 Architectures Software Developer Manuals, vol3](http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html)
> +
> +# History
> +
> +------------------------------------------------------------------------
> +Date       Revision Version  Notes
> +---------- -------- -------- -------------------------------------------
> +2016-08-12 1.0      Xen 4.9  Design document written
> +---------- -------- -------- -------------------------------------------
> -- 
> 1.9.1
>
Konrad Rzeszutek Wilk Jan. 30, 2017, 8:39 p.m. UTC | #4
> > +# Testing
> > +
> > +L2 CAT uses same xl interfaces as L3 CAT/CDP. So, we can execute these
> > +commands to verify L2 CAT and L3 CAT/CDP on different HWs support them.
> > +
> > +For example:
> > +    root@:~$ xl psr-hwinfo --cat
> > +    Cache Allocation Technology (CAT): L2
> > +    Socket ID       : 0
> > +    Maximum COS     : 3
> > +    CBM length      : 8
> > +    Default CBM     : 0xff
> > +
> > +    root@:~$ xl psr-cat-cbm-set -l2 1 0x7f
> > +
> > +    root@:~$ xl psr-cat-show -l2 1
> > +    Socket ID       : 0
> > +    Default CBM     : 0xff
> > +       ID                     NAME             CBM
> > +        1                 ubuntu14            0x7f
> > +

I tried to find in the Intel SDM (December 2016) what the format
of CBM is, as the value '0x7f' does not really mean much to me.

Right above Figure 17-28 it says:

"Figure 17-27 also shows three examples of sets of Cache Capacity Bitmasks. For simplicity these are represented
as 8-bit vectors, though this may vary _depending on the implementation and how the mask is mapped to the
available cache capacity._"


So in other words - not documented. 

Then later it says:

" Rather, this is a convenient manner to represent capacity,
overlap and isolation of cache space. For example, executing a POPCNT instruction (population count of set bits) on
the capacity bitmask can provide the fraction of cache space that a class of service can allocate into."

OK, so _can_ (but not _MUST_), so again implementation specific - and
can provide a fraction of cache space.

Which would imply that the values could be:

0x0F - half of L2
0x03 - quarter of L2

Is there some other documentation that explains this in more details?

If it is like I mentioned would it make sense to have 'xl' be capable
of dealing with string values? such as "three-quarters", "half", etc
and then set it? Or percentage and map that to the correct value?

Like:

xl psr-cat-cbm-set -l2 ubuntu14 50%

would be quite obvious instead of say 0x0F?
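
A rough sketch of what such a percentage mapping could look like. This is not existing xl or libxl code: `percent_to_cbm` is a hypothetical helper, and it assumes the mask must be contiguous (as the document requires) and is anchored at bit 0, which is an arbitrary choice.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of xl/libxl: map a percentage to a
 * contiguous CBM anchored at bit 0. Returns 0 on invalid input.
 */
static uint64_t percent_to_cbm(unsigned int percent, unsigned int cbm_len)
{
    unsigned int bits;

    if ( percent == 0 || percent > 100 || cbm_len == 0 || cbm_len > 64 )
        return 0;

    /* Round up so that any non-zero request gets at least one bit. */
    bits = (percent * cbm_len + 99) / 100;

    /* Shifting a 64-bit value by 64 is undefined, so handle it apart. */
    return bits == 64 ? ~0ULL : (1ULL << bits) - 1;
}
```

With a CBM length of 8, `percent_to_cbm(50, 8)` gives 0x0F and `percent_to_cbm(25, 8)` gives 0x03, matching the values above.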

Thanks.
Yi Sun Feb. 4, 2017, 7:06 a.m. UTC | #5
Thanks a lot for reviewing the patches! Sorry for the late reply. I am on
vacation for Chinese New Year now. :)

On 17-01-30 13:10:33, Konrad Rzeszutek Wilk wrote:
> On Thu, Jan 19, 2017 at 02:01:03PM +0800, Yi Sun wrote:
> > +   2. Feature list node
> > +
> > +      ```
> > +      struct feat_node {
> > +          enum psr_feat_type feature;
> > +          struct feat_ops ops;
> > +          struct psr_cat_hw_info info;
> > +          uint64_t cos_reg_val[MAX_COS_REG_NUM];
> > +          struct list_head list;
> > +      };
> > +      ```
> > +
> > +      When a PSR enforcement feature is enabled, it will be added into a
> > +      feature list. The head of the list is created in psr initialization.
> > +
> > +      - Member `feature`
> > +
> > +        `feature` is an integer number, to indicate which feature the list entry
> > +        corresponds to.
> > +
> > +      - Member `ops`
> > +
> > +        `ops` maintains a callback function list of the feature. It will be introduced
> > +        in details later.
> 
> ??? Like by each patch to this document?
> 
Do you mean 'Like by "which" patch to this document'? Shall I input the patch title which

BRs,
Sun Yi
Yi Sun Feb. 4, 2017, 7:24 a.m. UTC | #6
On 17-01-30 15:39:28, Konrad Rzeszutek Wilk wrote:
> > > +# Testing
> > > +
> > > +L2 CAT uses same xl interfaces as L3 CAT/CDP. So, we can execute these
> > > +commands to verify L2 CAT and L3 CAT/CDP on different HWs support them.
> > > +
> > > +For example:
> > > +    root@:~$ xl psr-hwinfo --cat
> > > +    Cache Allocation Technology (CAT): L2
> > > +    Socket ID       : 0
> > > +    Maximum COS     : 3
> > > +    CBM length      : 8
> > > +    Default CBM     : 0xff
> > > +
> > > +    root@:~$ xl psr-cat-cbm-set -l2 1 0x7f
> > > +
> > > +    root@:~$ xl psr-cat-show -l2 1
> > > +    Socket ID       : 0
> > > +    Default CBM     : 0xff
> > > +       ID                     NAME             CBM
> > > +        1                 ubuntu14            0x7f
> > > +
> 
> I tried to find in the Intel SDM (December 2016) what the format
> of CBM is, as the value '0x7f' does not really mean much to me.
> 
> Right above Figure 17-28 it says:
> 
> "Figure 17-27 also shows three examples of sets of Cache Capacity Bitmasks. For simplicity these are represented
> as 8-bit vectors, though this may vary _depending on the implementation and how the mask is mapped to the
> available cache capacity._"
> 
> 
> So in other words - not documented. 
> 
> Then later it says:
> 
> " Rather, this is a convenient manner to represent capacity,
> overlap and isolation of cache space. For example, executing a POPCNT instruction (population count of set bits) on
> the capacity bitmask can provide the fraction of cache space that a class of service can allocate into."
>
Right after this sentence, the spec mentions "In addition to the fraction, the exact location of the
bits also shows whether the class of service overlaps with other classes of service or is entirely
isolated in terms of cache space used." 

For example:
For domain 1, assign 0xF to it.
For domain 2, assign 0x30 to it.
That means, the cache spaces that domain 1 and domain 2 are using are isolated.

For domain 3, assign 0x3 to it.
That means, domain 3 cache space overlaps domain 1.

So, we cannot just input how much space is to be assigned to one domain. :)
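
To make the overlap and isolation point concrete, here is a minimal C sketch. It is not Xen code: the helper names `cbm_popcount`, `cbm_overlaps` and `cbm_percent` are made up for illustration, and the CBM length is passed in by the caller (8 in the example output above).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only, not part of Xen. */

/* Number of set bits in a CBM (what POPCNT would return). */
static unsigned int cbm_popcount(uint64_t cbm)
{
    unsigned int n = 0;

    for ( ; cbm; cbm &= cbm - 1 ) /* clear the lowest set bit each round */
        n++;

    return n;
}

/* Two classes of service share cache space iff their CBMs share a bit. */
static bool cbm_overlaps(uint64_t a, uint64_t b)
{
    return (a & b) != 0;
}

/* Share of the cache a CBM may allocate into, in percent (rounded down). */
static unsigned int cbm_percent(uint64_t cbm, unsigned int cbm_len)
{
    assert(cbm_len > 0 && cbm_len <= 64);
    return cbm_popcount(cbm) * 100 / cbm_len;
}
```

With the masks above, `cbm_overlaps(0xF, 0x30)` is false (domains 1 and 2 are isolated) and `cbm_overlaps(0x3, 0xF)` is true (domain 3 overlaps domain 1), although 0xF alone only says "half of the cache" when the CBM length is 8.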

> OK, so _can_ (but not _MUST_), so again implementation specific - and
> can provide a fraction of cache space.
> 
> Which would imply that the values could be:
> 
> 0x0F - half of L2
> 0x03 - quarter of L2
> 
> Is there some other documentation that explains this in more details?
> 
> If it is like I mentioned would it make sense to have 'xl' be capable
> of dealing with string values? such as "three-quarters", "half", etc
> and then set it? Or percentage and map that to the correct value?
> 
> Like:
> 
> xl psr-cat-cbm-set -l2 ubuntu14 50%
> 
> would be quite obvious instead of say 0x0F?
> 
> Thanks.
Tian, Kevin Feb. 8, 2017, 6:45 a.m. UTC | #7
> From: Yi Sun [mailto:yi.y.sun@linux.intel.com]
> Sent: Sunday, January 22, 2017 10:15 AM
>
> > > +
> > > +     New option `-l` is added.
> > > +     `-l2`: Specify cbm for L2 cache.
> > > +     `-l3`: Specify cbm for L3 cache.
> > > +
> > > +     If neither `-l2` nor `-l3` is given, level 3 is the default option.
> > > +
> > > +  3. `psr-hwinfo [OPTIONS]`:
> > > +
> > > +     Show L2 & L3 CAT HW informations on every socket.
> >
> > informations -> information
> >
> > and which HW information? please elaborate.
> >
> > and where is the interface of setting COS for VCPU?
> >
> What do you mean 'the interface of setting COS for VCPU'? User can set CBM
> for one domain through 'psr-cat-cbm-set' now. Then, hypervisor will find
> a matched COS ID or pick an available COS ID for the domain. User does not
> need know which COS ID it is using.


possibly you want to explain above logic in this doc. I didn't see any
discussion of COS ID in the text.
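
Something like the following could accompany that explanation. It is only a sketch of the "find a matched COS ID or pick an available one" logic described above, not the actual Xen implementation: `struct cos_table`, `cos_get` and `cos_put` are hypothetical names, the table covers a single feature on a single socket, and `NR_COS` is an assumed size.

```c
#include <assert.h>
#include <stdint.h>

#define NR_COS 4 /* assumption: COS 0..3, matching "Maximum COS : 3" */

/* Hypothetical, simplified per-socket state for one feature. */
struct cos_table {
    uint64_t cbm[NR_COS];      /* CBM currently held by each COS register */
    unsigned int ref[NR_COS];  /* number of domains using each COS */
};

/*
 * Find a COS ID for 'cbm': reuse a COS that already holds the value,
 * otherwise claim an unreferenced one. COS 0 is the default COS and is
 * never reprogrammed. Returns the COS ID, or -1 if none is available.
 */
static int cos_get(struct cos_table *t, uint64_t cbm)
{
    unsigned int cos;

    for ( cos = 0; cos < NR_COS; cos++ )
        if ( t->cbm[cos] == cbm )
        {
            t->ref[cos]++;
            return (int)cos;
        }

    for ( cos = 1; cos < NR_COS; cos++ )
        if ( t->ref[cos] == 0 )
        {
            t->cbm[cos] = cbm; /* real code would write the MSR here */
            t->ref[cos]++;
            return (int)cos;
        }

    return -1;
}

/* Drop one reference when a domain stops using 'cos'. */
static void cos_put(struct cos_table *t, unsigned int cos)
{
    assert(cos < NR_COS && t->ref[cos] > 0);
    t->ref[cos]--;
}
```

The reference count is what lets two domains with the same CBM share one COS ID, and lets a COS be reprogrammed only once no domain uses it any more.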

>
> > > +
> > > +* Multi-sockets
> > > +
> > > +  Different sockets may have different L2 CAT capability (e.g. max COS)
> > > +  although it is consistent on the same socket. So the capability of
> > > +  per-socket L2 CAT is specified.
> >
> > VCPU schedule design is important here. e.g. when migrating a VCPU
> > from socket 1 to socket 2 (socket 1 has max COS as 16 while socket
> > 2 as 8), will you prevent migration if VCPU has a COS (9)?
> >
> 'psr-cat-cbm-set' can set CBM for one domain per socket. On each socket, we
> maintain a COS array for all domains. One domain uses one COS at one time. One
> COS saves the CBM of to work. So, when a VCPU of the domain is migrated from
> socket 1 to socket 2, it follows configuration on socket 2. This mechanism has
> been implemented since L3 CAT. L2 CAT follows it.
>
> E.g. user sets domain 1 CBM on socket 1 to 0x7f which uses COS 9 but sets
> domain 1 CBM on socket 2 to 0x3f which uses COS 7. When VCPU of this domain
> is migrated from socket 1 to 2, the COS ID used will be 7, that means 0x3f
> will be the CBM to work for this domain.


OK, so user will pick valid CBM on each socket for a specific domain. Also
please include such info in your next version.
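
For the document, the per-socket behaviour described above could be illustrated roughly as follows. This is a sketch under assumptions, not the Xen data layout: `struct socket_cos`, `struct domain_psr` and `effective_cbm` are hypothetical, and the sizes are picked only to fit the example numbers.

```c
#include <assert.h>
#include <stdint.h>

#define NR_SOCKETS 2   /* assumption for the example */
#define NR_COS     16  /* assumption, large enough for COS 9 */

/* Hypothetical: CBM held by each COS register on one socket. */
struct socket_cos {
    uint64_t cbm[NR_COS];
};

/* Hypothetical: the COS ID a domain uses on each socket. */
struct domain_psr {
    unsigned int cos_id[NR_SOCKETS];
};

/* CBM in effect for a domain's VCPU while it runs on 'socket'. */
static uint64_t effective_cbm(const struct socket_cos sockets[],
                              const struct domain_psr *d,
                              unsigned int socket)
{
    assert(socket < NR_SOCKETS && d->cos_id[socket] < NR_COS);
    return sockets[socket].cbm[d->cos_id[socket]];
}
```

With COS 9 holding 0x7f on the first socket and COS 7 holding 0x3f on the second, a VCPU of the domain sees 0x7f before the migration and 0x3f after it; nothing has to be copied between sockets.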

Thanks
Kevin
Patch

diff --git a/docs/features/intel_psr_l2_cat.pandoc b/docs/features/intel_psr_l2_cat.pandoc
new file mode 100644
index 0000000..77bd61f
--- /dev/null
+++ b/docs/features/intel_psr_l2_cat.pandoc
@@ -0,0 +1,347 @@ 
+% Intel L2 Cache Allocation Technology (L2 CAT) Feature
+% Revision 1.0
+
+\clearpage
+
+# Basics
+
+---------------- ----------------------------------------------------
+         Status: **Tech Preview**
+
+Architecture(s): Intel x86
+
+   Component(s): Hypervisor, toolstack
+
+       Hardware: Atom codename Goldmont and beyond CPUs
+---------------- ----------------------------------------------------
+
+# Overview
+
+L2 CAT allows an OS or Hypervisor/VMM to control allocation of a
+CPU's shared L2 cache based on application priority or Class of Service
+(COS). Each CLOS is configured using capacity bitmasks (CBM) which
+represent cache capacity and indicate the degree of overlap and
+isolation between classes. Once L2 CAT is configured, the processor
+allows access to portions of L2 cache according to the established
+class of service.
+
+## Terminology
+
+* CAT         Cache Allocation Technology
+* CBM         Capacity BitMasks
+* CDP         Code and Data Prioritization
+* COS/CLOS    Class of Service
+* MSRs        Model Specific Registers
+* PSR         Intel Platform Shared Resource
+* VMM         Virtual Machine Monitor
+
+# User details
+
+* Feature Enabling:
+
+  Add "psr=cat" to the boot command line to enable CAT at every supported
+  cache level.
+
+* xl interfaces:
+
+  1. `psr-cat-show [OPTIONS] domain-id`:
+
+     Show domain L2 or L3 CAT CBM.
+
+     New option `-l` is added.
+     `-l2`: Show cbm for L2 cache.
+     `-l3`: Show cbm for L3 cache.
+
+     If neither `-l2` nor `-l3` is given, show both of them. If either one
+     is not supported, an error is printed.
+
+  2. `psr-cat-cbm-set [OPTIONS] domain-id cbm`:
+
+     Set domain L2 or L3 CBM.
+
+     New option `-l` is added.
+     `-l2`: Specify cbm for L2 cache.
+     `-l3`: Specify cbm for L3 cache.
+
+     If neither `-l2` nor `-l3` is given, level 3 is the default option.
+
+  3. `psr-hwinfo [OPTIONS]`:
+
+     Show L2 & L3 CAT HW information on every socket.
+
+# Technical details
+
+L2 CAT is one of the Intel PSR features and part of CAT, so it shares
+some base PSR infrastructure in Xen.
+
+## Hardware perspective
+
+L2 CAT defines a new range of MSRs to assign different L2 cache access
+patterns, known as CBMs. Each CBM is associated with a COS.
+
+```
+
+                        +----------------------------+----------------+
+   IA32_PQR_ASSOC       | MSR (per socket)           |    Address     |
+ +----+---+-------+     +----------------------------+----------------+
+ |    |COS|       |     | IA32_L2_QOS_MASK_0         |     0xD10      |
+ +----+---+-------+     +----------------------------+----------------+
+        └-------------> | ...                        |  ...           |
+                        +----------------------------+----------------+
+                        | IA32_L2_QOS_MASK_n         | 0xD10+n (n<64) |
+                        +----------------------------+----------------+
+```
+
+When a context switch happens, the COS of the VCPU is written to the
+per-thread MSR `IA32_PQR_ASSOC`, and then hardware enforces L2 cache
+allocation according to the corresponding CBM.
+
+## The relationship between L2 CAT and L3 CAT/CDP
+
+L2 CAT is independent of L3 CAT/CDP, which means L2 CAT can be enabled
+while L3 CAT/CDP is disabled, or L2 CAT and L3 CAT/CDP can both be enabled.
+
+L2 CAT uses a new range of CBM MSRs, 0xD10 ~ 0xD10+n (n<64), which follows
+the L3 CAT/CDP CBMs, and supports setting L2 cache access patterns that
+differ from those of the L3 cache. As with L3 CAT/CDP, the bits set in an
+L2 CAT CBM must be contiguous.
+
+N.B. L2 CAT and L3 CAT/CDP share the same COS field in the same
+association register `IA32_PQR_ASSOC`, which means one COS is associated
+with a pair of L2 CBM and L3 CBM.
+
+Besides, the max COS of L2 CAT may be different from L3 CAT/CDP (or
+other PSR features in future). In some cases, a VM is permitted to have a
+COS that is beyond the max COS of one (or more) PSR features but within
+that of the others. For instance, assume the max COS of L2 CAT is 8 but
+the max COS of L3 CAT is 16. When a VM is assigned COS 9, the L3 CBM
+associated with COS 9 is enforced, but for L2 CAT the behavior is fully
+open (no limit) since COS 9 is beyond the max COS (8) of L2 CAT.
+
+## Design Overview
+
+* Core COS/CBM association
+
+  When enforcing L2 CAT, all cores of domains have the same default
+  COS (COS0), which is associated with the fully open CBM (all-ones
+  bitmask) to access all L2 cache. The default COS is used only in the
+  hypervisor and is transparent to the tool stack and user.
+
+  The system administrator can change the PSR allocation policy at runtime
+  through the tool stack. Since L2 CAT shares COS with L3 CAT/CDP, a COS
+  corresponds to a 2-tuple, [L2 CBM, L3 CBM], when only CAT is enabled.
+  When CDP is enabled, one COS corresponds to a 3-tuple, [L2 CBM,
+  L3 Code_CBM, L3 Data_CBM]. If neither L3 CAT nor L3 CDP is enabled,
+  things are simpler: one COS corresponds to one L2 CBM.
+
+* VCPU schedule
+
+  This part reuses L3 CAT COS infrastructure.
+
+* Multi-sockets
+
+  Different sockets may have different L2 CAT capability (e.g. max COS)
+  although it is consistent on the same socket. So the capability of
+  per-socket L2 CAT is specified.
+
+## Implementation Description
+
+* Hypervisor interfaces:
+
+  1. Boot line parameter "psr=cat" now enables L2 CAT and L3
+     CAT if the hardware supports them.
+
+  2. SYSCTL:
+          - XEN_SYSCTL_PSR_CAT_get_l2_info: Get L2 CAT information.
+
+  3. DOMCTL:
+          - XEN_DOMCTL_PSR_CAT_OP_GET_L2_CBM: Get L2 CBM for a domain.
+          - XEN_DOMCTL_PSR_CAT_OP_SET_L2_CBM: Set L2 CBM for a domain.
+
+* xl interfaces:
+
+  1. psr-cat-show -l2 domain-id
+          Show L2 cbm for a domain.
+          => XEN_SYSCTL_PSR_CAT_get_l2_info /
+             XEN_DOMCTL_PSR_CAT_OP_GET_L2_CBM
+
+  2. psr-cat-cbm-set -l2 domain-id cbm
+          Set L2 cbm for a domain.
+          => XEN_DOMCTL_PSR_CAT_OP_SET_L2_CBM
+
+  3. psr-hwinfo
+          Show PSR HW information, including L2 CAT
+          => XEN_SYSCTL_PSR_CAT_get_l2_info
+
+* Key data structure:
+
+   1. Feature HW info
+
+      ```
+      struct psr_cat_hw_info {
+          unsigned int cbm_len;
+          unsigned int cos_max;
+      };
+      ```
+
+      - Member `cbm_len`
+
+        `cbm_len` is the length of the CBM in bits, as reported by hardware.
+
+      - Member `cos_max`
+
+        `cos_max` is the maximum COS ID the hardware supports for the feature.
+
+   2. Feature list node
+
+      ```
+      struct feat_node {
+          enum psr_feat_type feature;
+          struct feat_ops ops;
+          struct psr_cat_hw_info info;
+          uint64_t cos_reg_val[MAX_COS_REG_NUM];
+          struct list_head list;
+      };
+      ```
+
+      When a PSR enforcement feature is enabled, it is added to a
+      feature list. The head of the list is created during PSR initialization.
+
+      - Member `feature`
+
+        `feature` is an enumeration value indicating which feature the list
+        entry corresponds to.
+
+      - Member `ops`
+
+        `ops` maintains a callback function list of the feature. It is described
+        in detail later.
+
+      - Member `info`
+
+        `info` maintains the feature HW information which can be obtained
+        through the `psr-hwinfo` command.
+
+      - Member `cos_reg_val`
+
+        `cos_reg_val` is an array holding the values set in all COS registers of
+        the feature.
+
+   3. Per-socket PSR features information structure
+
+      ```
+      struct psr_cat_socket_info {
+          unsigned int feat_mask;
+          unsigned int nr_feat;
+          struct list_head feat_list;
+          unsigned int cos_ref[MAX_COS_REG_NUM];
+          spinlock_t ref_lock;
+      };
+      ```
+
+      We collect the information of all PSR allocation features of a socket
+      in this `struct psr_cat_socket_info`.
+
+      - Member `feat_mask`
+
+        `feat_mask` is a bitmap indicating which features are enabled on the
+        current socket. We define the `feat_mask` bitmap as:
+
+        bit 0~1: L3 CAT status, [01] stands for L3 CAT only and [10]
+                 stands for L3 CDP is enabled.
+
+        bit 2: L2 CAT status.
+
+      - Member `cos_ref`
+
+        `cos_ref` is an array which maintains the reference count of each COS.
+        If the COS is used by one domain, the reference increases by one.
+        If a domain releases the COS, the reference decreases by one. The
+        array is indexed by COS.
+
+   4. Feature operation functions structure
+
+      ```
+      struct feat_ops {
+          void (*init_feature)(unsigned int eax, unsigned int ebx,
+                               unsigned int ecx, unsigned int edx,
+                               struct feat_node *feat,
+                               struct psr_cat_socket_info *info);
+          int (*get_feat_info)(const struct feat_node *feat, enum cbm_type type,
+                               uint32_t dat[], uint32_t array_len);
+          int (*get_val)(const struct feat_node *feat, unsigned int cos,
+                         enum cbm_type type, uint64_t *val);
+          unsigned int (*get_max_cos_max)(const struct feat_node *feat);
+          unsigned int (*get_cos_num)(const struct feat_node *feat);
+          int (*get_old_val)(uint64_t val[],
+                             const struct feat_node *feat,
+                             unsigned int old_cos);
+          int (*set_new_val)(uint64_t val[],
+                             const struct feat_node *feat,
+                             unsigned int old_cos,
+                             enum cbm_type type,
+                             uint64_t m);
+          int (*compare_val)(const uint64_t val[], const struct feat_node *feat,
+                             unsigned int cos, bool *found);
+          unsigned int (*get_cos_max_from_type)(const struct feat_node *feat,
+                                                enum cbm_type type);
+          unsigned int (*exceeds_cos_max)(const uint64_t val[],
+                                          const struct feat_node *feat,
+                                          unsigned int cos);
+          int (*write_msr)(unsigned int cos, const uint64_t val[],
+                           struct feat_node *feat);
+      };
+      ```
+
+      We abstract the above callback functions to encapsulate feature-specific
+      behavior. This makes it easy to add a new feature. We just need to:
+          1) Implement the ops and callback functions for the new feature.
+          2) Register the ops into `struct feat_node`.
+          3) Add the feature into the feature list during CPU initialization.
+
+# Limitations
+
+L2 CAT can only work on HW which supports it (checked by CPUID). So far, there
+is no HW which enables both L2 CAT and L3 CAT/CDP, but the SW implementation
+already handles the scenario where both L2 CAT and L3 CAT/CDP are enabled.
+
+# Testing
+
+L2 CAT uses the same xl interfaces as L3 CAT/CDP, so we can execute these
+commands to verify L2 CAT and L3 CAT/CDP on the different HW that supports them.
+
+For example:
+    root@:~$ xl psr-hwinfo --cat
+    Cache Allocation Technology (CAT): L2
+    Socket ID       : 0
+    Maximum COS     : 3
+    CBM length      : 8
+    Default CBM     : 0xff
+
+    root@:~$ xl psr-cat-cbm-set -l2 1 0x7f
+
+    root@:~$ xl psr-cat-show -l2 1
+    Socket ID       : 0
+    Default CBM     : 0xff
+       ID                     NAME             CBM
+        1                 ubuntu14            0x7f
+
+# Areas for improvement
+
+N/A
+
+# Known issues
+
+N/A
+
+# References
+
+"INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) ALLOCATION FEATURES" [Intel® 64 and IA-32 Architectures Software Developer Manuals, vol3](http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html)
+
+# History
+
+------------------------------------------------------------------------
+Date       Revision Version  Notes
+---------- -------- -------- -------------------------------------------
+2016-08-12 1.0      Xen 4.9  Design document written
+---------- -------- -------- -------------------------------------------