diff mbox

[v9,05/25] x86: refactor psr: L3 CAT: implement CPU init and free flow.

Message ID 1489662495-5375-6-git-send-email-yi.y.sun@linux.intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

Yi Sun March 16, 2017, 11:07 a.m. UTC
This patch implements the CPU init and free flow including L3 CAT
initialization and feature array free. It includes below flows:
1. presmp init:
    - parse command line parameter.
    - allocate socket info for every socket.
    - allocate feature resource.
    - initialize socket info, get feature info and add feature into feature
      array per cpuid result.
    - free resources allocated if error happens.
    - register cpu notifier to handle cpu events.
2. cpu notifier:
    - handle cpu online events, if initialization work has been done before,
      do nothing.
    - handle cpu offline events, if it is the last cpu offline, free feature
      resources.

Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
---
v9:
    - add commit message to explain the flows.
    - handle cpu offline and online again case to read MSRs registers values
      back and save them into cos array to make user can get real data.
    - create a new patch about moving 'cpuid_count_leaf'.
      (suggested by Wei Liu)
    - modify comment to explain why not free some resource in 'free_feature'.
      (suggested by Wei Liu)
    - implement 'psr_alloc_feat_enabled' to check if allocation feature is
      enabled in cmdline and some initialization work done.
      (suggested by Wei Liu)
    - implement 'cat_default_val' to set default value for CAT features.
      (suggested by Wei Liu)
    - replace feature list handling to feature array handling.
      (suggested by Roger Pau)
    - implement a common 'cat_init_feature' to replace L3 CAT/L2 CAT specific
      init functions.
      (suggested by Roger Pau)
    - modify comments for global feature node.
      (suggested by Jan Beulich)
    - remove unnecessary comments.
      (suggested by Jan Beulich)
    - remove unnecessary 'else'.
      (suggested by Jan Beulich)
    - remove 'nr_feat'.
      (suggested by Jan Beulich)
    - modify patch title to indicate 'L3 CAT'.
      (suggested by Jan Beulich)
    - check global flag with boot cpu operations.
      (suggested by Jan Beulich)
    - remove 'cpu_init_work' and move codes into 'psr_cpu_init'.
      (suggested by Jan Beulich)
    - remove 'cpu_fini_work' and move codes into 'psr_cpu_fini'.
      (suggested by Jan Beulich)
    - assign value for 'cos_num'.
      (suggested by Jan Beulich)
    - change about 'uint64_t' to 'uint32_t'.
      (suggested by Jan Beulich)
v8:
    - fix format issue.
      (suggested by Konrad Rzeszutek Wilk)
    - add comments to explain why we care about cpumask_empty when the last
      cpu on socket is offline.
      (suggested by Konrad Rzeszutek Wilk)
v7:
    - initialize structure objects for avoiding surprise.
      (suggested by Konrad Rzeszutek Wilk)
    - fix typo.
      (suggested by Konrad Rzeszutek Wilk)
    - fix a logical mistake when handling the last cpu offline event.
      (suggested by Konrad Rzeszutek Wilk)
v6:
    - use 'struct cpuid_leaf' introduced in Andrew's patch.
      (suggested by Konrad Rzeszutek Wilk)
    - add comments about cpu_add_remove_lock.
      (suggested by Konrad Rzeszutek Wilk)
    - change 'clear_bit' to '__clear_bit'.
      (suggested by Konrad Rzeszutek Wilk)
    - add 'ASSERT' check when setting 'feat_mask'.
      (suggested by Konrad Rzeszutek Wilk)
    - adjust 'printk' position to avoid odd spacing.
      (suggested by Konrad Rzeszutek Wilk)
    - add comment to explain usage of 'feat_l3_cat'.
      (suggested by Konrad Rzeszutek Wilk)
    - fix wording.
      (suggested by Konrad Rzeszutek Wilk)
    - move 'cpuid_count_leaf' helper function to 'asm-x86/processor.h'.
      It cannot be moved to 'cpuid.h' which causes compilation error because
      of header file loop reference.
      (suggested by Andrew Cooper)
v5:
    - add comment to explain the reason to define 'feat_l3_cat'.
      (suggested by Jan Beulich)
    - use 'list_for_each_entry_safe'.
      (suggested by Jan Beulich)
    - remove codes to free 'feat_l3_cat' in 'free_feature' to avoid the need
      for an allocation the next time a CPU comes online.
      (suggested by Jan Beulich)
    - define 'struct cpuid_leaf_regs' to encapsulate eax~edx.
      (suggested by Jan Beulich)
    - print feature info on a socket only when 'opt_cpu_info' is true.
      (suggested by Jan Beulich)
    - declare global variable 'l3_cat_ops' to 'static const'.
      (suggested by Jan Beulich)
    - use 'current_cpu_data'.
      (suggested by Jan Beulich)
    - rename 'feat_tmp' to 'feat'.
      (suggested by Jan Beulich)
    - clear PQE feature bit when the maximum CPUID level is too low.
      (suggested by Jan Beulich)
    - directly call 'l3_cat_init_feature'. No need to make it a callback
      function.
      (suggested by Jan Beulich)
    - remove local variable 'info'.
      (suggested by Jan Beulich)
    - move 'INIT_LIST_HEAD' into 'cpu_init_work' to be together with
      spin_lock_init().
      (suggested by Jan Beulich)
    - remove 'cpu_prepare_work' and move its content into 'psr_cpu_prepare'.
      (suggested by Jan Beulich)
v4:
    - create this patch because of removing all CAT/CDP old codes to make
      implementation be more easily understood.
      (suggested by Jan Beulich)
---
 xen/arch/x86/psr.c | 202 +++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 197 insertions(+), 5 deletions(-)

Comments

Jan Beulich March 24, 2017, 4:52 p.m. UTC | #1
>>> On 16.03.17 at 12:07, <yi.y.sun@linux.intel.com> wrote:
> --- a/xen/arch/x86/psr.c
> +++ b/xen/arch/x86/psr.c
> @@ -18,6 +18,7 @@
>  #include <xen/init.h>
>  #include <xen/sched.h>
>  #include <asm/psr.h>
> +#include <asm/x86_emulate.h>

I'm pretty sure you don't need this. If anything you need
processor.h (as that's where the previous patch put
cpuid_count_leaf()), but I'm rather convinced that the header was
already included indirectly at this point.

> @@ -46,6 +50,9 @@
>   */
>  #define MAX_COS_REG_CNT  128
>  
> +/* CAT features use 1 COS register in one access. */
> +#define CAT_COS_NUM      1

With it being stored into the feature node now I don't see why you
need this constant anymore. And indeed it's being used exactly
once.

> @@ -126,11 +133,110 @@ struct psr_assoc {
>  
>  struct psr_cmt *__read_mostly psr_cmt;
>  
> +static struct psr_socket_info *__read_mostly socket_info;
> +
>  static unsigned int opt_psr;
>  static unsigned int __initdata opt_rmid_max = 255;
> +static unsigned int __read_mostly opt_cos_max = MAX_COS_REG_CNT;
>  static uint64_t rmid_mask;
>  static DEFINE_PER_CPU(struct psr_assoc, psr_assoc);
>  
> +/*
> + * Declare global feature node for every feature to facilitate the feature
> + * array creation. It is used to transiently store a spare node.
> + */
> +static struct feat_node *feat_l3_cat;
> +
> +/* Common functions */
> +#define cat_default_val(len)                 \
> +            ( (uint32_t)((1ul << len) - 1) )

Pretty odd construct, which I guess you use to avoid the
undefined-ness when len == 32. But this can be had without
extra cast, assuming len is in [1,32]:

#define cat_default_val(len) (0xffffffff >> (32 - (len)))

Also - stray blanks and missing parentheses around the use of macro
parameter.

> +/*
> + * Use this function to check if any allocation feature has been enabled
> + * in cmdline.
> + */
> +static bool psr_alloc_feat_enabled(void)
> +{
> +    return ((!socket_info) ? false : true );

Stray parentheses (all of them actually) and blank. Even more, why
not simply

    return socket_info;

?

> +static void free_feature(struct psr_socket_info *info)
> +{
> +    unsigned int i;
> +
> +    if ( !info )
> +        return;
> +
> +    /*
> +     * Free resources of features. The global feature object, e.g. feat_l3_cat,
> +     * may not be freed here if it is not added into array. It is simply being
> +     * kept until the next CPU online attempt.
> +     */
> +    for ( i = 0; i < PSR_SOCKET_MAX_FEAT; i++ )
> +    {
> +        if ( !info->features[i] )
> +            continue;
> +
> +        xfree(info->features[i]);
> +        info->features[i] = NULL;
> +        __clear_bit(i, &info->feat_mask);
> +    }
> +}

What the function does suggests its name ought to be
free_features().

> +static void cat_init_feature(struct cpuid_leaf regs,

I'm sure I've asked before to not pass structures by value. And
once you switch to a pointer, please don't forget to constify it.

> +                             struct feat_node *feat,
> +                             struct psr_socket_info *info,
> +                             enum psr_feat_type type)
> +{
> +    unsigned int socket, i;
> +    struct psr_cat_hw_info cat = { };
> +    uint64_t val;
> +
> +    /* No valid value so do not enable feature. */
> +    if ( !regs.a || !regs.d )
> +        return;
> +
> +    cat.cbm_len = (regs.a & CAT_CBM_LEN_MASK) + 1;
> +    cat.cos_max = min(opt_cos_max, regs.d & CAT_COS_MAX_MASK);
> +
> +    /* cos=0 is reserved as default cbm(all bits within cbm_len are 1). */
> +    feat->cos_reg_val[0] = cat_default_val(cat.cbm_len);
> +    /*
> +     * To handle cpu offline and then online case, we need read MSRs back to
> +     * save values into cos_reg_val array.
> +     */
> +    for ( i = 1; i <= cat.cos_max; i++ )
> +    {
> +        rdmsrl(MSR_IA32_PSR_L3_MASK(i), val);
> +        feat->cos_reg_val[i] = (uint32_t)val;
> +    }

You mention this in the changes done, but I don't understand why
you do this. What meaning to these values have to you? If you
want hardware and cached values to match up, the much more
conventional way of enforcing this would be to write the values
you actually want (normally all zero).

> +    feat->info.cat_info = cat;
> +    feat->cos_num = CAT_COS_NUM;
> +
> +    /* Add this feature into array. */
> +    info->features[type] = feat;
> +
> +    ASSERT(!test_bit(type, &info->feat_mask));
> +    __set_bit(type, &info->feat_mask);

    if ( __test_and_set_bit(type, &info->feat_mask) )
        ASSERT_UNREACHABLE();

> +    socket = cpu_to_socket(smp_processor_id());
> +    if ( !opt_cpu_info )
> +        return;
> +
> +    printk(XENLOG_INFO "%s CAT: enabled on socket %u, cos_max:%u, cbm_len:%u\n",
> +           ((type == PSR_SOCKET_L3_CAT) ? "L3" : "L2"),
> +           socket, feat->info.cat_info.cos_max,
> +           feat->info.cat_info.cbm_len);
> +
> +    return;

Pointless statement at end of function.

> +/* L3 CAT ops */
> +static const struct feat_ops l3_cat_ops = {
> +};

Leaving an already declared function pointer as NULL? Please don't.

>  static void psr_cpu_init(void)
>  {
> +    struct psr_socket_info *info;
> +    unsigned int socket, i;
> +    unsigned int cpu = smp_processor_id();
> +    struct feat_node *feat;
> +    struct cpuid_leaf regs;
> +
> +    if ( !psr_alloc_feat_enabled() || !boot_cpu_has(X86_FEATURE_PQE) )
> +        goto assoc_init;
> +
> +    if ( boot_cpu_data.cpuid_level < PSR_CPUID_LEVEL_CAT )
> +    {
> +        setup_clear_cpu_cap(X86_FEATURE_PQE);
> +        goto assoc_init;
> +    }
> +
> +    socket = cpu_to_socket(cpu);
> +    info = socket_info + socket;
> +    if ( info->feat_mask )
> +        goto assoc_init;
> +
> +    for ( i = 0; i < PSR_SOCKET_MAX_FEAT; i++ )
> +        info->features[i] = NULL;

You've xzalloc()ed this memory - why do you need this loop?

> +    spin_lock_init(&info->ref_lock);
> +
> +    cpuid_count_leaf(PSR_CPUID_LEVEL_CAT, 0, &regs);
> +    if ( regs.b & PSR_RESOURCE_TYPE_L3 )
> +    {
> +        cpuid_count_leaf(PSR_CPUID_LEVEL_CAT, 1, &regs);
> +
> +        feat = feat_l3_cat;
> +        feat_l3_cat = NULL;
> +        feat->ops = l3_cat_ops;
> +
> +        cat_init_feature(regs, feat, info, PSR_SOCKET_L3_CAT);
> +    }
> +
> +assoc_init:

Labels indented by at least on space please.

Jan
Yi Sun March 27, 2017, 4:41 a.m. UTC | #2
On 17-03-24 10:52:34, Jan Beulich wrote:
> >>> On 16.03.17 at 12:07, <yi.y.sun@linux.intel.com> wrote:
> > --- a/xen/arch/x86/psr.c
> > +++ b/xen/arch/x86/psr.c
> > @@ -18,6 +18,7 @@
> >  #include <xen/init.h>
> >  #include <xen/sched.h>
> >  #include <asm/psr.h>
> > +#include <asm/x86_emulate.h>
> 
> I'm pretty sure you don't need this. If anything you need
> processor.h (as that's where the previous patch put
> cpuid_count_leaf()), but I'm rather convinced that the header was
> already included indirectly at this point.
> 
Yes, you are right. It is indirectly included through 'sched.h'.

> > @@ -46,6 +50,9 @@
> >   */
> >  #define MAX_COS_REG_CNT  128
> >  
> > +/* CAT features use 1 COS register in one access. */
> > +#define CAT_COS_NUM      1
> 
> With it being stored into the feature node now I don't see why you
> need this constant anymore. And indeed it's being used exactly
> once.
> 
I remember somebody suggested me not to use constant but should define a
macro. As it is only used once, I will remove this and 'CDP_COS_NUM' in
later patch.

> > +/*
> > + * Declare global feature node for every feature to facilitate the feature
> > + * array creation. It is used to transiently store a spare node.
> > + */
> > +static struct feat_node *feat_l3_cat;
> > +
> > +/* Common functions */
> > +#define cat_default_val(len)                 \
> > +            ( (uint32_t)((1ul << len) - 1) )
> 
> Pretty odd construct, which I guess you use to avoid the
> undefined-ness when len == 32. But this can be had without
> extra cast, assuming len is in [1,32]:
> 
> #define cat_default_val(len) (0xffffffff >> (32 - (len)))
> 
> Also - stray blanks and missing parentheses around the use of macro
> parameter.
> 
Thanks for the suggestion! Will change it.

> > +/*
> > + * Use this function to check if any allocation feature has been enabled
> > + * in cmdline.
> > + */
> > +static bool psr_alloc_feat_enabled(void)
> > +{
> > +    return ((!socket_info) ? false : true );
> 
> Stray parentheses (all of them actually) and blank. Even more, why
> not simply
> 
>     return socket_info;
> 
> ?
> 
How about 'return !!socket_info'?

> > +static void free_feature(struct psr_socket_info *info)
> > +{
> > +    unsigned int i;
> > +
> > +    if ( !info )
> > +        return;
> > +
> > +    /*
> > +     * Free resources of features. The global feature object, e.g. feat_l3_cat,
> > +     * may not be freed here if it is not added into array. It is simply being
> > +     * kept until the next CPU online attempt.
> > +     */
> > +    for ( i = 0; i < PSR_SOCKET_MAX_FEAT; i++ )
> > +    {
> > +        if ( !info->features[i] )
> > +            continue;
> > +
> > +        xfree(info->features[i]);
> > +        info->features[i] = NULL;
> > +        __clear_bit(i, &info->feat_mask);
> > +    }
> > +}
> 
> What the function does suggests its name ought to be
> free_features().
> 
Ok, will modify it.

> > +static void cat_init_feature(struct cpuid_leaf regs,
> 
> I'm sure I've asked before to not pass structures by value. And
> once you switch to a pointer, please don't forget to constify it.
> 
Will correct this, thanks!

> > +                             struct feat_node *feat,
> > +                             struct psr_socket_info *info,
> > +                             enum psr_feat_type type)
> > +{
> > +    unsigned int socket, i;
> > +    struct psr_cat_hw_info cat = { };
> > +    uint64_t val;
> > +
> > +    /* No valid value so do not enable feature. */
> > +    if ( !regs.a || !regs.d )
> > +        return;
> > +
> > +    cat.cbm_len = (regs.a & CAT_CBM_LEN_MASK) + 1;
> > +    cat.cos_max = min(opt_cos_max, regs.d & CAT_COS_MAX_MASK);
> > +
> > +    /* cos=0 is reserved as default cbm(all bits within cbm_len are 1). */
> > +    feat->cos_reg_val[0] = cat_default_val(cat.cbm_len);
> > +    /*
> > +     * To handle cpu offline and then online case, we need read MSRs back to
> > +     * save values into cos_reg_val array.
> > +     */
> > +    for ( i = 1; i <= cat.cos_max; i++ )
> > +    {
> > +        rdmsrl(MSR_IA32_PSR_L3_MASK(i), val);
> > +        feat->cos_reg_val[i] = (uint32_t)val;
> > +    }
> 
> You mention this in the changes done, but I don't understand why
> you do this. What meaning to these values have to you? If you
> want hardware and cached values to match up, the much more
> conventional way of enforcing this would be to write the values
> you actually want (normally all zero).
> 
When all cpus on a socket are offline, the free_feature() is called to free
features resources so that the values saved in cos_reg_val[] are lost. When the
socket is online again, features are allocated again so that cos_reg_val[]
members are all initialized to 0. Only is cos_reg_val[0] initialized to default
value in this function in old codes.

But domain is still alive so that its cos id on the socket is kept. The
corresponding MSR value is kept too per test. To make cos_reg_val[] values be
same as HW to not to mislead user, we should read back the valid values on HW
into cos_reg_val[].

> > +    feat->info.cat_info = cat;
> > +    feat->cos_num = CAT_COS_NUM;
> > +
> > +    /* Add this feature into array. */
> > +    info->features[type] = feat;
> > +
> > +    ASSERT(!test_bit(type, &info->feat_mask));
> > +    __set_bit(type, &info->feat_mask);
> 
>     if ( __test_and_set_bit(type, &info->feat_mask) )
>         ASSERT_UNREACHABLE();
> 
Thanks!

> > +    socket = cpu_to_socket(smp_processor_id());
> > +    if ( !opt_cpu_info )
> > +        return;
> > +
> > +    printk(XENLOG_INFO "%s CAT: enabled on socket %u, cos_max:%u, cbm_len:%u\n",
> > +           ((type == PSR_SOCKET_L3_CAT) ? "L3" : "L2"),
> > +           socket, feat->info.cat_info.cos_max,
> > +           feat->info.cat_info.cbm_len);
> > +
> > +    return;
> 
> Pointless statement at end of function.
> 
Will remove it.

> > +/* L3 CAT ops */
> > +static const struct feat_ops l3_cat_ops = {
> > +};
> 
> Leaving an already declared function pointer as NULL? Please don't.
> 
Ok, will consider to move it and below code into later patch.
    feat->ops = l3_cat_ops;

> >  static void psr_cpu_init(void)
> >  {
> > +    struct psr_socket_info *info;
> > +    unsigned int socket, i;
> > +    unsigned int cpu = smp_processor_id();
> > +    struct feat_node *feat;
> > +    struct cpuid_leaf regs;
> > +
> > +    if ( !psr_alloc_feat_enabled() || !boot_cpu_has(X86_FEATURE_PQE) )
> > +        goto assoc_init;
> > +
> > +    if ( boot_cpu_data.cpuid_level < PSR_CPUID_LEVEL_CAT )
> > +    {
> > +        setup_clear_cpu_cap(X86_FEATURE_PQE);
> > +        goto assoc_init;
> > +    }
> > +
> > +    socket = cpu_to_socket(cpu);
> > +    info = socket_info + socket;
> > +    if ( info->feat_mask )
> > +        goto assoc_init;
> > +
> > +    for ( i = 0; i < PSR_SOCKET_MAX_FEAT; i++ )
> > +        info->features[i] = NULL;
> 
> You've xzalloc()ed this memory - why do you need this loop?
> 
Hmm, no need indeed. Will remove this.

> > +    spin_lock_init(&info->ref_lock);
> > +
> > +    cpuid_count_leaf(PSR_CPUID_LEVEL_CAT, 0, &regs);
> > +    if ( regs.b & PSR_RESOURCE_TYPE_L3 )
> > +    {
> > +        cpuid_count_leaf(PSR_CPUID_LEVEL_CAT, 1, &regs);
> > +
> > +        feat = feat_l3_cat;
> > +        feat_l3_cat = NULL;
> > +        feat->ops = l3_cat_ops;
> > +
> > +        cat_init_feature(regs, feat, info, PSR_SOCKET_L3_CAT);
> > +    }
> > +
> > +assoc_init:
> 
> Labels indented by at least on space please.
> 
Got it, thanks!

> Jan
Jan Beulich March 27, 2017, 6:34 a.m. UTC | #3
>>> On 27.03.17 at 06:41, <yi.y.sun@linux.intel.com> wrote:
> On 17-03-24 10:52:34, Jan Beulich wrote:
>> >>> On 16.03.17 at 12:07, <yi.y.sun@linux.intel.com> wrote:
>> > @@ -46,6 +50,9 @@
>> >   */
>> >  #define MAX_COS_REG_CNT  128
>> >  
>> > +/* CAT features use 1 COS register in one access. */
>> > +#define CAT_COS_NUM      1
>> 
>> With it being stored into the feature node now I don't see why you
>> need this constant anymore. And indeed it's being used exactly
>> once.
>> 
> I remember somebody suggested me not to use constant but should define a
> macro. As it is only used once, I will remove this and 'CDP_COS_NUM' in
> later patch.

It may well have been me, back when this was used in multiple places.

>> > +/*
>> > + * Use this function to check if any allocation feature has been enabled
>> > + * in cmdline.
>> > + */
>> > +static bool psr_alloc_feat_enabled(void)
>> > +{
>> > +    return ((!socket_info) ? false : true );
>> 
>> Stray parentheses (all of them actually) and blank. Even more, why
>> not simply
>> 
>>     return socket_info;
>> 
>> ?
>> 
> How about 'return !!socket_info'?

And what would the !! be good for? Back when we were still using
bool_t that would have been a requirement (the code wouldn't
even have built without afaict), but now that we use bool I don't
see the point (other that cluttering code). In fact I consider the
presence of the function questionable as a whole, unless later
patches add to it.

>> > +                             struct feat_node *feat,
>> > +                             struct psr_socket_info *info,
>> > +                             enum psr_feat_type type)
>> > +{
>> > +    unsigned int socket, i;
>> > +    struct psr_cat_hw_info cat = { };
>> > +    uint64_t val;
>> > +
>> > +    /* No valid value so do not enable feature. */
>> > +    if ( !regs.a || !regs.d )
>> > +        return;
>> > +
>> > +    cat.cbm_len = (regs.a & CAT_CBM_LEN_MASK) + 1;
>> > +    cat.cos_max = min(opt_cos_max, regs.d & CAT_COS_MAX_MASK);
>> > +
>> > +    /* cos=0 is reserved as default cbm(all bits within cbm_len are 1). */
>> > +    feat->cos_reg_val[0] = cat_default_val(cat.cbm_len);
>> > +    /*
>> > +     * To handle cpu offline and then online case, we need read MSRs back to
>> > +     * save values into cos_reg_val array.
>> > +     */
>> > +    for ( i = 1; i <= cat.cos_max; i++ )
>> > +    {
>> > +        rdmsrl(MSR_IA32_PSR_L3_MASK(i), val);
>> > +        feat->cos_reg_val[i] = (uint32_t)val;
>> > +    }
>> 
>> You mention this in the changes done, but I don't understand why
>> you do this. What meaning to these values have to you? If you
>> want hardware and cached values to match up, the much more
>> conventional way of enforcing this would be to write the values
>> you actually want (normally all zero).
>> 
> When all cpus on a socket are offline, the free_feature() is called to free
> features resources so that the values saved in cos_reg_val[] are lost. When the
> socket is online again, features are allocated again so that cos_reg_val[]
> members are all initialized to 0. Only is cos_reg_val[0] initialized to default
> value in this function in old codes.
> 
> But domain is still alive so that its cos id on the socket is kept. The
> corresponding MSR value is kept too per test. To make cos_reg_val[] values be
> same as HW to not to mislead user, we should read back the valid values on HW
> into cos_reg_val[].

Okay, I understand the background, but I don't view this solution
as viable: Once the last core on a socket goes offline, all
references to it should be cleaned up. After all what will be
brought back online may be a different physical CPU altogether;
you can't assume MSR values to have survived even if it is the
same CPU which comes back online, as it may have undergone
a reset cycle, or BIOS/SMM may have played with the MSRs.
That's even a possibility for a single core coming back online, so
you have to reload MSRs explicitly anyway if implicit reloading
(i.e. once vCPU-s get scheduled onto it) doesn't suffice.

>> > +/* L3 CAT ops */
>> > +static const struct feat_ops l3_cat_ops = {
>> > +};
>> 
>> Leaving an already declared function pointer as NULL? Please don't.
>> 
> Ok, will consider to move it and below code into later patch.
>     feat->ops = l3_cat_ops;

I don't mind the empty structure instance above, as long as the
structure doesn't have any function pointer members yet (data
members are almost always fine).

Jan
Yi Sun March 27, 2017, 8:16 a.m. UTC | #4
On 17-03-27 00:34:29, Jan Beulich wrote:
> >>> On 27.03.17 at 06:41, <yi.y.sun@linux.intel.com> wrote:
> > On 17-03-24 10:52:34, Jan Beulich wrote:
> >> >>> On 16.03.17 at 12:07, <yi.y.sun@linux.intel.com> wrote:
> >> > @@ -46,6 +50,9 @@
> >> >   */
> >> >  #define MAX_COS_REG_CNT  128
> >> >  
> >> > +/* CAT features use 1 COS register in one access. */
> >> > +#define CAT_COS_NUM      1
> >> 
> >> With it being stored into the feature node now I don't see why you
> >> need this constant anymore. And indeed it's being used exactly
> >> once.
> >> 
> > I remember somebody suggested me not to use constant but should define a
> > macro. As it is only used once, I will remove this and 'CDP_COS_NUM' in
> > later patch.
> 
> It may well have been me, back when this was used in multiple places.
> 
Ok, I got it. Will remove such macros.

> >> > +/*
> >> > + * Use this function to check if any allocation feature has been enabled
> >> > + * in cmdline.
> >> > + */
> >> > +static bool psr_alloc_feat_enabled(void)
> >> > +{
> >> > +    return ((!socket_info) ? false : true );
> >> 
> >> Stray parentheses (all of them actually) and blank. Even more, why
> >> not simply
> >> 
> >>     return socket_info;
> >> 
> >> ?
> >> 
> > How about 'return !!socket_info'?
> 
> And what would the !! be good for? Back when we were still using
> bool_t that would have been a requirement (the code wouldn't
> even have built without afaict), but now that we use bool I don't
> see the point (other that cluttering code). In fact I consider the
> presence of the function questionable as a whole, unless later
> patches add to it.
> 
Per Wei's suggestion, I added this function to make readers clearly understand
the meaning of the code. In previous codes, we just check 'if ( !socket_info )'.

Per test, 'return socket_info' causes warning if function type is 'bool'.

> >> > +                             struct feat_node *feat,
> >> > +                             struct psr_socket_info *info,
> >> > +                             enum psr_feat_type type)
> >> > +{
> >> > +    unsigned int socket, i;
> >> > +    struct psr_cat_hw_info cat = { };
> >> > +    uint64_t val;
> >> > +
> >> > +    /* No valid value so do not enable feature. */
> >> > +    if ( !regs.a || !regs.d )
> >> > +        return;
> >> > +
> >> > +    cat.cbm_len = (regs.a & CAT_CBM_LEN_MASK) + 1;
> >> > +    cat.cos_max = min(opt_cos_max, regs.d & CAT_COS_MAX_MASK);
> >> > +
> >> > +    /* cos=0 is reserved as default cbm(all bits within cbm_len are 1). */
> >> > +    feat->cos_reg_val[0] = cat_default_val(cat.cbm_len);
> >> > +    /*
> >> > +     * To handle cpu offline and then online case, we need read MSRs back to
> >> > +     * save values into cos_reg_val array.
> >> > +     */
> >> > +    for ( i = 1; i <= cat.cos_max; i++ )
> >> > +    {
> >> > +        rdmsrl(MSR_IA32_PSR_L3_MASK(i), val);
> >> > +        feat->cos_reg_val[i] = (uint32_t)val;
> >> > +    }
> >> 
> >> You mention this in the changes done, but I don't understand why
> >> you do this. What meaning to these values have to you? If you
> >> want hardware and cached values to match up, the much more
> >> conventional way of enforcing this would be to write the values
> >> you actually want (normally all zero).
> >> 
> > When all cpus on a socket are offline, the free_feature() is called to free
> > features resources so that the values saved in cos_reg_val[] are lost. When the
> > socket is online again, features are allocated again so that cos_reg_val[]
> > members are all initialized to 0. Only is cos_reg_val[0] initialized to default
> > value in this function in old codes.
> > 
> > But domain is still alive so that its cos id on the socket is kept. The
> > corresponding MSR value is kept too per test. To make cos_reg_val[] values be
> > same as HW to not to mislead user, we should read back the valid values on HW
> > into cos_reg_val[].
> 
> Okay, I understand the background, but I don't view this solution
> as viable: Once the last core on a socket goes offline, all
> references to it should be cleaned up. After all what will be
> brought back online may be a different physical CPU altogether;
> you can't assume MSR values to have survived even if it is the
> same CPU which comes back online, as it may have undergone
> a reset cycle, or BIOS/SMM may have played with the MSRs.
> That's even a possibility for a single core coming back online, so
> you have to reload MSRs explicitly anyway if implicit reloading
> (i.e. once vCPU-s get scheduled onto it) doesn't suffice.
> 
So, you think the MSRs values may not be valid after such process and
reloading (write MSRs to default value) is needed. If so, I would like
to do more operations in 'free_feature()':
1. Iterate all domains working on the offline socket to change
   'd->arch.psr_cos_ids[socket]' to COS 0, i.e restore it back to init
   status.
2. Restore 'socket_info[socket].cos_ref[]' to all 0.

These can make the socket's info be totally restored back to init status.

How do you think? Thanks!

> >> > +/* L3 CAT ops */
> >> > +static const struct feat_ops l3_cat_ops = {
> >> > +};
> >> 
> >> Leaving an already declared function pointer as NULL? Please don't.
> >> 
> > Ok, will consider to move it and below code into later patch.
> >     feat->ops = l3_cat_ops;
> 
> I don't mind the empty structure instance above, as long as the
> structure doesn't have any function pointer members yet (data
> members are almost always fine).
> 
To explain how the data structures are, I declared '(*get_cos_max)' in
'struct feat_ops' in patch 3. So, do you mind I remove this declaration
and just keep an empty 'struct feat_ops' in patch 3 so that we can keep
current codes in this patch?

> Jan
Jan Beulich March 27, 2017, 8:43 a.m. UTC | #5
>>> On 27.03.17 at 10:16, <yi.y.sun@linux.intel.com> wrote:
> On 17-03-27 00:34:29, Jan Beulich wrote:
>> >>> On 27.03.17 at 06:41, <yi.y.sun@linux.intel.com> wrote:
>> > On 17-03-24 10:52:34, Jan Beulich wrote:
>> >> >>> On 16.03.17 at 12:07, <yi.y.sun@linux.intel.com> wrote:
>> >> > @@ -46,6 +50,9 @@
>> >> >   */
>> >> >  #define MAX_COS_REG_CNT  128
>> >> >  
>> >> > +/* CAT features use 1 COS register in one access. */
>> >> > +#define CAT_COS_NUM      1
>> >> 
>> >> With it being stored into the feature node now I don't see why you
>> >> need this constant anymore. And indeed it's being used exactly
>> >> once.
>> >> 
>> > I remember somebody suggested me not to use constant but should define a
>> > macro. As it is only used once, I will remove this and 'CDP_COS_NUM' in
>> > later patch.
>> 
>> It may well have been me, back when this was used in multiple places.
>> 
> Ok, I got it. Will remove such macros.
> 
>> >> > +/*
>> >> > + * Use this function to check if any allocation feature has been enabled
>> >> > + * in cmdline.
>> >> > + */
>> >> > +static bool psr_alloc_feat_enabled(void)
>> >> > +{
>> >> > +    return ((!socket_info) ? false : true );
>> >> 
>> >> Stray parentheses (all of them actually) and blank. Even more, why
>> >> not simply
>> >> 
>> >>     return socket_info;
>> >> 
>> >> ?
>> >> 
>> > How about 'return !!socket_info'?
>> 
>> And what would the !! be good for? Back when we were still using
>> bool_t that would have been a requirement (the code wouldn't
>> even have built without afaict), but now that we use bool I don't
>> see the point (other that cluttering code). In fact I consider the
>> presence of the function questionable as a whole, unless later
>> patches add to it.
>> 
> Per Wei's suggestion, I added this function to make readers clearly 
> understand
> the meaning of the code. In previous codes, we just check 'if ( !socket_info )'.
> 
> Per test, 'return socket_info' causes warning if function type is 'bool'.

Oh, that is unfortunate (and then indeed requires to use !!).
I would have expected that conversion here works just like in
if(), where no !! would be needed.

>> >> > +                             struct feat_node *feat,
>> >> > +                             struct psr_socket_info *info,
>> >> > +                             enum psr_feat_type type)
>> >> > +{
>> >> > +    unsigned int socket, i;
>> >> > +    struct psr_cat_hw_info cat = { };
>> >> > +    uint64_t val;
>> >> > +
>> >> > +    /* No valid value so do not enable feature. */
>> >> > +    if ( !regs.a || !regs.d )
>> >> > +        return;
>> >> > +
>> >> > +    cat.cbm_len = (regs.a & CAT_CBM_LEN_MASK) + 1;
>> >> > +    cat.cos_max = min(opt_cos_max, regs.d & CAT_COS_MAX_MASK);
>> >> > +
>> >> > +    /* cos=0 is reserved as default cbm(all bits within cbm_len are 1). */
>> >> > +    feat->cos_reg_val[0] = cat_default_val(cat.cbm_len);
>> >> > +    /*
>> >> > +     * To handle cpu offline and then online case, we need read MSRs back to
>> >> > +     * save values into cos_reg_val array.
>> >> > +     */
>> >> > +    for ( i = 1; i <= cat.cos_max; i++ )
>> >> > +    {
>> >> > +        rdmsrl(MSR_IA32_PSR_L3_MASK(i), val);
>> >> > +        feat->cos_reg_val[i] = (uint32_t)val;
>> >> > +    }
>> >> 
>> >> You mention this in the changes done, but I don't understand why
>> >> you do this. What meaning to these values have to you? If you
>> >> want hardware and cached values to match up, the much more
>> >> conventional way of enforcing this would be to write the values
>> >> you actually want (normally all zero).
>> >> 
>> > When all cpus on a socket are offline, the free_feature() is called to free
>> > features resources so that the values saved in cos_reg_val[] are lost. When the
>> > socket is online again, features are allocated again so that cos_reg_val[]
>> > members are all initialized to 0. Only is cos_reg_val[0] initialized to default
>> > value in this function in old codes.
>> > 
>> > But domain is still alive so that its cos id on the socket is kept. The
>> > corresponding MSR value is kept too per test. To make cos_reg_val[] values be
>> > same as HW to not to mislead user, we should read back the valid values on HW
>> > into cos_reg_val[].
>> 
>> Okay, I understand the background, but I don't view this solution
>> as viable: Once the last core on a socket goes offline, all
>> references to it should be cleaned up. After all what will be
>> brought back online may be a different physical CPU altogether;
>> you can't assume MSR values to have survived even if it is the
>> same CPU which comes back online, as it may have undergone
>> a reset cycle, or BIOS/SMM may have played with the MSRs.
>> That's even a possibility for a single core coming back online, so
>> you have to reload MSRs explicitly anyway if implicit reloading
>> (i.e. once vCPU-s get scheduled onto it) doesn't suffice.
>> 
> So, you think the MSRs values may not be valid after such process and
> reloading (write MSRs to default value) is needed. If so, I would like
> to do more operations in 'free_feature()':
> 1. Iterate all domains working on the offline socket to change
>    'd->arch.psr_cos_ids[socket]' to COS 0, i.e restore it back to init
>    status.
> 2. Restore 'socket_info[socket].cos_ref[]' to all 0.
> 
> These can make the socket's info be totally restored back to init status.

Yes, that's what I think is needed.

>> >> > +/* L3 CAT ops */
>> >> > +static const struct feat_ops l3_cat_ops = {
>> >> > +};
>> >> 
>> >> Leaving an already declared function pointer as NULL? Please don't.
>> >> 
>> > Ok, will consider to move it and below code into later patch.
>> >     feat->ops = l3_cat_ops;
>> 
>> I don't mind the empty structure instance above, as long as the
>> structure doesn't have any function pointer members yet (data
>> members are almost always fine).
>> 
> To explain how the data structures are, I declared '(*get_cos_max)' in
> 'struct feat_ops' in patch 3. So, do you mind I remove this declaration
> and just keep an empty 'struct feat_ops' in patch 3 so that we can keep
> current codes in this patch?

As said, I have no problem with the structure remaining empty
until subsequent patches start filling it. No need to re-structure
several patches.

Jan
diff mbox

Patch

diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index 822f1c0..66a9ce8 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -18,6 +18,7 @@ 
 #include <xen/init.h>
 #include <xen/sched.h>
 #include <asm/psr.h>
+#include <asm/x86_emulate.h>
 
 /*
  * Terminology:
@@ -34,6 +35,9 @@ 
 #define PSR_CAT        (1<<1)
 #define PSR_CDP        (1<<2)
 
+#define CAT_CBM_LEN_MASK 0x1f
+#define CAT_COS_MAX_MASK 0xffff
+
 /*
  * Per SDM chapter 'Cache Allocation Technology: Cache Mask Configuration',
  * the MSRs ranging from 0C90H through 0D0FH (inclusive), enables support for
@@ -46,6 +50,9 @@ 
  */
 #define MAX_COS_REG_CNT  128
 
+/* CAT features use 1 COS register in one access. */
+#define CAT_COS_NUM      1
+
 enum psr_feat_type {
     PSR_SOCKET_L3_CAT = 0,
     PSR_SOCKET_L3_CDP,
@@ -126,11 +133,110 @@  struct psr_assoc {
 
 struct psr_cmt *__read_mostly psr_cmt;
 
+static struct psr_socket_info *__read_mostly socket_info;
+
 static unsigned int opt_psr;
 static unsigned int __initdata opt_rmid_max = 255;
+static unsigned int __read_mostly opt_cos_max = MAX_COS_REG_CNT;
 static uint64_t rmid_mask;
 static DEFINE_PER_CPU(struct psr_assoc, psr_assoc);
 
+/*
+ * Declare global feature node for every feature to facilitate the feature
+ * array creation. It is used to transiently store a spare node.
+ */
+static struct feat_node *feat_l3_cat;
+
+/* Common functions */
+#define cat_default_val(len)                 \
+            ( (uint32_t)((1ul << len) - 1) )
+
+/*
+ * Use this function to check if any allocation feature has been enabled
+ * in cmdline.
+ */
+static bool psr_alloc_feat_enabled(void)
+{
+    return ((!socket_info) ? false : true );
+}
+
+static void free_feature(struct psr_socket_info *info)
+{
+    unsigned int i;
+
+    if ( !info )
+        return;
+
+    /*
+     * Free resources of features. The global feature object, e.g. feat_l3_cat,
+     * may not be freed here if it is not added into array. It is simply being
+     * kept until the next CPU online attempt.
+     */
+    for ( i = 0; i < PSR_SOCKET_MAX_FEAT; i++ )
+    {
+        if ( !info->features[i] )
+            continue;
+
+        xfree(info->features[i]);
+        info->features[i] = NULL;
+        __clear_bit(i, &info->feat_mask);
+    }
+}
+
+/* CAT common functions implementation. */
+static void cat_init_feature(struct cpuid_leaf regs,
+                             struct feat_node *feat,
+                             struct psr_socket_info *info,
+                             enum psr_feat_type type)
+{
+    unsigned int socket, i;
+    struct psr_cat_hw_info cat = { };
+    uint64_t val;
+
+    /* No valid value so do not enable feature. */
+    if ( !regs.a || !regs.d )
+        return;
+
+    cat.cbm_len = (regs.a & CAT_CBM_LEN_MASK) + 1;
+    cat.cos_max = min(opt_cos_max, regs.d & CAT_COS_MAX_MASK);
+
+    /* cos=0 is reserved as default cbm(all bits within cbm_len are 1). */
+    feat->cos_reg_val[0] = cat_default_val(cat.cbm_len);
+    /*
+     * To handle cpu offline and then online case, we need read MSRs back to
+     * save values into cos_reg_val array.
+     */
+    for ( i = 1; i <= cat.cos_max; i++ )
+    {
+        rdmsrl(MSR_IA32_PSR_L3_MASK(i), val);
+        feat->cos_reg_val[i] = (uint32_t)val;
+    }
+
+    feat->info.cat_info = cat;
+    feat->cos_num = CAT_COS_NUM;
+
+    /* Add this feature into array. */
+    info->features[type] = feat;
+
+    ASSERT(!test_bit(type, &info->feat_mask));
+    __set_bit(type, &info->feat_mask);
+
+    socket = cpu_to_socket(smp_processor_id());
+    if ( !opt_cpu_info )
+        return;
+
+    printk(XENLOG_INFO "%s CAT: enabled on socket %u, cos_max:%u, cbm_len:%u\n",
+           ((type == PSR_SOCKET_L3_CAT) ? "L3" : "L2"),
+           socket, feat->info.cat_info.cos_max,
+           feat->info.cat_info.cbm_len);
+
+    return;
+}
+
+/* L3 CAT ops */
+static const struct feat_ops l3_cat_ops = {
+};
+
 static void __init parse_psr_bool(char *s, char *value, char *feature,
                                   unsigned int mask)
 {
@@ -170,6 +276,9 @@  static void __init parse_psr_param(char *s)
         if ( val_str && !strcmp(s, "rmid_max") )
             opt_rmid_max = simple_strtoul(val_str, NULL, 0);
 
+        if ( val_str && !strcmp(s, "cos_max") )
+            opt_cos_max = simple_strtoul(val_str, NULL, 0);
+
         s = ss + 1;
     } while ( ss );
 }
@@ -325,19 +434,98 @@  void psr_domain_free(struct domain *d)
     psr_free_rmid(d);
 }
 
-static int psr_cpu_prepare(unsigned int cpu)
+static void __init init_psr(void)
 {
+    if ( opt_cos_max < 1 )
+    {
+        printk(XENLOG_INFO "CAT: disabled, cos_max is too small\n");
+        return;
+    }
+
+    socket_info = xzalloc_array(struct psr_socket_info, nr_sockets);
+
+    if ( !socket_info )
+    {
+        printk(XENLOG_INFO "Failed to alloc socket_info!\n");
+        return;
+    }
+}
+
+static void __init psr_free(void)
+{
+    xfree(socket_info);
+    socket_info = NULL;
+}
+
+static int psr_cpu_prepare(void)
+{
+    if ( !psr_alloc_feat_enabled() )
+        return 0;
+
+    /* Malloc memory for the global feature head here. */
+    if ( feat_l3_cat == NULL &&
+         (feat_l3_cat = xzalloc(struct feat_node)) == NULL )
+        return -ENOMEM;
+
     return 0;
 }
 
 static void psr_cpu_init(void)
 {
+    struct psr_socket_info *info;
+    unsigned int socket, i;
+    unsigned int cpu = smp_processor_id();
+    struct feat_node *feat;
+    struct cpuid_leaf regs;
+
+    if ( !psr_alloc_feat_enabled() || !boot_cpu_has(X86_FEATURE_PQE) )
+        goto assoc_init;
+
+    if ( boot_cpu_data.cpuid_level < PSR_CPUID_LEVEL_CAT )
+    {
+        setup_clear_cpu_cap(X86_FEATURE_PQE);
+        goto assoc_init;
+    }
+
+    socket = cpu_to_socket(cpu);
+    info = socket_info + socket;
+    if ( info->feat_mask )
+        goto assoc_init;
+
+    for ( i = 0; i < PSR_SOCKET_MAX_FEAT; i++ )
+        info->features[i] = NULL;
+
+    spin_lock_init(&info->ref_lock);
+
+    cpuid_count_leaf(PSR_CPUID_LEVEL_CAT, 0, &regs);
+    if ( regs.b & PSR_RESOURCE_TYPE_L3 )
+    {
+        cpuid_count_leaf(PSR_CPUID_LEVEL_CAT, 1, &regs);
+
+        feat = feat_l3_cat;
+        feat_l3_cat = NULL;
+        feat->ops = l3_cat_ops;
+
+        cat_init_feature(regs, feat, info, PSR_SOCKET_L3_CAT);
+    }
+
+assoc_init:
     psr_assoc_init();
 }
 
 static void psr_cpu_fini(unsigned int cpu)
 {
-    return;
+    unsigned int socket = cpu_to_socket(cpu);
+
+    if ( !psr_alloc_feat_enabled() )
+        return;
+
+    /*
+     * We only free when we are the last CPU in the socket. The socket_cpumask
+     * is cleared prior to this notification code by remove_siblinginfo().
+     */
+    if ( socket_cpumask[socket] && cpumask_empty(socket_cpumask[socket]) )
+        free_feature(socket_info + socket);
 }
 
 static int cpu_callback(
@@ -349,7 +537,7 @@  static int cpu_callback(
     switch ( action )
     {
     case CPU_UP_PREPARE:
-        rc = psr_cpu_prepare(cpu);
+        rc = psr_cpu_prepare();
         break;
     case CPU_STARTING:
         psr_cpu_init();
@@ -378,10 +566,14 @@  static int __init psr_presmp_init(void)
     if ( (opt_psr & PSR_CMT) && opt_rmid_max )
         init_psr_cmt(opt_rmid_max);
 
-    psr_cpu_prepare(0);
+    if ( opt_psr & PSR_CAT )
+        init_psr();
+
+    if ( psr_cpu_prepare() )
+        psr_free();
 
     psr_cpu_init();
-    if ( psr_cmt_enabled() )
+    if ( psr_cmt_enabled() || psr_alloc_feat_enabled() )
         register_cpu_notifier(&cpu_nfb);
 
     return 0;