Message ID | 20191010135004.24226-1-al1img@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v1] Reset iomem's gfn to LIBXL_INVALID_GFN on reboot | expand |
Oleksandr Grytsov writes ("[PATCH v1] Reset iomem's gfn to LIBXL_INVALID_GFN on reboot"): > During domain reboot its configuration is partially reused > to re-create a new domain, but iomem's GFN field for the > iomem is only restored for those memory ranges, which are > configured in form of [IOMEM_START,NUM_PAGES[@GFN], but not for > those in form of [IOMEM_START,NUM_PAGES], e.g. without GFN. > For the latter GFN is reset to 0, but while mapping ranges > to a domain during reboot there is a check that GFN treated > as valid if it is not equal to LIBXL_INVALID_GFN, thus making > Xen to map IOMEM_START to address 0 in the guest's address space. > > Workaround it by resseting GFN to LIBXL_INVALID_GFN, so xl > can set proper values for mapping on reboot. Thanks for this patch. I confess that I am not sure what is going on here. Where is this troublesome 0 coming from ? I see that the default value for gfn in the struct is 0 and looking for assignments before this patch, gfn is defaulted from b_info->iomem[i].start, which is presumably non-0. I suspect that your patch may be fixing this the wrong way. I have addressed this mail to various people who have touched this area of code and hope they will be able to clarify. BTW, please do ping this (and your other patches) by email, if the conversation seems to stall. Thanks, Ian. > Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com> > --- > tools/libxl/libxl_domain.c | 9 +++++++++ > 1 file changed, 9 insertions(+) > > diff --git a/tools/libxl/libxl_domain.c b/tools/libxl/libxl_domain.c > index 9d0eb5aed1..0ae16a5b12 100644 > --- a/tools/libxl/libxl_domain.c > +++ b/tools/libxl/libxl_domain.c > @@ -2120,6 +2120,15 @@ static void retrieve_domain_configuration_end(libxl__egc *egc, > } > } > > + /* reset IOMEM's GFN to initial value */ > + { > + int i; > + > + for (i = 0; i < d_config->b_info.num_iomem; i++) > + if (d_config->b_info.iomem[i].gfn == 0) > + d_config->b_info.iomem[i].gfn = LIBXL_INVALID_GFN; > + } > + > /* Devices: disk, nic, vtpm, pcidev etc. */ > > /* The MERGE macro implements following logic: > -- > 2.17.1 >
Hi Ian, On 11/10/2019 16:14, Ian Jackson wrote: > Oleksandr Grytsov writes ("[PATCH v1] Reset iomem's gfn to LIBXL_INVALID_GFN on reboot"): >> During domain reboot its configuration is partially reused >> to re-create a new domain, but iomem's GFN field for the >> iomem is only restored for those memory ranges, which are >> configured in form of [IOMEM_START,NUM_PAGES[@GFN], but not for >> those in form of [IOMEM_START,NUM_PAGES], e.g. without GFN. >> For the latter GFN is reset to 0, but while mapping ranges >> to a domain during reboot there is a check that GFN treated >> as valid if it is not equal to LIBXL_INVALID_GFN, thus making >> Xen to map IOMEM_START to address 0 in the guest's address space. >> >> Workaround it by resseting GFN to LIBXL_INVALID_GFN, so xl >> can set proper values for mapping on reboot. > > Thanks for this patch. > > I confess that I am not sure what is going on here. Where is this > troublesome 0 coming from ? I see that the default value for gfn in > the struct is 0 and looking for assignments before this patch, gfn is > defaulted from b_info->iomem[i].start, which is presumably non-0. > > I suspect that your patch may be fixing this the wrong way. I have > addressed this mail to various people who have touched this area of > code and hope they will be able to clarify. I found a thread from December 2017 related to the problem described here [1]. Looking at the thread, there were no conclusion on the root causes and some questions were left unanswered by the contributor (see [2]). Oleksandr, could you look at the thread and see if you can provide more details what's going on? Answering back here would be fine. > > BTW, please do ping this (and your other patches) by email, if the > conversation seems to stall. > > Thanks, > Ian. > >> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com> >> --- >> tools/libxl/libxl_domain.c | 9 +++++++++ >> 1 file changed, 9 insertions(+) >> >> diff --git a/tools/libxl/libxl_domain.c b/tools/libxl/libxl_domain.c >> index 9d0eb5aed1..0ae16a5b12 100644 >> --- a/tools/libxl/libxl_domain.c >> +++ b/tools/libxl/libxl_domain.c >> @@ -2120,6 +2120,15 @@ static void retrieve_domain_configuration_end(libxl__egc *egc, >> } >> } >> >> + /* reset IOMEM's GFN to initial value */ >> + { >> + int i; >> + >> + for (i = 0; i < d_config->b_info.num_iomem; i++) >> + if (d_config->b_info.iomem[i].gfn == 0) >> + d_config->b_info.iomem[i].gfn = LIBXL_INVALID_GFN; >> + } >> + >> /* Devices: disk, nic, vtpm, pcidev etc. */ >> >> /* The MERGE macro implements following logic: >> -- >> 2.17.1 >> Cheers, [1] <ebf78aec-dcfd-72d9-dac2-06b29e4a66ae@gmail.com> [2] <20180213122432.h4fh22ej4dfe7226@citrix.com>
On Mon, Oct 14, 2019 at 12:28 PM Julien Grall <julien.grall@arm.com> wrote: > > Hi Ian, > > On 11/10/2019 16:14, Ian Jackson wrote: > > Oleksandr Grytsov writes ("[PATCH v1] Reset iomem's gfn to LIBXL_INVALID_GFN on reboot"): > >> During domain reboot its configuration is partially reused > >> to re-create a new domain, but iomem's GFN field for the > >> iomem is only restored for those memory ranges, which are > >> configured in form of [IOMEM_START,NUM_PAGES[@GFN], but not for > >> those in form of [IOMEM_START,NUM_PAGES], e.g. without GFN. > >> For the latter GFN is reset to 0, but while mapping ranges > >> to a domain during reboot there is a check that GFN treated > >> as valid if it is not equal to LIBXL_INVALID_GFN, thus making > >> Xen to map IOMEM_START to address 0 in the guest's address space. > >> > >> Workaround it by resseting GFN to LIBXL_INVALID_GFN, so xl > >> can set proper values for mapping on reboot. > > > > Thanks for this patch. > > > > I confess that I am not sure what is going on here. Where is this > > troublesome 0 coming from ? I see that the default value for gfn in > > the struct is 0 and looking for assignments before this patch, gfn is > > defaulted from b_info->iomem[i].start, which is presumably non-0. > > > > I suspect that your patch may be fixing this the wrong way. I have > > addressed this mail to various people who have touched this area of > > code and hope they will be able to clarify. > > I found a thread from December 2017 related to the problem described here [1]. > > Looking at the thread, there were no conclusion on the root causes and some > questions were left unanswered by the contributor (see [2]). > > Oleksandr, could you look at the thread and see if you can provide more details > what's going on? Answering back here would be fine. > > > > > BTW, please do ping this (and your other patches) by email, if the > > conversation seems to stall. > > > > Thanks, > > Ian. > > > >> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com> > >> --- > >> tools/libxl/libxl_domain.c | 9 +++++++++ > >> 1 file changed, 9 insertions(+) > >> > >> diff --git a/tools/libxl/libxl_domain.c b/tools/libxl/libxl_domain.c > >> index 9d0eb5aed1..0ae16a5b12 100644 > >> --- a/tools/libxl/libxl_domain.c > >> +++ b/tools/libxl/libxl_domain.c > >> @@ -2120,6 +2120,15 @@ static void retrieve_domain_configuration_end(libxl__egc *egc, > >> } > >> } > >> > >> + /* reset IOMEM's GFN to initial value */ > >> + { > >> + int i; > >> + > >> + for (i = 0; i < d_config->b_info.num_iomem; i++) > >> + if (d_config->b_info.iomem[i].gfn == 0) > >> + d_config->b_info.iomem[i].gfn = LIBXL_INVALID_GFN; > >> + } > >> + > >> /* Devices: disk, nic, vtpm, pcidev etc. */ > >> > >> /* The MERGE macro implements following logic: > >> -- > >> 2.17.1 > >> > > Cheers, > > [1] <ebf78aec-dcfd-72d9-dac2-06b29e4a66ae@gmail.com> > [2] <20180213122432.h4fh22ej4dfe7226@citrix.com> > > -- > Julien Grall Julien, Thanks to point me out for this old thread. I completely forgot about it (I haven't worked with xen since long time). I've performed additional investigation and found the root cause of the issue. It doesn't relate to iomem GFN directly. The problem is in type from json parsing at place where libxl creates array of struct. For example, libxl_domain_config_from_json calls libxl_domain_config_init which initializes all child structures and arrays. But then when libxl parses json and creates the array of structure, it doesn't initialize array elements properly (see libxl__domain_build_info_parse_json iomem parsing): p->num_iomem = x->u.array->count; p->iomem = libxl__calloc(NOGC, p->num_iomem, sizeof(*p->iomem)); if (!p->iomem && p->num_iomem != 0) { rc = -1; goto out; } for (i=0; (t=libxl__json_array_get(x,i)); i++) { rc = libxl__iomem_range_parse_json(gc, t, &p->iomem[i]); if (rc) goto out; } libxl creates array element with calloc function, so all element fields are initialized with zero values. Even some of them have default value different from zero. For these purpose dedicated init function should be called for each element. Above example should be: for (i=0; (t=libxl__json_array_get(x,i)); i++) { libxl_iomem_range_init(&p->iomem[i]); rc = libxl__iomem_range_parse_json(gc, t, &p->iomem[i]); if (rc) goto out; } I've changes gentypes.py as following: diff --git a/tools/libxl/gentypes.py b/tools/libxl/gentypes.py index 88e5c5f30e..92e28be469 100644 --- a/tools/libxl/gentypes.py +++ b/tools/libxl/gentypes.py @@ -454,6 +454,8 @@ def libxl_C_type_parse_json(ty, w, v, indent = " ", parent = None, discrimina s += " goto out;\n" s += " }\n" s += " for (i=0; (t=libxl__json_array_get(x,i)); i++) {\n" + if ty.elem_type.init_fn is not None and ty.elem_type.autogenerate_init_fn: + s += indent + " "+"%s_init(&%s[i]);\n" % (ty.elem_type.typename, v) s += libxl_C_type_parse_json(ty.elem_type, "t", v+"[i]", indent + " ", parent) s += " }\n" I'm not sure is it right and complete fix. Ian, could you review? If the fix is ok, I will submit the patch. Thanks.
Hi Oleksandr, Apologies for the late answer. On 16/10/2019 14:09, Oleksandr Grytsov wrote: > On Mon, Oct 14, 2019 at 12:28 PM Julien Grall <julien.grall@arm.com> wrote: > Thanks to point me out for this old thread. I completely forgot about it > (I haven't worked with xen since long time). I've performed additional > investigation > and found the root cause of the issue. It doesn't relate to iomem GFN directly. > The problem is in type from json parsing at place where libxl creates array of > struct. > > For example, libxl_domain_config_from_json calls libxl_domain_config_init > which initializes all child structures and arrays. But then when libxl parses > json and creates the array of structure, it doesn't initialize array elements > properly (see libxl__domain_build_info_parse_json iomem parsing): > > p->num_iomem = x->u.array->count; > p->iomem = libxl__calloc(NOGC, p->num_iomem, sizeof(*p->iomem)); > if (!p->iomem && p->num_iomem != 0) { > rc = -1; > goto out; > } > for (i=0; (t=libxl__json_array_get(x,i)); i++) { > rc = libxl__iomem_range_parse_json(gc, t, &p->iomem[i]); > if (rc) > goto out; > } > > libxl creates array element with calloc function, so all element > fields are initialized > with zero values. Even some of them have default value different from zero. > For these purpose dedicated init function should be called for each element. > Above example should be: > > for (i=0; (t=libxl__json_array_get(x,i)); i++) { > libxl_iomem_range_init(&p->iomem[i]); > rc = libxl__iomem_range_parse_json(gc, t, &p->iomem[i]); > if (rc) > goto out; > } Not initializing the values is fine as long as the JSON describes all the fields of the structure. The key point here is the GFN is not described in the JSON (see libxl_iomem_range_gen_json) if it is equal to LIBXL_INVALID_GFN. As the field is not described, it will be defaulted to 0. > > I've changes gentypes.py as following: > > diff --git a/tools/libxl/gentypes.py b/tools/libxl/gentypes.py > index 88e5c5f30e..92e28be469 100644 > --- a/tools/libxl/gentypes.py > +++ b/tools/libxl/gentypes.py > @@ -454,6 +454,8 @@ def libxl_C_type_parse_json(ty, w, v, indent = " > ", parent = None, discrimina > s += " goto out;\n" > s += " }\n" > s += " for (i=0; (t=libxl__json_array_get(x,i)); i++) {\n" > + if ty.elem_type.init_fn is not None and > ty.elem_type.autogenerate_init_fn: My knowledge of libxl is quite limited. But I don't think this is correct, you want to call init_fn whether this has been autogenerated or not. > + s += indent + " "+"%s_init(&%s[i]);\n" % > (ty.elem_type.typename, v) Looking at the other usage (like _libxl_C_type_init), init_fn is called with s += "%s(%s);\n" % (ty.init_fn, ty.pass_arg(v, parent is None)) I am also not entirely sure whether we should also cater the ty.init_val != None as well here. > s += libxl_C_type_parse_json(ty.elem_type, "t", v+"[i]", > indent + " ", parent) > s += " }\n" > > I'm not sure is it right and complete fix. > > Ian, could you review? > > If the fix is ok, I will submit the patch. IHMO, the idea is there. The code may need some modifications (see above). Please post a patch so we can go forward in the process to review it. Cheers,
On Tue, Oct 22, 2019 at 9:14 PM Julien Grall <julien.grall@arm.com> wrote: > > Hi Oleksandr, > > Apologies for the late answer. > > On 16/10/2019 14:09, Oleksandr Grytsov wrote: > > On Mon, Oct 14, 2019 at 12:28 PM Julien Grall <julien.grall@arm.com> wrote: > > Thanks to point me out for this old thread. I completely forgot about it > > (I haven't worked with xen since long time). I've performed additional > > investigation > > and found the root cause of the issue. It doesn't relate to iomem GFN directly. > > The problem is in type from json parsing at place where libxl creates array of > > struct. > > > > For example, libxl_domain_config_from_json calls libxl_domain_config_init > > which initializes all child structures and arrays. But then when libxl parses > > json and creates the array of structure, it doesn't initialize array elements > > properly (see libxl__domain_build_info_parse_json iomem parsing): > > > > p->num_iomem = x->u.array->count; > > p->iomem = libxl__calloc(NOGC, p->num_iomem, sizeof(*p->iomem)); > > if (!p->iomem && p->num_iomem != 0) { > > rc = -1; > > goto out; > > } > > for (i=0; (t=libxl__json_array_get(x,i)); i++) { > > rc = libxl__iomem_range_parse_json(gc, t, &p->iomem[i]); > > if (rc) > > goto out; > > } > > > > libxl creates array element with calloc function, so all element > > fields are initialized > > with zero values. Even some of them have default value different from zero. > > For these purpose dedicated init function should be called for each element. > > Above example should be: > > > > for (i=0; (t=libxl__json_array_get(x,i)); i++) { > > libxl_iomem_range_init(&p->iomem[i]); > > rc = libxl__iomem_range_parse_json(gc, t, &p->iomem[i]); > > if (rc) > > goto out; > > } > > Not initializing the values is fine as long as the JSON describes all the fields > of the structure. > > The key point here is the GFN is not described in the JSON (see > libxl_iomem_range_gen_json) if it is equal to LIBXL_INVALID_GFN. As the field is > not described, it will be defaulted to 0. > Yes. So, either ..._gen_json should generate entries for default values or ..._parse_json should set missing entries to its default values. Second solution looks more correct. > > > > I've changes gentypes.py as following: > > > > diff --git a/tools/libxl/gentypes.py b/tools/libxl/gentypes.py > > index 88e5c5f30e..92e28be469 100644 > > --- a/tools/libxl/gentypes.py > > +++ b/tools/libxl/gentypes.py > > @@ -454,6 +454,8 @@ def libxl_C_type_parse_json(ty, w, v, indent = " > > ", parent = None, discrimina > > s += " goto out;\n" > > s += " }\n" > > s += " for (i=0; (t=libxl__json_array_get(x,i)); i++) {\n" > > + if ty.elem_type.init_fn is not None and > > ty.elem_type.autogenerate_init_fn: > > My knowledge of libxl is quite limited. But I don't think this is correct, you > want to call init_fn whether this has been autogenerated or not. > > > + s += indent + " "+"%s_init(&%s[i]);\n" % > > (ty.elem_type.typename, v) > > Looking at the other usage (like _libxl_C_type_init), init_fn is called with > > s += "%s(%s);\n" % (ty.init_fn, ty.pass_arg(v, parent is None)) > > I am also not entirely sure whether we should also cater the ty.init_val != None > as well here. > > > s += libxl_C_type_parse_json(ty.elem_type, "t", v+"[i]", > > indent + " ", parent) > > s += " }\n" > > > > I'm not sure is it right and complete fix. > > > > Ian, could you review? > > > > If the fix is ok, I will submit the patch. > > IHMO, the idea is there. The code may need some modifications (see above). > Please post a patch so we can go forward in the process to review it. Thanks. I will post the separate path. > > Cheers, > > -- > Julien Grall
diff --git a/tools/libxl/libxl_domain.c b/tools/libxl/libxl_domain.c index 9d0eb5aed1..0ae16a5b12 100644 --- a/tools/libxl/libxl_domain.c +++ b/tools/libxl/libxl_domain.c @@ -2120,6 +2120,15 @@ static void retrieve_domain_configuration_end(libxl__egc *egc, } } + /* reset IOMEM's GFN to initial value */ + { + int i; + + for (i = 0; i < d_config->b_info.num_iomem; i++) + if (d_config->b_info.iomem[i].gfn == 0) + d_config->b_info.iomem[i].gfn = LIBXL_INVALID_GFN; + } + /* Devices: disk, nic, vtpm, pcidev etc. */ /* The MERGE macro implements following logic: