Message ID | cover.1667324874.git.edvin.torok@citrix.com (mailing list archive) |
---|---|
Headers | show |
Series | xenctrl.ml: improve scalability of domain_getinfolist | expand |
> On 1 Nov 2022, at 17:59, Edwin Török <edvin.torok@citrix.com> wrote: > > > Edwin Török (2): > xenctrl.ml: make domain_getinfolist tail recursive > xenctrl: use larger chunksize in domain_getinfolist > > tools/ocaml/libs/xc/xenctrl.ml | 25 ++++++++++++++++++------- > 1 file changed, 18 insertions(+), 7 deletions(-) Acked-by: Christian Lindig <christian.lindig@citrix.com> > It was calling the Xen domainfolist hypercall N/2 times. > Optimize this such that it is called at most 2 times during normal use. > > Implement a tail recursive `rev_concat` equivalent to `concat |> rev`, > and use it instead of calling `@` multiple times. Are there any assurances about the order in elements returned by domain_getinfolist? I understand that the change maintains the current behaviour but are we even required to maintain that order? Because otherwise we could return the reverse list and save more work. — C
> On 2 Nov 2022, at 09:11, Christian Lindig <christian.lindig@citrix.com> wrote: > > > >> On 1 Nov 2022, at 17:59, Edwin Török <edvin.torok@citrix.com> wrote: >> >> >> Edwin Török (2): >> xenctrl.ml: make domain_getinfolist tail recursive >> xenctrl: use larger chunksize in domain_getinfolist >> >> tools/ocaml/libs/xc/xenctrl.ml | 25 ++++++++++++++++++------- >> 1 file changed, 18 insertions(+), 7 deletions(-) > > Acked-by: Christian Lindig <christian.lindig@citrix.com> > > >> It was calling the Xen domainfolist hypercall N/2 times. >> Optimize this such that it is called at most 2 times during normal use. >> >> Implement a tail recursive `rev_concat` equivalent to `concat |> rev`, >> and use it instead of calling `@` multiple times. > > Are there any assurances about the order in elements returned by domain_getinfolist? I understand that the change maintains the current behaviour but are we even required to maintain that order? Because otherwise we could return the reverse list and save more work. Maintaining the current (reversed) order in the C stubs is useful to be able to extract the largest domid so we know where to continue. However we don't necessarily have to maintain the order on the higher level function available in the .mli, it just happens that `rev_concat` gives us the results in the order that we want. Although perhaps using an array rather a list here might be better, and we could move resizing/reallocing that array into the C stub, and have just a nicer (and perhaps faster) interface in the C/OCaml API, by producing less garbage than lists (a single Array.t allocation rather than N list elements). (Although I don't usually like pushing complexity like that into C stubs, for now I'm happy with the performance of the code as is). There are more issues in these C stubs in xenctrl, I'll be pouring over the code and try to send a few more patches. Best regards, --Edwin
> On 2 Nov 2022, at 09:11, Christian Lindig <christian.lindig@citrix.com> wrote: > > > >> On 1 Nov 2022, at 17:59, Edwin Török <edvin.torok@citrix.com> wrote: >> >> >> Edwin Török (2): >> xenctrl.ml: make domain_getinfolist tail recursive >> xenctrl: use larger chunksize in domain_getinfolist >> >> tools/ocaml/libs/xc/xenctrl.ml | 25 ++++++++++++++++++------- >> 1 file changed, 18 insertions(+), 7 deletions(-) > > Acked-by: Christian Lindig <christian.lindig@citrix.com> > > >> It was calling the Xen domainfolist hypercall N/2 times. >> Optimize this such that it is called at most 2 times during normal use. >> >> Implement a tail recursive `rev_concat` equivalent to `concat |> rev`, >> and use it instead of calling `@` multiple times. > > Are there any assurances about the order in elements returned by domain_getinfolist? I understand that the change maintains the current behaviour but are we even required to maintain that order? Because otherwise we could return the reverse list and save more work. After some discussion with Andrew Cooper apparently the xenctrl API is broken and cannot be used safely as is, so I'll be rewriting this patch. Domids can be assigned randomly, or toolstacks can set a custom domid, so it is not guaranteed that new domids always show up as higher numbers, and the only safe way to use infolist is to either request 1 domid, or request them all (domids are 15-bit, so 32768). Anything else is prone to race conditions and might give you an inconsistent snapshot. This is probably better fixed by changing the prototype of the C function in xenctrl to not take a count to force updating all callers (the callers in the 'xl' toolstack are just as buggy as the one in this OCaml binding, and probably better to fix the API in xenctrl than working around the race condition in each caller). Stay tuned for more patches... Best regards, --Edwin
Hi, To be clear, what is presented here are clear improvements in the status quo, and qualify for inclusion on their own merits. And definitely should be considered. That said, this is a swamp with future problems, and one rather fundamental one in Xen which I is not going to be easy to fix. 1) (simple), there are a bunch of stubs, including stub_xc_domain_getinfo() which don't use caml_{enter,leave}_blocking_section() when they should. 2) stub_xc_domain_getinfo() reuses xc_domain_getinfolist() meaning that it uses XEN_SYSCTL_getdomaininfolist rather than XEN_DOMCTL_getdomaininfo, which is a problem because... 3) While both of these hypercalls have crazy APIs leading to loads of broken callers, at least the DOMCTL has a fastpath for when you specify a valid domid. The SYSCTL (and DOMCTL slowpath) is an O(N) search of all domains starting from d0 to find the first domain with an id >= the input request. The DOMCTL slowpath is useless. Not a single caller (I've ever encountered) wants that behaviour, whereas I've needed to bugfix caller which didn't have an "&& info.domid == domid" to have one, to get the semantics they wanted. Cleaning this up will be a good improvement. 4) The (adjusted) algorithm in patch 1 still loops until it gets a result with no entries. Meaning that it's still going to make one hypercall too many, searching the entire domlist, to return no data. In principle you could optimise to stop at any hypercall which returns fewer than the requested number of domains, except... 5) ... if you ever use more than a single hypercall, Xen has dropped and re-acquired the domlist read lock, meaning that you can't use the result anyway. Causality couldn't be broken when domains were allocated sequentially, but we have a random allocation mode now so you can observe things out of order. The only safe action is to ask for all 32k domains in a single go, and use a single hypercall. ~Andrew