mbox series

[0/7] Rationalize usage of xc_domain_getinfo{,list}()

Message ID 20230426145932.3340-1-alejandro.vallejo@cloud.com (mailing list archive)
Headers show
Series Rationalize usage of xc_domain_getinfo{,list}() | expand

Message

Alejandro Vallejo April 26, 2023, 2:59 p.m. UTC
xc_domain_getinfo() returns the list of domains with domid >= first_domid.
It does so by repeatedly invoking XEN_DOMCTL_getdomaininfo, which leads to
unintuitive behaviour (asking for domid=1 might succeed returning domid=2).
Furthermore, N hypercalls are required whereas the equivalent functionality
can be achieved using with XEN_SYSCTL_getdomaininfo.

Ideally, we want a DOMCTL interface that operates over a single precisely
specified domain and a SYSCTL interface that can be used for bulk queries.

All callers of xc_domain_getinfo() that are better off using SYSCTL are
migrated to use that instead. That includes callers performing domain
discovery and those requesting info for more than 1 domain per hypercall.

A new xc_domain_getinfo_single() is introduced with stricter semantics than
xc_domain_getinfo() (failing if domid isn't found) to migrate the rest to.

With no callers left the xc_dominfo_t structure and the xc_domain_getinfo()
call itself can be cleanly removed, and the DOMCTL interface simplified to
only use its fastpath.

With the DOMCTL ammended, the new xc_domain_getinfo_single() drops its
stricter check, becoming a simple wrapper to invoke the hypercall itself.

Alejandro Vallejo (7):
  tools: Make some callers of xc_domain_getinfo use
    xc_domain_getinfolist
  tools: Create xc_domain_getinfo_single()
  tools: Refactor the console/io.c to avoid using xc_domain_getinfo()
  tools: Make init-xenstore-domain use xc_domain_getinfolist()
  tools: Modify single-domid callers of xc_domain_getinfolist
  tools: Use new xc function for some xc_domain_getinfo calls
  domctl: Modify getdomaininfo to fail if domid is not found

 tools/console/client/main.c             |  7 +--
 tools/console/daemon/io.c               | 31 +++++-----
 tools/debugger/kdd/kdd-xen.c            |  6 +-
 tools/helpers/init-xenstore-domain.c    | 14 +++--
 tools/include/xenctrl.h                 | 63 ++++++++------------
 tools/libs/ctrl/xc_domain.c             | 79 +++++--------------------
 tools/libs/ctrl/xc_pagetab.c            |  7 +--
 tools/libs/ctrl/xc_private.c            |  7 +--
 tools/libs/ctrl/xc_private.h            |  6 +-
 tools/libs/guest/xg_core.c              | 21 +++----
 tools/libs/guest/xg_core.h              |  6 +-
 tools/libs/guest/xg_core_arm.c          | 10 ++--
 tools/libs/guest/xg_core_x86.c          | 18 +++---
 tools/libs/guest/xg_cpuid_x86.c         | 28 +++++----
 tools/libs/guest/xg_dom_boot.c          | 12 +---
 tools/libs/guest/xg_domain.c            |  6 +-
 tools/libs/guest/xg_offline_page.c      | 10 ++--
 tools/libs/guest/xg_private.h           |  1 +
 tools/libs/guest/xg_resume.c            | 17 +++---
 tools/libs/guest/xg_sr_common.h         |  2 +-
 tools/libs/guest/xg_sr_restore.c        | 14 ++---
 tools/libs/guest/xg_sr_restore_x86_pv.c |  2 +-
 tools/libs/guest/xg_sr_save.c           | 26 ++++----
 tools/libs/guest/xg_sr_save_x86_pv.c    |  6 +-
 tools/libs/light/libxl_dom.c            | 15 ++---
 tools/libs/light/libxl_dom_suspend.c    |  7 +--
 tools/libs/light/libxl_domain.c         | 12 ++--
 tools/libs/light/libxl_mem.c            |  4 +-
 tools/libs/light/libxl_sched.c          | 28 ++++-----
 tools/libs/light/libxl_x86_acpi.c       |  4 +-
 tools/misc/xen-hvmcrash.c               |  6 +-
 tools/misc/xen-lowmemd.c                |  6 +-
 tools/misc/xen-mfndump.c                | 22 +++----
 tools/misc/xen-vmtrace.c                |  6 +-
 tools/python/xen/lowlevel/xc/xc.c       | 29 ++++-----
 tools/vchan/vchan-socket-proxy.c        |  6 +-
 tools/xenmon/xenbaked.c                 |  6 +-
 tools/xenpaging/xenpaging.c             | 14 ++---
 tools/xenstore/xenstored_domain.c       | 15 +++--
 tools/xentrace/xenctx.c                 |  8 +--
 xen/common/domctl.c                     | 32 +---------
 41 files changed, 245 insertions(+), 374 deletions(-)

Comments

Andrew Cooper April 26, 2023, 10:30 p.m. UTC | #1
On 26/04/2023 3:59 pm, Alejandro Vallejo wrote:
> xc_domain_getinfo() returns the list of domains with domid >= first_domid.
> It does so by repeatedly invoking XEN_DOMCTL_getdomaininfo, which leads to
> unintuitive behaviour (asking for domid=1 might succeed returning domid=2).
> Furthermore, N hypercalls are required whereas the equivalent functionality
> can be achieved using with XEN_SYSCTL_getdomaininfo.
>
> Ideally, we want a DOMCTL interface that operates over a single precisely
> specified domain and a SYSCTL interface that can be used for bulk queries.
>
> All callers of xc_domain_getinfo() that are better off using SYSCTL are
> migrated to use that instead. That includes callers performing domain
> discovery and those requesting info for more than 1 domain per hypercall.
>
> A new xc_domain_getinfo_single() is introduced with stricter semantics than
> xc_domain_getinfo() (failing if domid isn't found) to migrate the rest to.
>
> With no callers left the xc_dominfo_t structure and the xc_domain_getinfo()
> call itself can be cleanly removed, and the DOMCTL interface simplified to
> only use its fastpath.
>
> With the DOMCTL ammended, the new xc_domain_getinfo_single() drops its
> stricter check, becoming a simple wrapper to invoke the hypercall itself.
>
> Alejandro Vallejo (7):
>   tools: Make some callers of xc_domain_getinfo use
>     xc_domain_getinfolist
>   tools: Create xc_domain_getinfo_single()
>   tools: Refactor the console/io.c to avoid using xc_domain_getinfo()
>   tools: Make init-xenstore-domain use xc_domain_getinfolist()
>   tools: Modify single-domid callers of xc_domain_getinfolist
>   tools: Use new xc function for some xc_domain_getinfo calls
>   domctl: Modify getdomaininfo to fail if domid is not found

The patchew run found one single failure,

https://gitlab.com/xen-project/patchew/xen/-/jobs/4183881202

This part looks reasonably fatal:

 * Starting local ... *   Executing "/etc/local.d/xen.start" ...Starting
/usr/local/sbin/xenstored...
/etc/xen/scripts/launch-xenstore: line 90: echo: write error: Invalid
argument
Setting domain 0 name, domid and JSON config...
Done setting up Dom0
Starting xenconsoled...

except it was only the part trying to set the OOM score after starting
xenstored, and the only way that plausibly fails is if the pidfile was
bad.  And then the other print messages are clearly out of order.

I've rerun the pipeline a second time,
https://gitlab.com/xen-project/patchew/xen/-/pipelines/850230144, to see
if gitlab thinks it is a reliable or unreliable failure.


But, there's a plenty of other stuff in this log which is concerning. 
Stefano, Michal:

 * Starting networking ...awk: /etc/network/interfaces: No such file or
directory
 * ERROR: networking failed to start

The domains ought to have a interfaces file with "auto eth0", or even
nothing.  Alpine clearly isn't expecting the absence of the file at
all.  The fact the ping test passes usually means that this error isn't
as fatal as it makes out.

Next,

 * Executing: /lib/rc/sh/openrc-run.sh /lib/rc/sh/openrc-run.sh
/etc/init.d/modloop start
 * Mounting modloop  ... [ !! ]
 * ERROR: modloop failed to start

Not sure what modloop is, but this doesn't look healthy.

Next,

 * Loading modules ... *   Processing /etc/modules
modprobe: can't change directory to '/lib/modules': No such file or
directory

This probably just wants an empty dir in the filesystem.

I could go on, but I wont.  One thing that we do need however is
/var/log/xen/* pulled into the artefacts of the job, because if there
really is a real xenstored problem hiding in this series, there's no way
to debug it with the current artefacts collected.

~Andrew
Andrew Cooper April 27, 2023, 9:26 a.m. UTC | #2
On 26/04/2023 11:30 pm, Andrew Cooper wrote:
> I've rerun the pipeline a second time,
> https://gitlab.com/xen-project/patchew/xen/-/pipelines/850230144, to see
> if gitlab thinks it is a reliable or unreliable failure.

It passed the second time, so I'm pretty confident this is a buggy test
with a rare race condition, rather than a bug in the series under test.

~Andrew