Message ID | 20240404140833.1557953-1-ross.lagerwall@citrix.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | xen-hvm: Avoid livelock while handling buffered ioreqs | expand |
On 04/04/2024 15:08, Ross Lagerwall wrote: > A malicious or buggy guest may generated buffered ioreqs faster than > QEMU can process them in handle_buffered_iopage(). The result is a > livelock - QEMU continuously processes ioreqs on the main thread without > iterating through the main loop which prevents handling other events, > processing timers, etc. Without QEMU handling other events, it often > results in the guest becoming unsable and makes it difficult to stop the > source of buffered ioreqs. > > To avoid this, if we process a full page of buffered ioreqs, stop and > reschedule an immediate timer to continue processing them. This lets > QEMU go back to the main loop and catch up. > Do PV backends potentially cause the same scheduling issue (if not using io threads)? > Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> > --- > hw/xen/xen-hvm-common.c | 26 +++++++++++++++++--------- > 1 file changed, 17 insertions(+), 9 deletions(-) > Reviewed-by: Paul Durrant <paul@xen.org>
On Sat, Apr 6, 2024 at 11:58 AM Durrant, Paul <xadimgnik@gmail.com> wrote: > > On 04/04/2024 15:08, Ross Lagerwall wrote: > > A malicious or buggy guest may generated buffered ioreqs faster than > > QEMU can process them in handle_buffered_iopage(). The result is a > > livelock - QEMU continuously processes ioreqs on the main thread without > > iterating through the main loop which prevents handling other events, > > processing timers, etc. Without QEMU handling other events, it often > > results in the guest becoming unsable and makes it difficult to stop the > > source of buffered ioreqs. > > > > To avoid this, if we process a full page of buffered ioreqs, stop and > > reschedule an immediate timer to continue processing them. This lets > > QEMU go back to the main loop and catch up. > > > > Do PV backends potentially cause the same scheduling issue (if not using > io threads)? > From what I can tell: xen-block: It reads req_prod / req_cons once before entering the loop so it should be fine, I think. xen_console: Same as xen-block xen_nic: It reads req_prod / req_cons once before entering the loop. However, once the loop ends it checks for more requests and if there are more requests it restarts from the beginning. It seems like this could be susceptible to the same issue. (These PV backends generally aren't used by XenServer's system QEMU so I didn't spend too much time looking into it.) Thanks, Ross
On 08/04/2024 14:00, Ross Lagerwall wrote: > On Sat, Apr 6, 2024 at 11:58 AM Durrant, Paul <xadimgnik@gmail.com> wrote: >> >> On 04/04/2024 15:08, Ross Lagerwall wrote: >>> A malicious or buggy guest may generated buffered ioreqs faster than >>> QEMU can process them in handle_buffered_iopage(). The result is a >>> livelock - QEMU continuously processes ioreqs on the main thread without >>> iterating through the main loop which prevents handling other events, >>> processing timers, etc. Without QEMU handling other events, it often >>> results in the guest becoming unsable and makes it difficult to stop the >>> source of buffered ioreqs. >>> >>> To avoid this, if we process a full page of buffered ioreqs, stop and >>> reschedule an immediate timer to continue processing them. This lets >>> QEMU go back to the main loop and catch up. >>> >> >> Do PV backends potentially cause the same scheduling issue (if not using >> io threads)? >> > > From what I can tell: > > xen-block: It reads req_prod / req_cons once before entering the loop > so it should be fine, I think. > > xen_console: Same as xen-block > > xen_nic: It reads req_prod / req_cons once before entering the loop. > However, once the loop ends it checks for more requests and if there > are more requests it restarts from the beginning. It seems like this > could be susceptible to the same issue. > > (These PV backends generally aren't used by XenServer's system QEMU > so I didn't spend too much time looking into it.) > > Thanks, Ok. Thanks for checking. Paul
On 04/04/2024 15:08, Ross Lagerwall wrote: > A malicious or buggy guest may generated buffered ioreqs faster than > QEMU can process them in handle_buffered_iopage(). The result is a > livelock - QEMU continuously processes ioreqs on the main thread without > iterating through the main loop which prevents handling other events, > processing timers, etc. Without QEMU handling other events, it often > results in the guest becoming unsable and makes it difficult to stop the > source of buffered ioreqs. > > To avoid this, if we process a full page of buffered ioreqs, stop and > reschedule an immediate timer to continue processing them. This lets > QEMU go back to the main loop and catch up. > > Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> > --- > hw/xen/xen-hvm-common.c | 26 +++++++++++++++++--------- > 1 file changed, 17 insertions(+), 9 deletions(-) > Reviewed-by: Paul Durrant <paul@xen.org>
On Thu, Apr 04, 2024 at 03:08:33PM +0100, Ross Lagerwall wrote: > diff --git a/hw/xen/xen-hvm-common.c b/hw/xen/xen-hvm-common.c > index 1627da739822..1116b3978938 100644 > --- a/hw/xen/xen-hvm-common.c > +++ b/hw/xen/xen-hvm-common.c > @@ -521,22 +521,30 @@ static bool handle_buffered_iopage(XenIOState *state) [...] > > static void handle_buffered_io(void *opaque) > { > + unsigned int handled; > XenIOState *state = opaque; > > - if (handle_buffered_iopage(state)) { > + handled = handle_buffered_iopage(state); > + if (handled >= IOREQ_BUFFER_SLOT_NUM) { > + /* We handled a full page of ioreqs. Schedule a timer to continue > + * processing while giving other stuff a chance to run. > + */ ./scripts/checkpatch.pl report a style issue here: WARNING: Block comments use a leading /* on a separate line I can try to remember to fix that on commit. > timer_mod(state->buffered_io_timer, > - BUFFER_IO_MAX_DELAY + qemu_clock_get_ms(QEMU_CLOCK_REALTIME)); > - } else { > + qemu_clock_get_ms(QEMU_CLOCK_REALTIME)); > + } else if (handled == 0) { Just curious, why did you check for `handled == 0` here instead of `handled != 0`? That would have avoided to invert the last 2 cases, and the patch would just have introduce a new case without changing the order of the existing ones. But not that important I guess. > timer_del(state->buffered_io_timer); > qemu_xen_evtchn_unmask(state->xce_handle, state->bufioreq_local_port); > + } else { > + timer_mod(state->buffered_io_timer, > + BUFFER_IO_MAX_DELAY + qemu_clock_get_ms(QEMU_CLOCK_REALTIME)); > } > } Cheers,
On Tue, Apr 9, 2024 at 11:20 AM Anthony PERARD <anthony.perard@cloud.com> wrote: > > On Thu, Apr 04, 2024 at 03:08:33PM +0100, Ross Lagerwall wrote: > > diff --git a/hw/xen/xen-hvm-common.c b/hw/xen/xen-hvm-common.c > > index 1627da739822..1116b3978938 100644 > > --- a/hw/xen/xen-hvm-common.c > > +++ b/hw/xen/xen-hvm-common.c > > @@ -521,22 +521,30 @@ static bool handle_buffered_iopage(XenIOState *state) > [...] > > > > static void handle_buffered_io(void *opaque) > > { > > + unsigned int handled; > > XenIOState *state = opaque; > > > > - if (handle_buffered_iopage(state)) { > > + handled = handle_buffered_iopage(state); > > + if (handled >= IOREQ_BUFFER_SLOT_NUM) { > > + /* We handled a full page of ioreqs. Schedule a timer to continue > > + * processing while giving other stuff a chance to run. > > + */ > > ./scripts/checkpatch.pl report a style issue here: > WARNING: Block comments use a leading /* on a separate line > > I can try to remember to fix that on commit. I copied the comment style from a few lines above but I guess it was wrong. > > > timer_mod(state->buffered_io_timer, > > - BUFFER_IO_MAX_DELAY + qemu_clock_get_ms(QEMU_CLOCK_REALTIME)); > > - } else { > > + qemu_clock_get_ms(QEMU_CLOCK_REALTIME)); > > + } else if (handled == 0) { > > Just curious, why did you check for `handled == 0` here instead of > `handled != 0`? That would have avoided to invert the last 2 cases, and > the patch would just have introduce a new case without changing the > order of the existing ones. But not that important I guess. > In general I try to use conditionals with the least amount of negation since I think it is easier to read. I can change it if you would prefer? Ross
On Tue, 9 Apr 2024 at 15:20, Ross Lagerwall <ross.lagerwall@citrix.com> wrote: > > On Tue, Apr 9, 2024 at 11:20 AM Anthony PERARD <anthony.perard@cloud.com> wrote: > > > > On Thu, Apr 04, 2024 at 03:08:33PM +0100, Ross Lagerwall wrote: > > > diff --git a/hw/xen/xen-hvm-common.c b/hw/xen/xen-hvm-common.c > > > index 1627da739822..1116b3978938 100644 > > > --- a/hw/xen/xen-hvm-common.c > > > +++ b/hw/xen/xen-hvm-common.c > > > @@ -521,22 +521,30 @@ static bool handle_buffered_iopage(XenIOState *state) > > [...] > > > > > > static void handle_buffered_io(void *opaque) > > > { > > > + unsigned int handled; > > > XenIOState *state = opaque; > > > > > > - if (handle_buffered_iopage(state)) { > > > + handled = handle_buffered_iopage(state); > > > + if (handled >= IOREQ_BUFFER_SLOT_NUM) { > > > + /* We handled a full page of ioreqs. Schedule a timer to continue > > > + * processing while giving other stuff a chance to run. > > > + */ > > > > ./scripts/checkpatch.pl report a style issue here: > > WARNING: Block comments use a leading /* on a separate line > > > > I can try to remember to fix that on commit. > > I copied the comment style from a few lines above but I guess it was > wrong. Yes, this is one of those cases where we've settled on a style choice but there's still quite a lot of older code in the codebase that doesn't adhere to it. Checkpatch usually will catch this kind of nit for you. thanks -- PMM
On Tue, Apr 9, 2024 at 3:19 PM Ross Lagerwall <ross.lagerwall@citrix.com> wrote: > > On Tue, Apr 9, 2024 at 11:20 AM Anthony PERARD <anthony.perard@cloud.com> wrote: > > > > On Thu, Apr 04, 2024 at 03:08:33PM +0100, Ross Lagerwall wrote: > > > diff --git a/hw/xen/xen-hvm-common.c b/hw/xen/xen-hvm-common.c > > > index 1627da739822..1116b3978938 100644 > > > --- a/hw/xen/xen-hvm-common.c > > > +++ b/hw/xen/xen-hvm-common.c > > > @@ -521,22 +521,30 @@ static bool handle_buffered_iopage(XenIOState *state) > > [...] > > > > > > static void handle_buffered_io(void *opaque) > > > { > > > + unsigned int handled; > > > XenIOState *state = opaque; > > > > > > - if (handle_buffered_iopage(state)) { > > > + handled = handle_buffered_iopage(state); > > > + if (handled >= IOREQ_BUFFER_SLOT_NUM) { > > > + /* We handled a full page of ioreqs. Schedule a timer to continue > > > + * processing while giving other stuff a chance to run. > > > + */ > > > > ./scripts/checkpatch.pl report a style issue here: > > WARNING: Block comments use a leading /* on a separate line > > > > I can try to remember to fix that on commit. > > I copied the comment style from a few lines above but I guess it was > wrong. > > > > > > timer_mod(state->buffered_io_timer, > > > - BUFFER_IO_MAX_DELAY + qemu_clock_get_ms(QEMU_CLOCK_REALTIME)); > > > - } else { > > > + qemu_clock_get_ms(QEMU_CLOCK_REALTIME)); > > > + } else if (handled == 0) { > > > > Just curious, why did you check for `handled == 0` here instead of > > `handled != 0`? That would have avoided to invert the last 2 cases, and > > the patch would just have introduce a new case without changing the > > order of the existing ones. But not that important I guess. > > > > In general I try to use conditionals with the least amount of negation > since I think it is easier to read. I can change it if you would prefer? It looks like this hasn't been committed anywhere. Were you expecting an updated version with the style issue fixed or has it fallen through the cracks? Thanks, Ross
diff --git a/hw/xen/xen-hvm-common.c b/hw/xen/xen-hvm-common.c index 1627da739822..1116b3978938 100644 --- a/hw/xen/xen-hvm-common.c +++ b/hw/xen/xen-hvm-common.c @@ -463,11 +463,11 @@ static void handle_ioreq(XenIOState *state, ioreq_t *req) } } -static bool handle_buffered_iopage(XenIOState *state) +static unsigned int handle_buffered_iopage(XenIOState *state) { buffered_iopage_t *buf_page = state->buffered_io_page; buf_ioreq_t *buf_req = NULL; - bool handled_ioreq = false; + unsigned int handled = 0; ioreq_t req; int qw; @@ -480,7 +480,7 @@ static bool handle_buffered_iopage(XenIOState *state) req.count = 1; req.dir = IOREQ_WRITE; - for (;;) { + do { uint32_t rdptr = buf_page->read_pointer, wrptr; xen_rmb(); @@ -521,22 +521,30 @@ static bool handle_buffered_iopage(XenIOState *state) assert(!req.data_is_ptr); qatomic_add(&buf_page->read_pointer, qw + 1); - handled_ioreq = true; - } + handled += qw + 1; + } while (handled < IOREQ_BUFFER_SLOT_NUM); - return handled_ioreq; + return handled; } static void handle_buffered_io(void *opaque) { + unsigned int handled; XenIOState *state = opaque; - if (handle_buffered_iopage(state)) { + handled = handle_buffered_iopage(state); + if (handled >= IOREQ_BUFFER_SLOT_NUM) { + /* We handled a full page of ioreqs. Schedule a timer to continue + * processing while giving other stuff a chance to run. + */ timer_mod(state->buffered_io_timer, - BUFFER_IO_MAX_DELAY + qemu_clock_get_ms(QEMU_CLOCK_REALTIME)); - } else { + qemu_clock_get_ms(QEMU_CLOCK_REALTIME)); + } else if (handled == 0) { timer_del(state->buffered_io_timer); qemu_xen_evtchn_unmask(state->xce_handle, state->bufioreq_local_port); + } else { + timer_mod(state->buffered_io_timer, + BUFFER_IO_MAX_DELAY + qemu_clock_get_ms(QEMU_CLOCK_REALTIME)); } }
A malicious or buggy guest may generated buffered ioreqs faster than QEMU can process them in handle_buffered_iopage(). The result is a livelock - QEMU continuously processes ioreqs on the main thread without iterating through the main loop which prevents handling other events, processing timers, etc. Without QEMU handling other events, it often results in the guest becoming unsable and makes it difficult to stop the source of buffered ioreqs. To avoid this, if we process a full page of buffered ioreqs, stop and reschedule an immediate timer to continue processing them. This lets QEMU go back to the main loop and catch up. Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> --- hw/xen/xen-hvm-common.c | 26 +++++++++++++++++--------- 1 file changed, 17 insertions(+), 9 deletions(-)