Message ID | 20220505175131.81457-1-julien@xen.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | xen: io: Fix race between sending an I/O and domain shutdown | expand |
On 05/05/2022 18:51, Julien Grall wrote: > From: Julien Grall <jgrall@amazon.com> > > Xen provides hypercalls to shutdown (SCHEDOP_shutdown{,_code}) and > resume a domain (XEN_DOMCTL_resumedomain). They can be used for checkpoint > where the expectation is the domain should continue as nothing happened > afterwards. > > hvmemul_do_io() and handle_pio() will act differently if the return > code of hvm_send_ioreq() (resp. hvmemul_do_pio_buffer()) is X86EMUL_RETRY. > > In this case, the I/O state will be reset to STATE_IOREQ_NONE (i.e > no I/O is pending) and/or the PC will not be advanced. > > If the shutdown request happens right after the I/O was sent to the > IOREQ, then emulation code will end up to re-execute the instruction > and therefore forward again the same I/O (at least when reading IO port). > > This would be problem if the access has a side-effect. A dumb example, > is a device implementing a counter which is incremented by one for every > access. When running shutdown/resume in a loop, the value read by the > OS may not be the old value + 1. > > Add an extra boolean in the structure hvm_vcpu_io to indicate whether > the I/O was suspend. This is then used in place of checking the domain > is shutting down in hvmemul_do_io() and handle_pio() as they should > act on suspend (i.e. vcpu_start_shutdown_deferral() returns false) rather > than shutdown. > > Signed-off-by: Julien Grall <jgrall@amazon.com> Reviewed-by: Paul Durrant <paul@xen.org>
Hi, On 06/05/2022 15:09, Durrant, Paul wrote: > On 05/05/2022 18:51, Julien Grall wrote: >> From: Julien Grall <jgrall@amazon.com> >> >> Xen provides hypercalls to shutdown (SCHEDOP_shutdown{,_code}) and >> resume a domain (XEN_DOMCTL_resumedomain). They can be used for >> checkpoint >> where the expectation is the domain should continue as nothing happened >> afterwards. >> >> hvmemul_do_io() and handle_pio() will act differently if the return >> code of hvm_send_ioreq() (resp. hvmemul_do_pio_buffer()) is >> X86EMUL_RETRY. >> >> In this case, the I/O state will be reset to STATE_IOREQ_NONE (i.e >> no I/O is pending) and/or the PC will not be advanced. >> >> If the shutdown request happens right after the I/O was sent to the >> IOREQ, then emulation code will end up to re-execute the instruction >> and therefore forward again the same I/O (at least when reading IO port). >> >> This would be problem if the access has a side-effect. A dumb example, >> is a device implementing a counter which is incremented by one for every >> access. When running shutdown/resume in a loop, the value read by the >> OS may not be the old value + 1. >> >> Add an extra boolean in the structure hvm_vcpu_io to indicate whether >> the I/O was suspend. This is then used in place of checking the domain >> is shutting down in hvmemul_do_io() and handle_pio() as they should >> act on suspend (i.e. vcpu_start_shutdown_deferral() returns false) rather >> than shutdown. >> >> Signed-off-by: Julien Grall <jgrall@amazon.com> > > Reviewed-by: Paul Durrant <paul@xen.org> Thanks! I have committed it. Cheers,
diff --git a/xen/arch/arm/ioreq.c b/xen/arch/arm/ioreq.c index bdd536e873e5..1338c86adb43 100644 --- a/xen/arch/arm/ioreq.c +++ b/xen/arch/arm/ioreq.c @@ -92,10 +92,11 @@ enum io_state try_fwd_ioserv(struct cpu_user_regs *regs, p.data = get_user_reg(regs, info->dabt.reg); vio->req = p; + vio->suspended = false; vio->info.dabt_instr = instr; rc = ioreq_send(s, &p, 0); - if ( rc != IO_RETRY || v->domain->is_shutting_down ) + if ( rc != IO_RETRY || vio->suspended ) vio->req.state = STATE_IOREQ_NONE; else if ( !ioreq_needs_completion(&vio->req) ) rc = IO_HANDLED; diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c index e8d510e0be91..cb221f70e8f0 100644 --- a/xen/arch/x86/hvm/emulate.c +++ b/xen/arch/x86/hvm/emulate.c @@ -236,6 +236,7 @@ static int hvmemul_do_io( ASSERT(p.count); vio->req = p; + vio->suspended = false; rc = hvm_io_intercept(&p); @@ -331,7 +332,7 @@ static int hvmemul_do_io( else { rc = ioreq_send(s, &p, 0); - if ( rc != X86EMUL_RETRY || currd->is_shutting_down ) + if ( rc != X86EMUL_RETRY || vio->suspended ) vio->req.state = STATE_IOREQ_NONE; else if ( !ioreq_needs_completion(&vio->req) ) rc = X86EMUL_OKAY; diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c index f70bfde90143..0309d05cfdfc 100644 --- a/xen/arch/x86/hvm/io.c +++ b/xen/arch/x86/hvm/io.c @@ -138,10 +138,11 @@ bool handle_pio(uint16_t port, unsigned int size, int dir) case X86EMUL_RETRY: /* - * We should not advance RIP/EIP if the domain is shutting down or - * if X86EMUL_RETRY has been returned by an internal handler. + * We should not advance RIP/EIP if the vio was suspended (e.g. + * because the domain is shutting down) or if X86EMUL_RETRY has + * been returned by an internal handler. */ - if ( curr->domain->is_shutting_down || !vcpu_ioreq_pending(curr) ) + if ( vio->suspended || !vcpu_ioreq_pending(curr) ) return false; break; diff --git a/xen/common/ioreq.c b/xen/common/ioreq.c index 5c94e74293ce..4617aef29b7e 100644 --- a/xen/common/ioreq.c +++ b/xen/common/ioreq.c @@ -1251,6 +1251,7 @@ int ioreq_send(struct ioreq_server *s, ioreq_t *proto_p, struct vcpu *curr = current; struct domain *d = curr->domain; struct ioreq_vcpu *sv; + struct vcpu_io *vio = &curr->io; ASSERT(s); @@ -1258,7 +1259,10 @@ int ioreq_send(struct ioreq_server *s, ioreq_t *proto_p, return ioreq_send_buffered(s, proto_p); if ( unlikely(!vcpu_start_shutdown_deferral(curr)) ) + { + vio->suspended = true; return IOREQ_STATUS_RETRY; + } list_for_each_entry ( sv, &s->ioreq_vcpu_list, diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h index ed8539f6d297..6b724e7947c7 100644 --- a/xen/include/xen/sched.h +++ b/xen/include/xen/sched.h @@ -159,6 +159,11 @@ enum vio_completion { struct vcpu_io { /* I/O request in flight to device model. */ enum vio_completion completion; + /* + * Indicate whether the I/O was not handled because the domain + * is about to be paused. + */ + bool suspended; ioreq_t req; /* Arch specific info pertaining to the io request */ struct arch_vcpu_io info;