Message ID | 1425302944-6276-2-git-send-email-alex.bennee@linaro.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Hi Alex, The subject of this change has a typo, and I also think it's not about exposing the pause state (that's just an internal name/concept), but about exposing the PSCI state, or simply the VCPU power state. On Mon, Mar 02, 2015 at 01:29:00PM +0000, Alex Bennée wrote: > To cleanly restore an SMP VM we need to ensure that the current pause > state of each vcpu is correctly recorded. Things could get confused if > the CPU starts running after migration restore completes when it was > paused before it state was captured. > > We use the existing KVM_GET/SET_MP_STATE ioctl to do this. The arm/arm64 > interface is a lot simpler as the only valid states are > KVM_MP_STATE_RUNNABLE and KVM_MP_STATE_HALTED. > > Signed-off-by: Alex Bennée <alex.bennee@linaro.org> > > diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt > index b112efc..602156f 100644 > --- a/Documentation/virtual/kvm/api.txt > +++ b/Documentation/virtual/kvm/api.txt > @@ -997,7 +997,7 @@ for vm-wide capabilities. > 4.38 KVM_GET_MP_STATE > > Capability: KVM_CAP_MP_STATE > -Architectures: x86, s390 > +Architectures: x86, s390, arm, arm64 > Type: vcpu ioctl > Parameters: struct kvm_mp_state (out) > Returns: 0 on success; -1 on error > @@ -1027,15 +1027,21 @@ Possible values are: > - KVM_MP_STATE_LOAD: the vcpu is in a special load/startup state > [s390] > > -On x86, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an > -in-kernel irqchip, the multiprocessing state must be maintained by userspace on > +For x86: > + > +This ioctl is only useful after KVM_CREATE_IRQCHIP. Without an in-kernel > +irqchip, the multiprocessing state must be maintained by userspace on Nit: I would not taint the git log with this change but instead just introduce a paragraph below starting with "On arm/arm64, " and you would get the same effect. > these architectures. > > +For arm/arm64: > + > +The only states that are valid are KVM_MP_STATE_HALTED and > +KVM_MP_STATE_RUNNABLE which reflect if the vcpu is paused or not. As suggested on the QEMU series, HALTED is probably not the right thing to use. > > 4.39 KVM_SET_MP_STATE > > Capability: KVM_CAP_MP_STATE > -Architectures: x86, s390 > +Architectures: x86, s390, arm, arm64 > Type: vcpu ioctl > Parameters: struct kvm_mp_state (in) > Returns: 0 on success; -1 on error > @@ -1043,10 +1049,16 @@ Returns: 0 on success; -1 on error > Sets the vcpu's current "multiprocessing state"; see KVM_GET_MP_STATE for > arguments. > > -On x86, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an > -in-kernel irqchip, the multiprocessing state must be maintained by userspace on > +For x86: > + > +This ioctl is only useful after KVM_CREATE_IRQCHIP. Without an in-kernel > +irqchip, the multiprocessing state must be maintained by userspace on > these architectures. > > +For arm/arm64: > + > +The only states that are valid are KVM_MP_STATE_HALTED and > +KVM_MP_STATE_RUNNABLE which reflect if the vcpu should be paused or not. same as above > > 4.40 KVM_SET_IDENTITY_MAP_ADDR > > diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c > index 5560f74..8531536 100644 > --- a/arch/arm/kvm/arm.c > +++ b/arch/arm/kvm/arm.c > @@ -183,6 +183,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext) > case KVM_CAP_ARM_PSCI: > case KVM_CAP_ARM_PSCI_0_2: > case KVM_CAP_READONLY_MEM: > + case KVM_CAP_MP_STATE: > r = 1; > break; > case KVM_CAP_COALESCED_MMIO: > @@ -313,13 +314,29 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu, > int kvm_arch_vcpu_ioctl_get_mpstate(struct kvm_vcpu *vcpu, > struct kvm_mp_state *mp_state) > { > - return -EINVAL; > + if (vcpu->arch.pause) > + mp_state->mp_state = KVM_MP_STATE_HALTED; > + else > + mp_state->mp_state = KVM_MP_STATE_RUNNABLE; > + > + return 0; > } > > int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu, > struct kvm_mp_state *mp_state) > { > - return -EINVAL; > + switch (mp_state->mp_state) { > + case KVM_MP_STATE_RUNNABLE: > + vcpu->arch.pause = false; > + break; > + case KVM_MP_STATE_HALTED: > + vcpu->arch.pause = true; > + break; > + default: > + return -EINVAL; > + } > + > + return 0; > } > > /** Are we capturing the vcpu features in some way or do we expect userspace to migrate these on its own? The reason I'm asking, is if you create multiple VCPUs, where all the non-primary VCPUs are started in power off mode, and then you boot your guest which powers on all VCPUs, and then you restart your guest (with PSCI RESET), the system will not power on all the non-primary VCPUs but hold them in power-off. We need to make sure this behavior is preserved for a reboot across a migration. Is it? Thanks, -Christoffer
Christoffer Dall <christoffer.dall@linaro.org> writes: > Hi Alex, > > The subject of this change has a typo, and I also think it's not about > exposing the pause state (that's just an internal name/concept), but > about exposing the PSCI state, or simply the VCPU power state. arm: KVM: export VCPU power state via MP_STATE ioctl? > > On Mon, Mar 02, 2015 at 01:29:00PM +0000, Alex Bennée wrote: >> To cleanly restore an SMP VM we need to ensure that the current pause >> state of each vcpu is correctly recorded. Things could get confused if >> the CPU starts running after migration restore completes when it was >> paused before it state was captured. >> >> We use the existing KVM_GET/SET_MP_STATE ioctl to do this. The arm/arm64 >> interface is a lot simpler as the only valid states are >> KVM_MP_STATE_RUNNABLE and KVM_MP_STATE_HALTED. >> >> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> >> >> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt >> index b112efc..602156f 100644 >> --- a/Documentation/virtual/kvm/api.txt >> +++ b/Documentation/virtual/kvm/api.txt >> @@ -997,7 +997,7 @@ for vm-wide capabilities. >> 4.38 KVM_GET_MP_STATE >> >> Capability: KVM_CAP_MP_STATE >> -Architectures: x86, s390 >> +Architectures: x86, s390, arm, arm64 >> Type: vcpu ioctl >> Parameters: struct kvm_mp_state (out) >> Returns: 0 on success; -1 on error >> @@ -1027,15 +1027,21 @@ Possible values are: >> - KVM_MP_STATE_LOAD: the vcpu is in a special load/startup state >> [s390] >> >> -On x86, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an >> -in-kernel irqchip, the multiprocessing state must be maintained by userspace on >> +For x86: >> + >> +This ioctl is only useful after KVM_CREATE_IRQCHIP. Without an in-kernel >> +irqchip, the multiprocessing state must be maintained by userspace on > > Nit: I would not taint the git log with this change but instead just > introduce a paragraph below starting with "On arm/arm64, " and you would > get the same effect. > >> these architectures. >> >> +For arm/arm64: >> + >> +The only states that are valid are KVM_MP_STATE_HALTED and >> +KVM_MP_STATE_RUNNABLE which reflect if the vcpu is paused or not. > > As suggested on the QEMU series, HALTED is probably not the right thing > to use. KVM_MP_STATE_STOPPED is currently only used for s390 but seems to fit. I'm wary of adding yet another define. > >> >> 4.39 KVM_SET_MP_STATE >> >> Capability: KVM_CAP_MP_STATE >> -Architectures: x86, s390 >> +Architectures: x86, s390, arm, arm64 >> Type: vcpu ioctl >> Parameters: struct kvm_mp_state (in) >> Returns: 0 on success; -1 on error >> @@ -1043,10 +1049,16 @@ Returns: 0 on success; -1 on error >> Sets the vcpu's current "multiprocessing state"; see KVM_GET_MP_STATE for >> arguments. >> >> -On x86, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an >> -in-kernel irqchip, the multiprocessing state must be maintained by userspace on >> +For x86: >> + >> +This ioctl is only useful after KVM_CREATE_IRQCHIP. Without an in-kernel >> +irqchip, the multiprocessing state must be maintained by userspace on >> these architectures. >> >> +For arm/arm64: >> + >> +The only states that are valid are KVM_MP_STATE_HALTED and >> +KVM_MP_STATE_RUNNABLE which reflect if the vcpu should be paused or not. > > same as above > >> >> 4.40 KVM_SET_IDENTITY_MAP_ADDR >> >> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c >> index 5560f74..8531536 100644 >> --- a/arch/arm/kvm/arm.c >> +++ b/arch/arm/kvm/arm.c >> @@ -183,6 +183,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext) >> case KVM_CAP_ARM_PSCI: >> case KVM_CAP_ARM_PSCI_0_2: >> case KVM_CAP_READONLY_MEM: >> + case KVM_CAP_MP_STATE: >> r = 1; >> break; >> case KVM_CAP_COALESCED_MMIO: >> @@ -313,13 +314,29 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu, >> int kvm_arch_vcpu_ioctl_get_mpstate(struct kvm_vcpu *vcpu, >> struct kvm_mp_state *mp_state) >> { >> - return -EINVAL; >> + if (vcpu->arch.pause) >> + mp_state->mp_state = KVM_MP_STATE_HALTED; >> + else >> + mp_state->mp_state = KVM_MP_STATE_RUNNABLE; >> + >> + return 0; >> } >> >> int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu, >> struct kvm_mp_state *mp_state) >> { >> - return -EINVAL; >> + switch (mp_state->mp_state) { >> + case KVM_MP_STATE_RUNNABLE: >> + vcpu->arch.pause = false; >> + break; >> + case KVM_MP_STATE_HALTED: >> + vcpu->arch.pause = true; >> + break; >> + default: >> + return -EINVAL; >> + } >> + >> + return 0; >> } >> >> /** > > Are we capturing the vcpu features in some way or do we expect userspace > to migrate these on its own? The reason I'm asking, is if you create > multiple VCPUs, where all the non-primary VCPUs are started in power off > mode, and then you boot your guest which powers on all VCPUs, and then > you restart your guest (with PSCI RESET), the system will not power on > all the non-primary VCPUs but hold them in power-off. > > We need to make sure this behavior is preserved for a reboot across a > migration. Is it? Isn't that behaviour orthogonal to the migration case? - Boot - Power on secondary CPUs - Power off one secondary CPU - Migrate to file (cpu_powered reflects state of each CPU) - Start fresh QEMU - Restore from file (cpu_powered -> vcpu->paused via ioctl) - Run (we continue with the same power state pre-migrate) - PSCI RESET - Does what it does, power all secondaries down? - Kernel boots, turns them on? > > Thanks, > -Christoffer
On Mon, Mar 09, 2015 at 04:34:21PM +0000, Alex Bennée wrote: > > Christoffer Dall <christoffer.dall@linaro.org> writes: > > > Hi Alex, > > > > The subject of this change has a typo, and I also think it's not about > > exposing the pause state (that's just an internal name/concept), but > > about exposing the PSCI state, or simply the VCPU power state. > > arm: KVM: export VCPU power state via MP_STATE ioctl? > arm/arm64: KVM: otherwise looks good to me ;) > > > > On Mon, Mar 02, 2015 at 01:29:00PM +0000, Alex Bennée wrote: > >> To cleanly restore an SMP VM we need to ensure that the current pause > >> state of each vcpu is correctly recorded. Things could get confused if > >> the CPU starts running after migration restore completes when it was > >> paused before it state was captured. > >> > >> We use the existing KVM_GET/SET_MP_STATE ioctl to do this. The arm/arm64 > >> interface is a lot simpler as the only valid states are > >> KVM_MP_STATE_RUNNABLE and KVM_MP_STATE_HALTED. > >> > >> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> > >> > >> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt > >> index b112efc..602156f 100644 > >> --- a/Documentation/virtual/kvm/api.txt > >> +++ b/Documentation/virtual/kvm/api.txt > >> @@ -997,7 +997,7 @@ for vm-wide capabilities. > >> 4.38 KVM_GET_MP_STATE > >> > >> Capability: KVM_CAP_MP_STATE > >> -Architectures: x86, s390 > >> +Architectures: x86, s390, arm, arm64 > >> Type: vcpu ioctl > >> Parameters: struct kvm_mp_state (out) > >> Returns: 0 on success; -1 on error > >> @@ -1027,15 +1027,21 @@ Possible values are: > >> - KVM_MP_STATE_LOAD: the vcpu is in a special load/startup state > >> [s390] > >> > >> -On x86, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an > >> -in-kernel irqchip, the multiprocessing state must be maintained by userspace on > >> +For x86: > >> + > >> +This ioctl is only useful after KVM_CREATE_IRQCHIP. Without an in-kernel > >> +irqchip, the multiprocessing state must be maintained by userspace on > > > > Nit: I would not taint the git log with this change but instead just > > introduce a paragraph below starting with "On arm/arm64, " and you would > > get the same effect. > > > >> these architectures. > >> > >> +For arm/arm64: > >> + > >> +The only states that are valid are KVM_MP_STATE_HALTED and > >> +KVM_MP_STATE_RUNNABLE which reflect if the vcpu is paused or not. > > > > As suggested on the QEMU series, HALTED is probably not the right thing > > to use. > > KVM_MP_STATE_STOPPED is currently only used for s390 but seems to fit. > I'm wary of adding yet another define. > sounds fine, as long as it doesn't have some inherently different meaning with s390. > > > >> > >> 4.39 KVM_SET_MP_STATE > >> > >> Capability: KVM_CAP_MP_STATE > >> -Architectures: x86, s390 > >> +Architectures: x86, s390, arm, arm64 > >> Type: vcpu ioctl > >> Parameters: struct kvm_mp_state (in) > >> Returns: 0 on success; -1 on error > >> @@ -1043,10 +1049,16 @@ Returns: 0 on success; -1 on error > >> Sets the vcpu's current "multiprocessing state"; see KVM_GET_MP_STATE for > >> arguments. > >> > >> -On x86, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an > >> -in-kernel irqchip, the multiprocessing state must be maintained by userspace on > >> +For x86: > >> + > >> +This ioctl is only useful after KVM_CREATE_IRQCHIP. Without an in-kernel > >> +irqchip, the multiprocessing state must be maintained by userspace on > >> these architectures. > >> > >> +For arm/arm64: > >> + > >> +The only states that are valid are KVM_MP_STATE_HALTED and > >> +KVM_MP_STATE_RUNNABLE which reflect if the vcpu should be paused or not. > > > > same as above > > > >> > >> 4.40 KVM_SET_IDENTITY_MAP_ADDR > >> > >> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c > >> index 5560f74..8531536 100644 > >> --- a/arch/arm/kvm/arm.c > >> +++ b/arch/arm/kvm/arm.c > >> @@ -183,6 +183,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext) > >> case KVM_CAP_ARM_PSCI: > >> case KVM_CAP_ARM_PSCI_0_2: > >> case KVM_CAP_READONLY_MEM: > >> + case KVM_CAP_MP_STATE: > >> r = 1; > >> break; > >> case KVM_CAP_COALESCED_MMIO: > >> @@ -313,13 +314,29 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu, > >> int kvm_arch_vcpu_ioctl_get_mpstate(struct kvm_vcpu *vcpu, > >> struct kvm_mp_state *mp_state) > >> { > >> - return -EINVAL; > >> + if (vcpu->arch.pause) > >> + mp_state->mp_state = KVM_MP_STATE_HALTED; > >> + else > >> + mp_state->mp_state = KVM_MP_STATE_RUNNABLE; > >> + > >> + return 0; > >> } > >> > >> int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu, > >> struct kvm_mp_state *mp_state) > >> { > >> - return -EINVAL; > >> + switch (mp_state->mp_state) { > >> + case KVM_MP_STATE_RUNNABLE: > >> + vcpu->arch.pause = false; > >> + break; > >> + case KVM_MP_STATE_HALTED: > >> + vcpu->arch.pause = true; > >> + break; > >> + default: > >> + return -EINVAL; > >> + } > >> + > >> + return 0; > >> } > >> > >> /** > > > > Are we capturing the vcpu features in some way or do we expect userspace > > to migrate these on its own? The reason I'm asking, is if you create > > multiple VCPUs, where all the non-primary VCPUs are started in power off > > mode, and then you boot your guest which powers on all VCPUs, and then > > you restart your guest (with PSCI RESET), the system will not power on > > all the non-primary VCPUs but hold them in power-off. > > > > We need to make sure this behavior is preserved for a reboot across a > > migration. Is it? > > Isn't that behaviour orthogonal to the migration case? > > - Boot > - Power on secondary CPUs > - Power off one secondary CPU > - Migrate to file (cpu_powered reflects state of each CPU) > > - Start fresh QEMU > - Restore from file (cpu_powered -> vcpu->paused via ioctl) > - Run (we continue with the same power state pre-migrate) > > - PSCI RESET > - Does what it does, power all secondaries down? > - Kernel boots, turns them on? > Hmmm. As long as QEMU always inits all VCPUs in the exact same way (including the KVM_ARM_VCPU_POWER_OFF feature bit) I guess it works and that's probably a reasonable requirement for migration. -Christoffer
On 10 March 2015 at 04:29, Christoffer Dall <christoffer.dall@linaro.org> wrote: > On Mon, Mar 09, 2015 at 04:34:21PM +0000, Alex Bennée wrote: >> - Boot >> - Power on secondary CPUs >> - Power off one secondary CPU >> - Migrate to file (cpu_powered reflects state of each CPU) >> >> - Start fresh QEMU >> - Restore from file (cpu_powered -> vcpu->paused via ioctl) >> - Run (we continue with the same power state pre-migrate) >> >> - PSCI RESET >> - Does what it does, power all secondaries down? >> - Kernel boots, turns them on? >> > Hmmm. As long as QEMU always inits all VCPUs in the exact same way > (including the KVM_ARM_VCPU_POWER_OFF feature bit) I guess it works and > that's probably a reasonable requirement for migration. We init the VCPUs with the POWER_OFF flag set for exactly the set of CPUs that we want to start powered off. Typically that means that the first CPU is inited with POWER_OFF clear and the rest are inited with it set. Regardless, if we're doing an incoming migration, then the incoming data will be used to set the VCPU on/off state appropriately for resuming the VM. (This might include powering on a VCPU that's been inited in power-off and never run, or powering down a VCPU that was inited power-on but never ran. In the 'migration moves a vcpu to powered-on' case we'll also set the vcpu's PC and other registers so when it is run it will DTRT.) If the resumed guest subsequently does a PSCI reset then QEMU will re-init the VCPUs with the set of feature bits that the machine model/etc defines for a reset condition for this board, which will be the same "first CPU powered on, all the rest powered off" config we started with. (It's the user's responsibility to ensure that when doing a migration the QEMUs at both ends are identically configured, incidentally.) -- PMM
diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt index b112efc..602156f 100644 --- a/Documentation/virtual/kvm/api.txt +++ b/Documentation/virtual/kvm/api.txt @@ -997,7 +997,7 @@ for vm-wide capabilities. 4.38 KVM_GET_MP_STATE Capability: KVM_CAP_MP_STATE -Architectures: x86, s390 +Architectures: x86, s390, arm, arm64 Type: vcpu ioctl Parameters: struct kvm_mp_state (out) Returns: 0 on success; -1 on error @@ -1027,15 +1027,21 @@ Possible values are: - KVM_MP_STATE_LOAD: the vcpu is in a special load/startup state [s390] -On x86, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an -in-kernel irqchip, the multiprocessing state must be maintained by userspace on +For x86: + +This ioctl is only useful after KVM_CREATE_IRQCHIP. Without an in-kernel +irqchip, the multiprocessing state must be maintained by userspace on these architectures. +For arm/arm64: + +The only states that are valid are KVM_MP_STATE_HALTED and +KVM_MP_STATE_RUNNABLE which reflect if the vcpu is paused or not. 4.39 KVM_SET_MP_STATE Capability: KVM_CAP_MP_STATE -Architectures: x86, s390 +Architectures: x86, s390, arm, arm64 Type: vcpu ioctl Parameters: struct kvm_mp_state (in) Returns: 0 on success; -1 on error @@ -1043,10 +1049,16 @@ Returns: 0 on success; -1 on error Sets the vcpu's current "multiprocessing state"; see KVM_GET_MP_STATE for arguments. -On x86, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an -in-kernel irqchip, the multiprocessing state must be maintained by userspace on +For x86: + +This ioctl is only useful after KVM_CREATE_IRQCHIP. Without an in-kernel +irqchip, the multiprocessing state must be maintained by userspace on these architectures. +For arm/arm64: + +The only states that are valid are KVM_MP_STATE_HALTED and +KVM_MP_STATE_RUNNABLE which reflect if the vcpu should be paused or not. 4.40 KVM_SET_IDENTITY_MAP_ADDR diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c index 5560f74..8531536 100644 --- a/arch/arm/kvm/arm.c +++ b/arch/arm/kvm/arm.c @@ -183,6 +183,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext) case KVM_CAP_ARM_PSCI: case KVM_CAP_ARM_PSCI_0_2: case KVM_CAP_READONLY_MEM: + case KVM_CAP_MP_STATE: r = 1; break; case KVM_CAP_COALESCED_MMIO: @@ -313,13 +314,29 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu, int kvm_arch_vcpu_ioctl_get_mpstate(struct kvm_vcpu *vcpu, struct kvm_mp_state *mp_state) { - return -EINVAL; + if (vcpu->arch.pause) + mp_state->mp_state = KVM_MP_STATE_HALTED; + else + mp_state->mp_state = KVM_MP_STATE_RUNNABLE; + + return 0; } int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu, struct kvm_mp_state *mp_state) { - return -EINVAL; + switch (mp_state->mp_state) { + case KVM_MP_STATE_RUNNABLE: + vcpu->arch.pause = false; + break; + case KVM_MP_STATE_HALTED: + vcpu->arch.pause = true; + break; + default: + return -EINVAL; + } + + return 0; } /**
To cleanly restore an SMP VM we need to ensure that the current pause state of each vcpu is correctly recorded. Things could get confused if the CPU starts running after migration restore completes when it was paused before it state was captured. We use the existing KVM_GET/SET_MP_STATE ioctl to do this. The arm/arm64 interface is a lot simpler as the only valid states are KVM_MP_STATE_RUNNABLE and KVM_MP_STATE_HALTED. Signed-off-by: Alex Bennée <alex.bennee@linaro.org>