diff mbox

[3/3] arm64: kvm: Fix single step for guest skipped instructions

Message ID 87k20bxm13.fsf@linaro.org (mailing list archive)
State New, archived
Headers show

Commit Message

Alex Bennée Oct. 4, 2017, 10:08 a.m. UTC
[Added Paolo, including QEMU userspace patch]

Julien Thierry <julien.thierry@arm.com> writes:

> On 03/10/17 18:26, Alex Bennée wrote:
>>
>> Julien Thierry <julien.thierry@arm.com> writes:
>>
>>> On 03/10/17 17:30, Alex Bennée wrote:
>>>>
>>>> Julien Thierry <julien.thierry@arm.com> writes:
>>>>
>>>>> On 03/10/17 15:57, Alex Bennée wrote:
>>>>>>
>>>>>> Julien Thierry <julien.thierry@arm.com> writes:
>>>>>>
>>>>>>> On 31/08/17 15:01, Christoffer Dall wrote:
>>>> <snip>
>>>>>>>>>>>
>>>>>>>>>>> Does that sound like what you had in mind? Or does it seem better than
>>>>>>>>>>> the
>>>>>>>>>>> current patch?
>>>>>>>>>>>
>>>>>>>>>> I was thinking to change the skip_instruction function to return an
>>>>>>>>>> int, and then call kvm_handle_debug_ss() from skip_instruction, which
>>>>>>>>>> would update the kvm_run structure and exit here and then.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Setting up the debug exception from within kvm_skip_instruction seem to
>>>>>>>>> change a bit too much its semantic from arm to arm64. I would find this
>>>>>>>>> easily confusing.
>>>>>>>>>
>>>>>>>>>> However, I'm now thinking that this doesn't really work either,
>>>>>>>>>> because we could have to emulate a trapped MMIO instruction in user
>>>>>>>>>> space, and then it's not clear how to exit with a debug exception at
>>>>>>>>>> the same time.
>>>>>>
>>>>>> A debug exception at guest exit point is (IIRC) just having the
>>>>>> appropriate status in the run->exit_reason (KVM_EXIT_DEBUG). If you need
>>>>>> to exit for MMIO emulation (i.e. the instruction has not run yet) you
>>>>>> shouldn't do that. Exit, emulate and return. We could handle the ioctl
>>>>>> to clear SS in userspace but I guess that gets just as messy.
>>>>>>
>>>>>>>>>>
>>>>>>>>>> So perhaps we should stick with your original approach.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I had not realized that was possible. This makes things more complicated for
>>>>>>>>> avoiding a back and forth with the guest for trapped exceptions. Out of
>>>>>>>>> luck, having the debug flag does look like single stepping would work as
>>>>>>>>> expected for userland MMIOs.
>>>>
>>>> <snip>
>>>>
>>>> This is my currently untested but otherwise simpler solution:
>>>>
>>>>   From 46ea80d7dc9b98661fcd51c41090f8ad74a6690f Mon Sep 17 00:00:00 2001
>>>> From: =?UTF-8?q?Alex=20Benn=C3=A9e?= <alex.bennee@linaro.org>
>>>> Date: Tue, 3 Oct 2017 17:17:15 +0100
>>>> Subject: [PATCH] KVM: arm64: handle single-stepping trapped instructions
>>>> MIME-Version: 1.0
>>>> Content-Type: text/plain; charset=UTF-8
>>>> Content-Transfer-Encoding: 8bit
>>>>
>>>> If we are using guest debug to single-step the guest we need to ensure
>>>> we exit after emulating the instruction. This only affects
>>>> instructions emulated by the kernel. If we exit to userspace anyway we
>>>> leave it to userspace to work out what to do.
>>>>
>>>> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
>>>> ---
>>>>    arch/arm64/kvm/handle_exit.c | 51 ++++++++++++++++++++++++++++++++------------
>>>>    1 file changed, 37 insertions(+), 14 deletions(-)
>>>>
>>>> diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
>>>> index 7debb74843a0..b197ffb10e96 100644
>>>> --- a/arch/arm64/kvm/handle_exit.c
>>>> +++ b/arch/arm64/kvm/handle_exit.c
>>>> @@ -178,6 +178,42 @@ static exit_handle_fn kvm_get_exit_handler(struct kvm_vcpu *vcpu)
>>>>    	return arm_exit_handlers[hsr_ec];
>>>>    }
>>>>
>>>> +/*
>>>> + * When handling traps we need to ensure exit the guest if we
>>>> + * successfully emulated the instruction while single-stepping. If we
>>>> + * have to exit anyway for userspace emulation then it's up to
>>>> + * userspace to handle the "while SSing case".
>>>> + */
>>>> +
>>>
>>> I have not tested the code but if it work we also need to do something
>>> similar for MMIOs that are handled by the kernel (without returning to
>>> userland). But it should be pretty similar.
>> <snip>
>>
>> Which path do they take to the mmio emulation?
>>
>
> Nevermind, I was mistaken, mmio code will get called by exit_handler
> and "handled" will be true if the mmio was handled by KVM. So your
> patch already handles this.
>
> There is also the case of GIC CPU inteface accesses being trapped
> (which shouldn't be the default behaviour). If the vgic access fails,
> we will skip the instruction (in __kvm_vcpu_run in
> arch/arm64/kvm/hyp/switch.c) and if we were single stepping we will
> single step 2 instructions. But this seems to be a corner case of a
> corner case (GIC CPUIF trapped + access failing + single stepping), so
> I don't know if we want to take that into account right now?

Yeah looking at the hyp code I did wonder if it warranted the complexity
of adding handling there.

> I'm still a bit concerned about the change of semantics for
> KVM_EXIT_MMIO with regards to userland (cf. my previous mail). But if
> this is deemed to be a reasonable change, the patch seems fine to me.

Have we changed the semantics? A normal un-handled by the kernel IO/MMIO
exit needs to be processed before the single step takes effect. I can't
speak for other userspace but I think for QEMU it is as simple as the
patch bellow. That said it wasn't quite clear where the PC gets updated
in this path - I think the kernel updates the PC before the
KVM_EXIT_MMIO in the same path as the internal handling.

I'd like to test these patches on some real life examples. I tried
tracing over the pl011_write code in a kernel boot but I ran into the
problem of el1_irq's occurring as I tried to step through the guest
kernel. Is this something you've come across? What MMIO accesses have
you been using in your testing?

QEMU Patch bellow:

From 2e8fcea695a9eca67fbeb331d3104d1d9e7e337a Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Alex=20Benn=C3=A9e?= <alex.bennee@linaro.org>
Date: Wed, 4 Oct 2017 09:49:41 +0000
Subject: [PATCH] kvm: exit run loop after emulating IO when single stepping
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

If single-stepping is enabled we should exit the run-loop after
emulating the access. Otherwise single-stepping across emulated IO
accesses may skip an instruction.

This only addresses user-space emulation. Stuff done in kernel-mode
should be handled there.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 accel/kvm/kvm-all.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--
2.14.1




>
> Thanks,


--
Alex Bennée

Comments

Paolo Bonzini Oct. 4, 2017, 10:28 a.m. UTC | #1
On 04/10/2017 12:08, Alex Bennée wrote:
> 
> From 2e8fcea695a9eca67fbeb331d3104d1d9e7e337a Mon Sep 17 00:00:00 2001
> From: =?UTF-8?q?Alex=20Benn=C3=A9e?= <alex.bennee@linaro.org>
> Date: Wed, 4 Oct 2017 09:49:41 +0000
> Subject: [PATCH] kvm: exit run loop after emulating IO when single stepping
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
> 
> If single-stepping is enabled we should exit the run-loop after
> emulating the access. Otherwise single-stepping across emulated IO
> accesses may skip an instruction.
> 
> This only addresses user-space emulation. Stuff done in kernel-mode
> should be handled there.
> 
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> ---
>  accel/kvm/kvm-all.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> index 90c88b517d..85bcb2b0d4 100644
> --- a/accel/kvm/kvm-all.c
> +++ b/accel/kvm/kvm-all.c
> @@ -1940,7 +1940,7 @@ int kvm_cpu_exec(CPUState *cpu)
>                            run->io.direction,
>                            run->io.size,
>                            run->io.count);
> -            ret = 0;
> +            ret = cpu->singlestep_enabled ? EXCP_DEBUG : 0;
>              break;
>          case KVM_EXIT_MMIO:
>              DPRINTF("handle_mmio\n");
> @@ -1950,7 +1950,7 @@ int kvm_cpu_exec(CPUState *cpu)
>                               run->mmio.data,
>                               run->mmio.len,
>                               run->mmio.is_write);
> -            ret = 0;
> +            ret = cpu->singlestep_enabled ? EXCP_DEBUG : 0;
>              break;
>          case KVM_EXIT_IRQ_WINDOW_OPEN:
>              DPRINTF("irq_window_open\n");

Singlestep mode doesn't make much sense for KVM.  For TCG the purpose is
to build one-instruction translation blocks, but what would it mean for KVM?

Paolo
Julien Thierry Oct. 4, 2017, 10:42 a.m. UTC | #2
On 04/10/17 11:08, Alex Bennée wrote:
> 
> [Added Paolo, including QEMU userspace patch]
> 
> Julien Thierry <julien.thierry@arm.com> writes:
> 
>> On 03/10/17 18:26, Alex Bennée wrote:
>>>
>>> Julien Thierry <julien.thierry@arm.com> writes:
>>>
>>>> On 03/10/17 17:30, Alex Bennée wrote:
>>>>>
>>>>> Julien Thierry <julien.thierry@arm.com> writes:
>>>>>
>>>>>> On 03/10/17 15:57, Alex Bennée wrote:
>>>>>>>
>>>>>>> Julien Thierry <julien.thierry@arm.com> writes:
>>>>>>>
>>>>>>>> On 31/08/17 15:01, Christoffer Dall wrote:
>>>>> <snip>
>>>>>>>>>>>>
>>>>>>>>>>>> Does that sound like what you had in mind? Or does it seem better than
>>>>>>>>>>>> the
>>>>>>>>>>>> current patch?
>>>>>>>>>>>>
>>>>>>>>>>> I was thinking to change the skip_instruction function to return an
>>>>>>>>>>> int, and then call kvm_handle_debug_ss() from skip_instruction, which
>>>>>>>>>>> would update the kvm_run structure and exit here and then.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Setting up the debug exception from within kvm_skip_instruction seem to
>>>>>>>>>> change a bit too much its semantic from arm to arm64. I would find this
>>>>>>>>>> easily confusing.
>>>>>>>>>>
>>>>>>>>>>> However, I'm now thinking that this doesn't really work either,
>>>>>>>>>>> because we could have to emulate a trapped MMIO instruction in user
>>>>>>>>>>> space, and then it's not clear how to exit with a debug exception at
>>>>>>>>>>> the same time.
>>>>>>>
>>>>>>> A debug exception at guest exit point is (IIRC) just having the
>>>>>>> appropriate status in the run->exit_reason (KVM_EXIT_DEBUG). If you need
>>>>>>> to exit for MMIO emulation (i.e. the instruction has not run yet) you
>>>>>>> shouldn't do that. Exit, emulate and return. We could handle the ioctl
>>>>>>> to clear SS in userspace but I guess that gets just as messy.
>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> So perhaps we should stick with your original approach.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I had not realized that was possible. This makes things more complicated for
>>>>>>>>>> avoiding a back and forth with the guest for trapped exceptions. Out of
>>>>>>>>>> luck, having the debug flag does look like single stepping would work as
>>>>>>>>>> expected for userland MMIOs.
>>>>>
>>>>> <snip>
>>
>> There is also the case of GIC CPU inteface accesses being trapped
>> (which shouldn't be the default behaviour). If the vgic access fails,
>> we will skip the instruction (in __kvm_vcpu_run in
>> arch/arm64/kvm/hyp/switch.c) and if we were single stepping we will
>> single step 2 instructions. But this seems to be a corner case of a
>> corner case (GIC CPUIF trapped + access failing + single stepping), so
>> I don't know if we want to take that into account right now?
> 
> Yeah looking at the hyp code I did wonder if it warranted the complexity
> of adding handling there.
> 
>> I'm still a bit concerned about the change of semantics for
>> KVM_EXIT_MMIO with regards to userland (cf. my previous mail). But if
>> this is deemed to be a reasonable change, the patch seems fine to me.
> 
> Have we changed the semantics? A normal un-handled by the kernel IO/MMIO
> exit needs to be processed before the single step takes effect. I can't
> speak for other userspace but I think for QEMU it is as simple as the
> patch bellow. That said it wasn't quite clear where the PC gets updated
> in this path - I think the kernel updates the PC before the
> KVM_EXIT_MMIO in the same path as the internal handling.
> 

Well I'm not sure. The part I am concerned about is:
> NOTE: For KVM_EXIT_IO, KVM_EXIT_MMIO, KVM_EXIT_OSI, KVM_EXIT_PAPR and
>       KVM_EXIT_EPR the corresponding
> operations are complete (and guest state is consistent) only after userspace
> has re-entered the kernel with KVM_RUN.  The kernel side will first finish
> incomplete operations and then check for pending signals.  Userspace
> can re-enter the guest with an unmasked signal pending to complete
> pending operations.

 From Documentation/virtual/kvm/api.txt.

The way I interpret this is that userland should not consider the MMIO 
complete before running the vcpu again. If that the case it shouldn't 
trigger the single step since the instruction is not completely finished.

Maybe I don't interpret this correctly or it is not relevant here. 
Although I'd like to understand why.

> I'd like to test these patches on some real life examples. I tried
> tracing over the pl011_write code in a kernel boot but I ran into the
> problem of el1_irq's occurring as I tried to step through the guest
> kernel. Is this something you've come across? What MMIO accesses have
> you been using in your testing?
> 

I didn't know which MMIOs were handled by userland so I have only tested 
traps and MMIOs handled by the kernel.

This sounds like an issue when you are debugging an interruptible 
context, an issue Pratyush has been looking at [1]. Are you taking a 
guest interrupt when you try to execute the instruction to be stepped? I 
don't know what's the status of this patch series. Can you test the 
userland MMIO in a non-interruptible context?

[1] 
http://lists.infradead.org/pipermail/linux-arm-kernel/2017-July/517234.html

Thanks,

> QEMU Patch bellow:
> 
>  From 2e8fcea695a9eca67fbeb331d3104d1d9e7e337a Mon Sep 17 00:00:00 2001
> From: =?UTF-8?q?Alex=20Benn=C3=A9e?= <alex.bennee@linaro.org>
> Date: Wed, 4 Oct 2017 09:49:41 +0000
> Subject: [PATCH] kvm: exit run loop after emulating IO when single stepping
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
> 
> If single-stepping is enabled we should exit the run-loop after
> emulating the access. Otherwise single-stepping across emulated IO
> accesses may skip an instruction.
> 
> This only addresses user-space emulation. Stuff done in kernel-mode
> should be handled there.
> 
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> ---
>   accel/kvm/kvm-all.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> index 90c88b517d..85bcb2b0d4 100644
> --- a/accel/kvm/kvm-all.c
> +++ b/accel/kvm/kvm-all.c
> @@ -1940,7 +1940,7 @@ int kvm_cpu_exec(CPUState *cpu)
>                             run->io.direction,
>                             run->io.size,
>                             run->io.count);
> -            ret = 0;
> +            ret = cpu->singlestep_enabled ? EXCP_DEBUG : 0;
>               break;
>           case KVM_EXIT_MMIO:
>               DPRINTF("handle_mmio\n");
> @@ -1950,7 +1950,7 @@ int kvm_cpu_exec(CPUState *cpu)
>                                run->mmio.data,
>                                run->mmio.len,
>                                run->mmio.is_write);
> -            ret = 0;
> +            ret = cpu->singlestep_enabled ? EXCP_DEBUG : 0;
>               break;
>           case KVM_EXIT_IRQ_WINDOW_OPEN:
>               DPRINTF("irq_window_open\n");
> --
> 2.14.1
>
Alex Bennée Oct. 4, 2017, 10:50 a.m. UTC | #3
Paolo Bonzini <pbonzini@redhat.com> writes:

> On 04/10/2017 12:08, Alex Bennée wrote:
>>
>> From 2e8fcea695a9eca67fbeb331d3104d1d9e7e337a Mon Sep 17 00:00:00 2001
>> From: =?UTF-8?q?Alex=20Benn=C3=A9e?= <alex.bennee@linaro.org>
>> Date: Wed, 4 Oct 2017 09:49:41 +0000
>> Subject: [PATCH] kvm: exit run loop after emulating IO when single stepping
>> MIME-Version: 1.0
>> Content-Type: text/plain; charset=UTF-8
>> Content-Transfer-Encoding: 8bit
>>
>> If single-stepping is enabled we should exit the run-loop after
>> emulating the access. Otherwise single-stepping across emulated IO
>> accesses may skip an instruction.
>>
>> This only addresses user-space emulation. Stuff done in kernel-mode
>> should be handled there.
>>
>> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
>> ---
>>  accel/kvm/kvm-all.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
>> index 90c88b517d..85bcb2b0d4 100644
>> --- a/accel/kvm/kvm-all.c
>> +++ b/accel/kvm/kvm-all.c
>> @@ -1940,7 +1940,7 @@ int kvm_cpu_exec(CPUState *cpu)
>>                            run->io.direction,
>>                            run->io.size,
>>                            run->io.count);
>> -            ret = 0;
>> +            ret = cpu->singlestep_enabled ? EXCP_DEBUG : 0;
>>              break;
>>          case KVM_EXIT_MMIO:
>>              DPRINTF("handle_mmio\n");
>> @@ -1950,7 +1950,7 @@ int kvm_cpu_exec(CPUState *cpu)
>>                               run->mmio.data,
>>                               run->mmio.len,
>>                               run->mmio.is_write);
>> -            ret = 0;
>> +            ret = cpu->singlestep_enabled ? EXCP_DEBUG : 0;
>>              break;
>>          case KVM_EXIT_IRQ_WINDOW_OPEN:
>>              DPRINTF("irq_window_open\n");
>
> Singlestep mode doesn't make much sense for KVM.  For TCG the purpose is
> to build one-instruction translation blocks, but what would it mean for KVM?

It's used by the kvm_arch_handle_debug() code to verify single-stepping
is enabled when processing debug exceptions. And also kvm_update_debug:

    if (cpu->singlestep_enabled) {
        data.dbg.control |= KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_SINGLESTEP;
    }

We also have an aliased singlestep_enabled in our disas_context for the
translator.

--
Alex Bennée
Paolo Bonzini Oct. 4, 2017, 2:19 p.m. UTC | #4
On 04/10/2017 12:50, Alex Bennée wrote:
> 
> Paolo Bonzini <pbonzini@redhat.com> writes:
> 
>> On 04/10/2017 12:08, Alex Bennée wrote:
>>>
>>> From 2e8fcea695a9eca67fbeb331d3104d1d9e7e337a Mon Sep 17 00:00:00 2001
>>> From: =?UTF-8?q?Alex=20Benn=C3=A9e?= <alex.bennee@linaro.org>
>>> Date: Wed, 4 Oct 2017 09:49:41 +0000
>>> Subject: [PATCH] kvm: exit run loop after emulating IO when single stepping
>>> MIME-Version: 1.0
>>> Content-Type: text/plain; charset=UTF-8
>>> Content-Transfer-Encoding: 8bit
>>>
>>> If single-stepping is enabled we should exit the run-loop after
>>> emulating the access. Otherwise single-stepping across emulated IO
>>> accesses may skip an instruction.
>>>
>>> This only addresses user-space emulation. Stuff done in kernel-mode
>>> should be handled there.
>>>
>>> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
>>> ---
>>>  accel/kvm/kvm-all.c | 4 ++--
>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
>>> index 90c88b517d..85bcb2b0d4 100644
>>> --- a/accel/kvm/kvm-all.c
>>> +++ b/accel/kvm/kvm-all.c
>>> @@ -1940,7 +1940,7 @@ int kvm_cpu_exec(CPUState *cpu)
>>>                            run->io.direction,
>>>                            run->io.size,
>>>                            run->io.count);
>>> -            ret = 0;
>>> +            ret = cpu->singlestep_enabled ? EXCP_DEBUG : 0;
>>>              break;
>>>          case KVM_EXIT_MMIO:
>>>              DPRINTF("handle_mmio\n");
>>> @@ -1950,7 +1950,7 @@ int kvm_cpu_exec(CPUState *cpu)
>>>                               run->mmio.data,
>>>                               run->mmio.len,
>>>                               run->mmio.is_write);
>>> -            ret = 0;
>>> +            ret = cpu->singlestep_enabled ? EXCP_DEBUG : 0;
>>>              break;
>>>          case KVM_EXIT_IRQ_WINDOW_OPEN:
>>>              DPRINTF("irq_window_open\n");
>>
>> Singlestep mode doesn't make much sense for KVM.  For TCG the purpose is
>> to build one-instruction translation blocks, but what would it mean for KVM?
> 
> It's used by the kvm_arch_handle_debug() code to verify single-stepping
> is enabled when processing debug exceptions. And also kvm_update_debug:
> 
>     if (cpu->singlestep_enabled) {
>         data.dbg.control |= KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_SINGLESTEP;
>     }
> 
> We also have an aliased singlestep_enabled in our disas_context for the
> translator.

Nevermind, I was confusing cpu->singlestep_enabled with the "singlestep"
global.

Paolo
Alex Bennée Oct. 4, 2017, 3:42 p.m. UTC | #5
Julien Thierry <julien.thierry@arm.com> writes:

> On 04/10/17 11:08, Alex Bennée wrote:
>>
>> [Added Paolo, including QEMU userspace patch]
>>
>> Julien Thierry <julien.thierry@arm.com> writes:
>>
>>> On 03/10/17 18:26, Alex Bennée wrote:
>>>>
>>>> Julien Thierry <julien.thierry@arm.com> writes:
>>>>
>>>>> On 03/10/17 17:30, Alex Bennée wrote:
>>>>>>
>>>>>> Julien Thierry <julien.thierry@arm.com> writes:
>>>>>>
>>>>>>> On 03/10/17 15:57, Alex Bennée wrote:
>>>>>>>>
>>>>>>>> Julien Thierry <julien.thierry@arm.com> writes:
>>>>>>>>
>>>>>>>>> On 31/08/17 15:01, Christoffer Dall wrote:
>>>>>> <snip>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Does that sound like what you had in mind? Or does it seem better than
>>>>>>>>>>>>> the
>>>>>>>>>>>>> current patch?
>>>>>>>>>>>>>
>>>>>>>>>>>> I was thinking to change the skip_instruction function to return an
>>>>>>>>>>>> int, and then call kvm_handle_debug_ss() from skip_instruction, which
>>>>>>>>>>>> would update the kvm_run structure and exit here and then.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Setting up the debug exception from within kvm_skip_instruction seem to
>>>>>>>>>>> change a bit too much its semantic from arm to arm64. I would find this
>>>>>>>>>>> easily confusing.
>>>>>>>>>>>
>>>>>>>>>>>> However, I'm now thinking that this doesn't really work either,
>>>>>>>>>>>> because we could have to emulate a trapped MMIO instruction in user
>>>>>>>>>>>> space, and then it's not clear how to exit with a debug exception at
>>>>>>>>>>>> the same time.
>>>>>>>>
>>>>>>>> A debug exception at guest exit point is (IIRC) just having the
>>>>>>>> appropriate status in the run->exit_reason (KVM_EXIT_DEBUG). If you need
>>>>>>>> to exit for MMIO emulation (i.e. the instruction has not run yet) you
>>>>>>>> shouldn't do that. Exit, emulate and return. We could handle the ioctl
>>>>>>>> to clear SS in userspace but I guess that gets just as messy.
>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> So perhaps we should stick with your original approach.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I had not realized that was possible. This makes things more complicated for
>>>>>>>>>>> avoiding a back and forth with the guest for trapped exceptions. Out of
>>>>>>>>>>> luck, having the debug flag does look like single stepping would work as
>>>>>>>>>>> expected for userland MMIOs.
>>>>>>
>>>>>> <snip>
>>>
>>> There is also the case of GIC CPU inteface accesses being trapped
>>> (which shouldn't be the default behaviour). If the vgic access fails,
>>> we will skip the instruction (in __kvm_vcpu_run in
>>> arch/arm64/kvm/hyp/switch.c) and if we were single stepping we will
>>> single step 2 instructions. But this seems to be a corner case of a
>>> corner case (GIC CPUIF trapped + access failing + single stepping), so
>>> I don't know if we want to take that into account right now?
>>
>> Yeah looking at the hyp code I did wonder if it warranted the complexity
>> of adding handling there.
>>
>>> I'm still a bit concerned about the change of semantics for
>>> KVM_EXIT_MMIO with regards to userland (cf. my previous mail). But if
>>> this is deemed to be a reasonable change, the patch seems fine to me.
>>
>> Have we changed the semantics? A normal un-handled by the kernel IO/MMIO
>> exit needs to be processed before the single step takes effect. I can't
>> speak for other userspace but I think for QEMU it is as simple as the
>> patch bellow. That said it wasn't quite clear where the PC gets updated
>> in this path - I think the kernel updates the PC before the
>> KVM_EXIT_MMIO in the same path as the internal handling.
>>
>
> Well I'm not sure. The part I am concerned about is:
>> NOTE: For KVM_EXIT_IO, KVM_EXIT_MMIO, KVM_EXIT_OSI, KVM_EXIT_PAPR and
>>       KVM_EXIT_EPR the corresponding
>> operations are complete (and guest state is consistent) only after userspace
>> has re-entered the kernel with KVM_RUN.  The kernel side will first finish
>> incomplete operations and then check for pending signals.  Userspace
>> can re-enter the guest with an unmasked signal pending to complete
>> pending operations.
>
> From Documentation/virtual/kvm/api.txt.
>
> The way I interpret this is that userland should not consider the MMIO
> complete before running the vcpu again. If that the case it shouldn't
> trigger the single step since the instruction is not completely
> finished.
>
> Maybe I don't interpret this correctly or it is not relevant here.
> Although I'd like to understand why.

No it certainly needs clarifying. The comment was originally added in:

  679613442f84702c06a80f2320cb8a50089200bc

Looking more closely though it has a point. The IO/MMIO exits work
purely from the address and data entries in the run structure. When we
return to KVM_RUN we do:

	if (run->exit_reason == KVM_EXIT_MMIO) {
		ret = kvm_handle_mmio_return(vcpu, vcpu->run);
		if (ret)
			return ret;
	}

So you are correct the instruction emulation is not complete. Once that
fixup is done however I think we are good to return. So perhaps we can
avoid involving QEMU entirely in this by generating a debug exit
here.

QEMU ->
        kvm_run ->
                    switch ->
                              guest
                           <-
                    trap
                 <-
        exit mmio
     <-
QEMU
     -> kvm_run
        handle_mmio_return
        exit debug
     <-
QEMU

I don't think this affects the handle_exit() case as we only force the
exit for successfully emulated instructions inside kvm.

Looking at x86 for reference it does seem happy with triggering exits on
single stepped emulation, see kvm_vcpu_do_singlestep(). However it also
has a number of comments on IO emulation:

  /* FIXME: return into emulator if single-stepping.  */

So ARM is at least not behind the curve on this support ;-)

>> I'd like to test these patches on some real life examples. I tried
>> tracing over the pl011_write code in a kernel boot but I ran into the
>> problem of el1_irq's occurring as I tried to step through the guest
>> kernel. Is this something you've come across? What MMIO accesses have
>> you been using in your testing?
>>
>
> I didn't know which MMIOs were handled by userland so I have only
> tested traps and MMIOs handled by the kernel.

Any particular MMIOs I could also use to replicate in my tests?

> This sounds like an issue when you are debugging an interruptible
> context, an issue Pratyush has been looking at [1]. Are you taking a
> guest interrupt when you try to execute the instruction to be stepped?
> I don't know what's the status of this patch series. Can you test the
> userland MMIO in a non-interruptible context?
>
> [1]
> http://lists.infradead.org/pipermail/linux-arm-kernel/2017-July/517234.html

Again looking at x86 it looks like the approach is to suspend IRQs if
you are single-stepping. I'll have a look at Pratyush's patches.

>
> Thanks,
>
>> QEMU Patch bellow:
>>
>>  From 2e8fcea695a9eca67fbeb331d3104d1d9e7e337a Mon Sep 17 00:00:00 2001
>> From: =?UTF-8?q?Alex=20Benn=C3=A9e?= <alex.bennee@linaro.org>
>> Date: Wed, 4 Oct 2017 09:49:41 +0000
>> Subject: [PATCH] kvm: exit run loop after emulating IO when single stepping
>> MIME-Version: 1.0
>> Content-Type: text/plain; charset=UTF-8
>> Content-Transfer-Encoding: 8bit
>>
>> If single-stepping is enabled we should exit the run-loop after
>> emulating the access. Otherwise single-stepping across emulated IO
>> accesses may skip an instruction.
>>
>> This only addresses user-space emulation. Stuff done in kernel-mode
>> should be handled there.
>>
>> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
>> ---
>>   accel/kvm/kvm-all.c | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
>> index 90c88b517d..85bcb2b0d4 100644
>> --- a/accel/kvm/kvm-all.c
>> +++ b/accel/kvm/kvm-all.c
>> @@ -1940,7 +1940,7 @@ int kvm_cpu_exec(CPUState *cpu)
>>                             run->io.direction,
>>                             run->io.size,
>>                             run->io.count);
>> -            ret = 0;
>> +            ret = cpu->singlestep_enabled ? EXCP_DEBUG : 0;
>>               break;
>>           case KVM_EXIT_MMIO:
>>               DPRINTF("handle_mmio\n");
>> @@ -1950,7 +1950,7 @@ int kvm_cpu_exec(CPUState *cpu)
>>                                run->mmio.data,
>>                                run->mmio.len,
>>                                run->mmio.is_write);
>> -            ret = 0;
>> +            ret = cpu->singlestep_enabled ? EXCP_DEBUG : 0;
>>               break;
>>           case KVM_EXIT_IRQ_WINDOW_OPEN:
>>               DPRINTF("irq_window_open\n");
>> --
>> 2.14.1
>>


--
Alex Bennée
Julien Thierry Oct. 4, 2017, 4:10 p.m. UTC | #6
On 04/10/17 16:42, Alex Bennée wrote:
> 
> Julien Thierry <julien.thierry@arm.com> writes:
> 
>> On 04/10/17 11:08, Alex Bennée wrote:
>>>
>>> [Added Paolo, including QEMU userspace patch]
>>>
>>> Julien Thierry <julien.thierry@arm.com> writes:
>>>
>>>> On 03/10/17 18:26, Alex Bennée wrote:
>>>>>
>>>>> Julien Thierry <julien.thierry@arm.com> writes:
>>>>>
>>>>>> On 03/10/17 17:30, Alex Bennée wrote:
>>>>>>>
>>>>>>> Julien Thierry <julien.thierry@arm.com> writes:
>>>>>>>
>>>>>>>> On 03/10/17 15:57, Alex Bennée wrote:
>>>>>>>>>
>>>>>>>>> Julien Thierry <julien.thierry@arm.com> writes:
>>>>>>>>>
>>>>>>>>>> On 31/08/17 15:01, Christoffer Dall wrote:
>>>>>>> <snip>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Does that sound like what you had in mind? Or does it seem better than
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>> current patch?
>>>>>>>>>>>>>>
>>>>>>>>>>>>> I was thinking to change the skip_instruction function to return an
>>>>>>>>>>>>> int, and then call kvm_handle_debug_ss() from skip_instruction, which
>>>>>>>>>>>>> would update the kvm_run structure and exit here and then.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Setting up the debug exception from within kvm_skip_instruction seem to
>>>>>>>>>>>> change a bit too much its semantic from arm to arm64. I would find this
>>>>>>>>>>>> easily confusing.
>>>>>>>>>>>>
>>>>>>>>>>>>> However, I'm now thinking that this doesn't really work either,
>>>>>>>>>>>>> because we could have to emulate a trapped MMIO instruction in user
>>>>>>>>>>>>> space, and then it's not clear how to exit with a debug exception at
>>>>>>>>>>>>> the same time.
>>>>>>>>>
>>>>>>>>> A debug exception at guest exit point is (IIRC) just having the
>>>>>>>>> appropriate status in the run->exit_reason (KVM_EXIT_DEBUG). If you need
>>>>>>>>> to exit for MMIO emulation (i.e. the instruction has not run yet) you
>>>>>>>>> shouldn't do that. Exit, emulate and return. We could handle the ioctl
>>>>>>>>> to clear SS in userspace but I guess that gets just as messy.
>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> So perhaps we should stick with your original approach.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> I had not realized that was possible. This makes things more complicated for
>>>>>>>>>>>> avoiding a back and forth with the guest for trapped exceptions. Out of
>>>>>>>>>>>> luck, having the debug flag does look like single stepping would work as
>>>>>>>>>>>> expected for userland MMIOs.
>>>>>>>
>>>>>>> <snip>
>>>>
>>>> There is also the case of GIC CPU inteface accesses being trapped
>>>> (which shouldn't be the default behaviour). If the vgic access fails,
>>>> we will skip the instruction (in __kvm_vcpu_run in
>>>> arch/arm64/kvm/hyp/switch.c) and if we were single stepping we will
>>>> single step 2 instructions. But this seems to be a corner case of a
>>>> corner case (GIC CPUIF trapped + access failing + single stepping), so
>>>> I don't know if we want to take that into account right now?
>>>
>>> Yeah looking at the hyp code I did wonder if it warranted the complexity
>>> of adding handling there.
>>>
>>>> I'm still a bit concerned about the change of semantics for
>>>> KVM_EXIT_MMIO with regards to userland (cf. my previous mail). But if
>>>> this is deemed to be a reasonable change, the patch seems fine to me.
>>>
>>> Have we changed the semantics? A normal un-handled by the kernel IO/MMIO
>>> exit needs to be processed before the single step takes effect. I can't
>>> speak for other userspace but I think for QEMU it is as simple as the
>>> patch bellow. That said it wasn't quite clear where the PC gets updated
>>> in this path - I think the kernel updates the PC before the
>>> KVM_EXIT_MMIO in the same path as the internal handling.
>>>
>>
>> Well I'm not sure. The part I am concerned about is:
>>> NOTE: For KVM_EXIT_IO, KVM_EXIT_MMIO, KVM_EXIT_OSI, KVM_EXIT_PAPR and
>>>        KVM_EXIT_EPR the corresponding
>>> operations are complete (and guest state is consistent) only after userspace
>>> has re-entered the kernel with KVM_RUN.  The kernel side will first finish
>>> incomplete operations and then check for pending signals.  Userspace
>>> can re-enter the guest with an unmasked signal pending to complete
>>> pending operations.
>>
>>  From Documentation/virtual/kvm/api.txt.
>>
>> The way I interpret this is that userland should not consider the MMIO
>> complete before running the vcpu again. If that the case it shouldn't
>> trigger the single step since the instruction is not completely
>> finished.
>>
>> Maybe I don't interpret this correctly or it is not relevant here.
>> Although I'd like to understand why.
> 
> No it certainly needs clarifying. The comment was originally added in:
> 
>    679613442f84702c06a80f2320cb8a50089200bc
> 
> Looking more closely though it has a point. The IO/MMIO exits work
> purely from the address and data entries in the run structure. When we
> return to KVM_RUN we do:
> 
> 	if (run->exit_reason == KVM_EXIT_MMIO) {
> 		ret = kvm_handle_mmio_return(vcpu, vcpu->run);
> 		if (ret)
> 			return ret;
> 	}
> 
> So you are correct the instruction emulation is not complete. Once that
> fixup is done however I think we are good to return. So perhaps we can
> avoid involving QEMU entirely in this by generating a debug exit
> here.
> 
> QEMU ->
>          kvm_run ->
>                      switch ->
>                                guest
>                             <-
>                      trap
>                   <-
>          exit mmio
>       <-
> QEMU
>       -> kvm_run
>          handle_mmio_return
>          exit debug
>       <-
> QEMU
> 

Thanks for the explanation. This approach sounds good to me. Also, it 
means QEMU doesn't need to get patched, is that correct?

> I don't think this affects the handle_exit() case as we only force the
> exit for successfully emulated instructions inside kvm.
> 

Yes, in handle_exit MMIO handled by the kernel will exit with 
KVM_EXIT_DEBUG and MMIO handled by the userland will exit with 
KVM_EXIT_MMIO. So from your patch we just need to add the exit debug 
after handle_mmio_return.

> Looking at x86 for reference it does seem happy with triggering exits on
> single stepped emulation, see kvm_vcpu_do_singlestep(). However it also
> has a number of comments on IO emulation:
> 
>    /* FIXME: return into emulator if single-stepping.  */
> 
> So ARM is at least not behind the curve on this support ;-)
> 
>>> I'd like to test these patches on some real life examples. I tried
>>> tracing over the pl011_write code in a kernel boot but I ran into the
>>> problem of el1_irq's occurring as I tried to step through the guest
>>> kernel. Is this something you've come across? What MMIO accesses have
>>> you been using in your testing?
>>>
>>
>> I didn't know which MMIOs were handled by userland so I have only
>> tested traps and MMIOs handled by the kernel.
> 
> Any particular MMIOs I could also use to replicate in my tests?
> 

For MMIOs handled by the kernel, I was testing with the GIC. You could 
try to break on gic_cpu_config in the guest, where it will write to 
distributor registers. The function should be called during 
initialization, before IRQs are enabled, so you shouldn't be bothered by 
the earlier trouble.

>> This sounds like an issue when you are debugging an interruptible
>> context, an issue Pratyush has been looking at [1]. Are you taking a
>> guest interrupt when you try to execute the instruction to be stepped?
>> I don't know what's the status of this patch series. Can you test the
>> userland MMIO in a non-interruptible context?
>>
>> [1]
>> http://lists.infradead.org/pipermail/linux-arm-kernel/2017-July/517234.html
> 
> Again looking at x86 it looks like the approach is to suspend IRQs if
> you are single-stepping. I'll have a look at Pratyush's patches.
> 

I think that was the idea with Pratyush's patches as well.

So, I'm happy to replace my patch with yours in this series (unless you 
want to post it separated since it doesn't depend on my patches).

Thank you for looking at this and providing your input.

Cheers,

>>
>>> QEMU Patch bellow:
>>>
>>>   From 2e8fcea695a9eca67fbeb331d3104d1d9e7e337a Mon Sep 17 00:00:00 2001
>>> From: =?UTF-8?q?Alex=20Benn=C3=A9e?= <alex.bennee@linaro.org>
>>> Date: Wed, 4 Oct 2017 09:49:41 +0000
>>> Subject: [PATCH] kvm: exit run loop after emulating IO when single stepping
>>> MIME-Version: 1.0
>>> Content-Type: text/plain; charset=UTF-8
>>> Content-Transfer-Encoding: 8bit
>>>
>>> If single-stepping is enabled we should exit the run-loop after
>>> emulating the access. Otherwise single-stepping across emulated IO
>>> accesses may skip an instruction.
>>>
>>> This only addresses user-space emulation. Stuff done in kernel-mode
>>> should be handled there.
>>>
>>> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
>>> ---
>>>    accel/kvm/kvm-all.c | 4 ++--
>>>    1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
>>> index 90c88b517d..85bcb2b0d4 100644
>>> --- a/accel/kvm/kvm-all.c
>>> +++ b/accel/kvm/kvm-all.c
>>> @@ -1940,7 +1940,7 @@ int kvm_cpu_exec(CPUState *cpu)
>>>                              run->io.direction,
>>>                              run->io.size,
>>>                              run->io.count);
>>> -            ret = 0;
>>> +            ret = cpu->singlestep_enabled ? EXCP_DEBUG : 0;
>>>                break;
>>>            case KVM_EXIT_MMIO:
>>>                DPRINTF("handle_mmio\n");
>>> @@ -1950,7 +1950,7 @@ int kvm_cpu_exec(CPUState *cpu)
>>>                                 run->mmio.data,
>>>                                 run->mmio.len,
>>>                                 run->mmio.is_write);
>>> -            ret = 0;
>>> +            ret = cpu->singlestep_enabled ? EXCP_DEBUG : 0;
>>>                break;
>>>            case KVM_EXIT_IRQ_WINDOW_OPEN:
>>>                DPRINTF("irq_window_open\n");
>>> --
>>> 2.14.1
>>>
> 
> 
> --
> Alex Bennée
>
Alex Bennée Oct. 4, 2017, 6:23 p.m. UTC | #7
Julien Thierry <julien.thierry@arm.com> writes:

> On 04/10/17 16:42, Alex Bennée wrote:
>>
<snip>
>>
>> Looking more closely though it has a point. The IO/MMIO exits work
>> purely from the address and data entries in the run structure. When we
>> return to KVM_RUN we do:
>>
>> 	if (run->exit_reason == KVM_EXIT_MMIO) {
>> 		ret = kvm_handle_mmio_return(vcpu, vcpu->run);
>> 		if (ret)
>> 			return ret;
>> 	}
>>
>> So you are correct the instruction emulation is not complete. Once that
>> fixup is done however I think we are good to return. So perhaps we can
>> avoid involving QEMU entirely in this by generating a debug exit
>> here.
>>
>> QEMU ->
>>          kvm_run ->
>>                      switch ->
>>                                guest
>>                             <-
>>                      trap
>>                   <-
>>          exit mmio
>>       <-
>> QEMU
>>       -> kvm_run
>>          handle_mmio_return
>>          exit debug
>>       <-
>> QEMU
>>
>
> Thanks for the explanation. This approach sounds good to me. Also, it
> means QEMU doesn't need to get patched, is that correct?

Correct.

>
>> I don't think this affects the handle_exit() case as we only force the
>> exit for successfully emulated instructions inside kvm.
>>
>
> Yes, in handle_exit MMIO handled by the kernel will exit with
> KVM_EXIT_DEBUG and MMIO handled by the userland will exit with
> KVM_EXIT_MMIO. So from your patch we just need to add the exit debug
> after handle_mmio_return.

One minor wrinkle in kvm_handle_mmio_return() I can't use
vcpu->debug_flags as the v7 code doesn't have it in kvm_vcpu_arch.

>
>> Looking at x86 for reference it does seem happy with triggering exits on
>> single stepped emulation, see kvm_vcpu_do_singlestep(). However it also
>> has a number of comments on IO emulation:
>>
>>    /* FIXME: return into emulator if single-stepping.  */
>>
>> So ARM is at least not behind the curve on this support ;-)
>>
>>>> I'd like to test these patches on some real life examples. I tried
>>>> tracing over the pl011_write code in a kernel boot but I ran into the
>>>> problem of el1_irq's occurring as I tried to step through the guest
>>>> kernel. Is this something you've come across? What MMIO accesses have
>>>> you been using in your testing?
>>>>
>>>
>>> I didn't know which MMIOs were handled by userland so I have only
>>> tested traps and MMIOs handled by the kernel.
>>
>> Any particular MMIOs I could also use to replicate in my tests?
>>
>
> For MMIOs handled by the kernel, I was testing with the GIC. You could
> try to break on gic_cpu_config in the guest, where it will write to
> distributor registers. The function should be called during
> initialization, before IRQs are enabled, so you shouldn't be bothered
> by the earlier trouble.

Thanks - this will be useful.

>>> This sounds like an issue when you are debugging an interruptible
>>> context, an issue Pratyush has been looking at [1]. Are you taking a
>>> guest interrupt when you try to execute the instruction to be stepped?
>>> I don't know what's the status of this patch series. Can you test the
>>> userland MMIO in a non-interruptible context?
>>>
>>> [1]
>>> http://lists.infradead.org/pipermail/linux-arm-kernel/2017-July/517234.html
>>
>> Again looking at x86 it looks like the approach is to suspend IRQs if
>> you are single-stepping. I'll have a look at Pratyush's patches.
>>
>
> I think that was the idea with Pratyush's patches as well.
>
> So, I'm happy to replace my patch with yours in this series (unless
> you want to post it separated since it doesn't depend on my patches).


Nope I'm happy for you to carry it to merge. I'll see if I can send you
a v2 once I've addressed the mmio completion.

>
> Thank you for looking at this and providing your input.

No problem, sorry it took so long.

--
Alex Bennée
diff mbox

Patch

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 90c88b517d..85bcb2b0d4 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -1940,7 +1940,7 @@  int kvm_cpu_exec(CPUState *cpu)
                           run->io.direction,
                           run->io.size,
                           run->io.count);
-            ret = 0;
+            ret = cpu->singlestep_enabled ? EXCP_DEBUG : 0;
             break;
         case KVM_EXIT_MMIO:
             DPRINTF("handle_mmio\n");
@@ -1950,7 +1950,7 @@  int kvm_cpu_exec(CPUState *cpu)
                              run->mmio.data,
                              run->mmio.len,
                              run->mmio.is_write);
-            ret = 0;
+            ret = cpu->singlestep_enabled ? EXCP_DEBUG : 0;
             break;
         case KVM_EXIT_IRQ_WINDOW_OPEN:
             DPRINTF("irq_window_open\n");