diff mbox series

[XEN,v8,2/2] xen/arm64: io: Support instructions (for which ISS is not valid) on emulated MMIO region using MMIO/ioreq handler

Message ID 20220212233433.46018-3-ayankuma@xilinx.com (mailing list archive)
State Superseded
Headers show
Series xen/arm64: io: Decode ldr/str post-indexing instruction | expand

Commit Message

Ayan Kumar Halder Feb. 12, 2022, 11:34 p.m. UTC
When an instruction is trapped in Xen due to translation fault, Xen
checks if the ISS is valid. If not, Xen tries to resolve the translation
fault using p2m page tables. In case if it is a data abort, Xen will
try to map the mmio region to the guest (ie tries to emulate the mmio
region).

If the ISS is not valid and it is a data abort, then Xen tries to
decode the instruction. In case of ioreq, Xen  saves the decoding state,
rn and imm9 to vcpu_io. Whenever the vcpu handles the ioreq successfully,
it will read the decoding state to determine if the instruction decoded
was a ldr/str post indexing (ie INSTR_LDR_STR_POSTINDEXING). If so, it
uses these details to post increment rn.

In case of mmio handler, if the mmio operation was successful, then Xen
retrives the decoding state, rn and imm9. For state ==
INSTR_LDR_STR_POSTINDEXING, Xen will update rn.

If there is an error encountered while decoding the instruction, Xen
will issue an abort to the guest. If the instruction was trapped due
to stage1 page translation table walk, Xen will update the page tables
(under the assumption that they are in non-MMIO region) and will return
to the guest so that it can retry the instruction. To handle all these
different states, we have introduced 'enum instr_decode_state'.

Signed-off-by: Ayan Kumar Halder <ayankuma@xilinx.com>
---

Changelog :-

v2..v5 - Mentioned in the cover letter.

v6 - 1. Mantained the decoding state of the instruction. This is used by the
caller to either abort the guest or retry or ignore or perform read/write on
the mmio region.

2. try_decode() invokes decoding for both aarch64 and thumb state. (Previously
it used to invoke decoding only for aarch64 state). Thus, it handles all the
checking of the registers before invoking any decoding of instruction.
try_decode_instruction_invalid_iss() has thus been removed.

3. Introduced a new field('enum instr_decode_state state') inside
'struct instr_details'. This holds the decoding state of the instruction.
This is later read by the post_increment_register() to determine if rn needs to
be incremented. Also, this is read by the callers of try_decode_instruction()
to determine if the instruction was valid or ignored or to be retried or
error or decoded successfully.

4. Also stored 'instr_details' inside 'struct ioreq'. This enables
arch_ioreq_complete_mmio() to invoke post_increment_register() without decoding
the instruction again.

5. Check hsr.dabt.valid in do_trap_stage2_abort_guest(). If it is not valid,
then decode the instruction. This ensures that try_handle_mmio() is invoked only
when the instruction is either valid or decoded successfully.

6. Inside do_trap_stage2_abort_guest(), if hsr.dabt.valid is not set, then
resolve the translation fault before trying to decode the instruction. If
translation fault is resolved, then return to the guest to execute the instruction
again.


v7 - 1. Moved the decoding instruction details ie instr_details from 'struct ioreq'
to 'struct vcpu_io'.

2. The instruction is decoded only when we get a data abort.

3. Replaced ASSERT_UNREACHABLE() with domain_crash(). The reason being asserts
can be disabled in some builds. In this scenario when the guest's cpsr is in an
erroneous state, Xen should crash the guest.

4. Introduced check_p2m() which invokes p2m_resolve_translation_fault() and
try_map_mmio() to resolve translation fault by configuring the page tables. This
gets invoked first if ISS is invalid and it is an instruction abort. If it is
a data abort and hsr.dabt.s1ptw is set or try_handle_mmio() returns IO_UNHANDLED,
then check_p2m() gets invoked again.


v8 - 1. Removed the handling of data abort when info->dabt.cache is set. This will
be implemented in a subsequent patch. (Not as part of this series)

2. When the data abort is due to access to stage 1 translation tables, Xen will
try to fix the mapping of the page table for the corresponding address. If this
returns an error, Xen will abort the guest. Else, it will ask the guest to retry
the instruction.

3. Changed v->io.info.dabt_instr from pointer to variable. The reason being that
arch_ioreq_complete_mmio() is called from leave_hypervisor_to_guest().
That is after do_trap_stage2_abort_guest()  has been invoked. So the original
variable will be no longer valid.

4. Some other style issues pointed out in v7.


 xen/arch/arm/arm32/traps.c        |  7 +++
 xen/arch/arm/arm64/traps.c        | 47 +++++++++++++++
 xen/arch/arm/decode.c             |  1 +
 xen/arch/arm/include/asm/domain.h |  4 ++
 xen/arch/arm/include/asm/mmio.h   | 15 ++++-
 xen/arch/arm/include/asm/traps.h  |  2 +
 xen/arch/arm/io.c                 | 98 ++++++++++++++++++++-----------
 xen/arch/arm/ioreq.c              |  7 ++-
 xen/arch/arm/traps.c              | 80 +++++++++++++++++++++----
 xen/arch/x86/include/asm/ioreq.h  |  3 +
 xen/include/xen/sched.h           |  2 +
 11 files changed, 217 insertions(+), 49 deletions(-)

Comments

Julien Grall Feb. 13, 2022, 12:19 p.m. UTC | #1
Hi,

On 12/02/2022 23:34, Ayan Kumar Halder wrote:
> 
>   xen/arch/arm/arm32/traps.c        |  7 +++
>   xen/arch/arm/arm64/traps.c        | 47 +++++++++++++++
>   xen/arch/arm/decode.c             |  1 +
>   xen/arch/arm/include/asm/domain.h |  4 ++
>   xen/arch/arm/include/asm/mmio.h   | 15 ++++-
>   xen/arch/arm/include/asm/traps.h  |  2 +
>   xen/arch/arm/io.c                 | 98 ++++++++++++++++++++-----------
>   xen/arch/arm/ioreq.c              |  7 ++-
>   xen/arch/arm/traps.c              | 80 +++++++++++++++++++++----
>   xen/arch/x86/include/asm/ioreq.h  |  3 +

This change technically needs an ack from the x86 maintainers. And...

>   xen/include/xen/sched.h           |  2 +

this one for anyone from THE REST (Stefano and I are part of it). Please 
use scripts/add_maintainers.pl to automatically add the relevant 
maintainers in CC.

>   11 files changed, 217 insertions(+), 49 deletions(-)
> 
> diff --git a/xen/arch/arm/arm32/traps.c b/xen/arch/arm/arm32/traps.c
> index 9c9790a6d1..70c6238196 100644
> --- a/xen/arch/arm/arm32/traps.c
> +++ b/xen/arch/arm/arm32/traps.c
> @@ -18,9 +18,11 @@
>   
>   #include <xen/lib.h>
>   #include <xen/kernel.h>
> +#include <xen/sched.h>
>   
>   #include <public/xen.h>
>   
> +#include <asm/mmio.h>
>   #include <asm/processor.h>
>   #include <asm/traps.h>
>   
> @@ -82,6 +84,11 @@ void do_trap_data_abort(struct cpu_user_regs *regs)
>           do_unexpected_trap("Data Abort", regs);
>   }
>   
> +void post_increment_register(const struct instr_details *instr)
> +{
> +    domain_crash(current->domain);


Please add a comment explaning why this is resulting to a domain crash. 
AFAICT, this is because this should not be reachable (yet) for 32-bit.


> +}
> +
>   /*
>    * Local variables:
>    * mode: C
> diff --git a/xen/arch/arm/arm64/traps.c b/xen/arch/arm/arm64/traps.c
> index 9113a15c7a..a6766689b3 100644
> --- a/xen/arch/arm/arm64/traps.c
> +++ b/xen/arch/arm/arm64/traps.c
> @@ -23,6 +23,7 @@
>   #include <asm/processor.h>
>   
>   #include <public/xen.h>
> +#include <xen/sched.h>

The headers should ordered so <xen/*.h> are first, then <asm/*.h>, then 
<public/*.h>. They should then be ordered alphabetically within each of 
the category.

So, this new header should be included right after <xen/lib.h>

[...]

> diff --git a/xen/arch/arm/include/asm/mmio.h b/xen/arch/arm/include/asm/mmio.h
> index 3354d9c635..745130b7fe 100644
> --- a/xen/arch/arm/include/asm/mmio.h
> +++ b/xen/arch/arm/include/asm/mmio.h
> @@ -26,12 +26,22 @@
>   
>   #define MAX_IO_HANDLER  16
>   
> +enum instr_decode_state
> +{
> +    INSTR_ERROR,                    /* Error encountered while decoding instr */
> +    INSTR_VALID,                    /* ISS is valid, so no need to decode */
> +    INSTR_LDR_STR_POSTINDEXING,     /* Instruction is decoded successfully.
> +                                       It is ldr/str post indexing */

Coding style: multiple-line comments for Xen should be:

/*
  * ...
  * ...
  */

In this case, I would simply move the comment on top.

[...]

> diff --git a/xen/arch/arm/io.c b/xen/arch/arm/io.c
> index a289d393f9..203466b869 100644
> --- a/xen/arch/arm/io.c
> +++ b/xen/arch/arm/io.c
> @@ -95,57 +95,87 @@ static const struct mmio_handler *find_mmio_handler(struct domain *d,
>       return handler;
>   }
>   
> +void try_decode_instruction(const struct cpu_user_regs *regs,
> +                            mmio_info_t *info)
> +{
> +    int rc;
> +
> +    /*
> +     * Erratum 766422: Thumb store translation fault to Hypervisor may
> +     * not have correct HSR Rt value.
> +     */
> +    if ( check_workaround_766422() && (regs->cpsr & PSR_THUMB) &&
> +         info->dabt.write )
> +    {
> +        rc = decode_instruction(regs, info);
> +        if ( rc )
> +        {
> +            gprintk(XENLOG_DEBUG, "Unable to decode instruction\n");
> +            info->dabt_instr.state = INSTR_ERROR;
> +            return;
> +        }
> +    }

At the moment, the errata would only be handled when the ISS is valid. 
Now, you are moving it before we know if it is valid. Can you explain why?

[...]

>   enum io_state try_handle_mmio(struct cpu_user_regs *regs,
> -                              const union hsr hsr,
> -                              paddr_t gpa)
> +                              mmio_info_t *info)
>   {
>       struct vcpu *v = current;
>       const struct mmio_handler *handler = NULL;
> -    const struct hsr_dabt dabt = hsr.dabt;
> -    mmio_info_t info = {
> -        .gpa = gpa,
> -        .dabt = dabt
> -    };
> +    int rc;
>   
> -    ASSERT(hsr.ec == HSR_EC_DATA_ABORT_LOWER_EL);
> +    ASSERT(info->dabt.ec == HSR_EC_DATA_ABORT_LOWER_EL);
>   
> -    handler = find_mmio_handler(v->domain, info.gpa);
> +    handler = find_mmio_handler(v->domain, info->gpa);
>       if ( !handler )
>       {
> -        int rc;
> -
> -        rc = try_fwd_ioserv(regs, v, &info);
> +        rc = try_fwd_ioserv(regs, v, info);
>           if ( rc == IO_HANDLED )
>               return handle_ioserv(regs, v);
>   
>           return rc;
>       }
>   
> -    /* All the instructions used on emulated MMIO region should be valid */
> -    if ( !dabt.valid )
> -        return IO_ABORT;
> -

AFAIU, the assumption is now try_handle_mmio() and try_fwd_ioserv() will 
always be called when dabt.valid == 1. I think it would still be good to 
check that assumption.

So I would move the check at the beginning of try_handle_mmio() and add 
an ASSERT_UNREACHABLE in the if(). Something like:

if ( !dabt.valid )
{
     ASSERT_UNREACHABLE();
     return IO_ABORT;
}

>       /*
> -     * Erratum 766422: Thumb store translation fault to Hypervisor may
> -     * not have correct HSR Rt value.
> +     * At this point, we know that the instruction is either valid or has been
> +     * decoded successfully. Thus, Xen should be allowed to execute the
> +     * instruction on the emulated MMIO region.
>        */
> -    if ( check_workaround_766422() && (regs->cpsr & PSR_THUMB) &&
> -         dabt.write )
> -    {
> -        int rc;
> -
> -        rc = decode_instruction(regs, &info);
> -        if ( rc )
> -        {
> -            gprintk(XENLOG_DEBUG, "Unable to decode instruction\n");
> -            return IO_ABORT;
> -        }
> -    }
> -
> -    if ( info.dabt.write )
> -        return handle_write(handler, v, &info);
> +    if ( info->dabt.write )
> +        rc = handle_write(handler, v, info);
>       else
> -        return handle_read(handler, v, &info);
> +        rc = handle_read(handler, v, info);
> +
> +    return rc;

It looks like there are some left-over of the previous approach. It is 
fine to return directly from each branch.


>   }
>   
>   void register_mmio_handler(struct domain *d,
> diff --git a/xen/arch/arm/ioreq.c b/xen/arch/arm/ioreq.c
> index 308650b400..3c0a935ccf 100644
> --- a/xen/arch/arm/ioreq.c
> +++ b/xen/arch/arm/ioreq.c
> @@ -47,6 +47,7 @@ enum io_state try_fwd_ioserv(struct cpu_user_regs *regs,
>                                struct vcpu *v, mmio_info_t *info)
>   {
>       struct vcpu_io *vio = &v->io;
> +    struct dabt_instr instr = info->dabt_instr;
>       ioreq_t p = {
>           .type = IOREQ_TYPE_COPY,
>           .addr = info->gpa,
> @@ -76,10 +77,8 @@ enum io_state try_fwd_ioserv(struct cpu_user_regs *regs,
>       if ( !s )
>           return IO_UNHANDLED;
>   
> -    if ( !info->dabt.valid )
> -        return IO_ABORT;
> -

For this one, I would switch to ASSERT(dabt.valid);

>       vio->req = p;
> +    vio->info.dabt_instr = instr;
>   
>       rc = ioreq_send(s, &p, 0);
>       if ( rc != IO_RETRY || v->domain->is_shutting_down )
> @@ -95,6 +94,7 @@ enum io_state try_fwd_ioserv(struct cpu_user_regs *regs,
>   bool arch_ioreq_complete_mmio(void)
>   {
>       struct vcpu *v = current;
> +    struct instr_details dabt_instr = v->io.info.dabt_instr;
>       struct cpu_user_regs *regs = guest_cpu_user_regs();
>       const union hsr hsr = { .bits = regs->hsr };
>   
> @@ -106,6 +106,7 @@ bool arch_ioreq_complete_mmio(void)
>   
>       if ( handle_ioserv(regs, v) == IO_HANDLED )
>       {
> +        post_increment_register(&dabt_instr);
>           advance_pc(regs, hsr);
>           return true;
>       }
> diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
> index 9339d12f58..455e51cdbe 100644
> --- a/xen/arch/arm/traps.c
> +++ b/xen/arch/arm/traps.c
> @@ -1893,6 +1893,21 @@ static bool try_map_mmio(gfn_t gfn)
>       return !map_regions_p2mt(d, gfn, 1, mfn, p2m_mmio_direct_c);
>   }
>   
> +static inline bool check_p2m(bool is_data, paddr_t gpa)
> +{
> +    /*
> +     * First check if the translation fault can be resolved by the P2M subsystem.
> +     * If that's the case nothing else to do.
> +     */
> +    if ( p2m_resolve_translation_fault(current->domain,gaddr_to_gfn(gpa)) )

Coding style: missing space before and after the comma.

> +        return true;
> +
> +    if ( is_data && try_map_mmio(gaddr_to_gfn(gpa)) )
> +        return true;
> +
> +    return false;
> +}
> +
>   static void do_trap_stage2_abort_guest(struct cpu_user_regs *regs,
>                                          const union hsr hsr)
>   {
> @@ -1906,6 +1921,7 @@ static void do_trap_stage2_abort_guest(struct cpu_user_regs *regs,
>       paddr_t gpa;
>       uint8_t fsc = xabt.fsc & ~FSC_LL_MASK;
>       bool is_data = (hsr.ec == HSR_EC_DATA_ABORT_LOWER_EL);
> +    mmio_info_t info;
>   
>       /*
>        * If this bit has been set, it means that this stage-2 abort is caused
> @@ -1959,6 +1975,25 @@ static void do_trap_stage2_abort_guest(struct cpu_user_regs *regs,
>           return;
>       }
>       case FSC_FLT_TRANS:
> +    {
> +        info.gpa = gpa;
> +        info.dabt = hsr.dabt;
> +
> +        /* Check that the ISS is invalid and it is not data abort. */

This comment looks a bit pointless. You are writing literally what the 
check is doing. But you don't really explain why. I think you want to 
move some of the commint with the if here.

However,...

> +        if ( !hsr.dabt.valid && !is_data )

... this code can be reached by Instruction Abort and Data Abort. So you 
can't use hsr.dabt. Instead, you should use xabt (or check is_data first).

If you use xabt, you will notice that the 'valid' bit is not existent
because the instruction syndrome only exists for data abort.

But then, I don't understand why this is only restricted to instruction 
abort. As I wrote in the previous versions and on IRC, there are valid 
use cases to trap a data abort with invalid syndrome. Below...


> +        {
> +
> +            /*
> +             * Assumption :- Most of the times when we get a translation fault
> +             * and the ISS is invalid, the underlying cause is that the page
> +             * tables have not been set up correctly.
> +             */
> +            if ( check_p2m(is_data, gpa) )
> +                return;
> +            else
> +                goto inject_abt;
> +        }
> +
>           /*
>            * Attempt first to emulate the MMIO as the data abort will
>            * likely happen in an emulated region.
> @@ -1967,13 +2002,45 @@ static void do_trap_stage2_abort_guest(struct cpu_user_regs *regs,
>            */
>           if ( is_data )
>           {
> -            enum io_state state = try_handle_mmio(regs, hsr, gpa);
> +            enum io_state state;
> +
> +            try_decode_instruction(regs, &info);
> +
> +            /*
> +             * If Xen could not decode the instruction for any reason, then it
> +             * should ask the caller to abort the guest.
> +             */
> +            if ( info.dabt_instr.state == INSTR_ERROR )
> +                goto inject_abt;

... this will inject a data abort to the guest when we can't decode. 
This is not what we want. We should check whether this is a P2M 
translation fault or we need to map an MMIO region.

In pseudo-code, this would look like:

if ( !is_data || hsr.dabt.valid )
{
     if ( check_p2m() )
       return;


     if ( !is_data )
        goto inject_dabt;

     decode_instruction();
     if ( !dabt.invalid )
       goto inject_dabt;
}

try_handle_mmio();

if ( instruction was not decoded )
   check_p2m();

Cheers,
Ayan Kumar Halder Feb. 21, 2022, 5:05 p.m. UTC | #2
Hi Julien,

Apprecciate your continuous help on this. Most of your comments make 
sense, but I need few clarifications as below :-

On 13/02/2022 12:19, Julien Grall wrote:
> Hi,
>
> On 12/02/2022 23:34, Ayan Kumar Halder wrote:
>>
>>   xen/arch/arm/arm32/traps.c        |  7 +++
>>   xen/arch/arm/arm64/traps.c        | 47 +++++++++++++++
>>   xen/arch/arm/decode.c             |  1 +
>>   xen/arch/arm/include/asm/domain.h |  4 ++
>>   xen/arch/arm/include/asm/mmio.h   | 15 ++++-
>>   xen/arch/arm/include/asm/traps.h  |  2 +
>>   xen/arch/arm/io.c                 | 98 ++++++++++++++++++++-----------
>>   xen/arch/arm/ioreq.c              |  7 ++-
>>   xen/arch/arm/traps.c              | 80 +++++++++++++++++++++----
>>   xen/arch/x86/include/asm/ioreq.h  |  3 +
>
> This change technically needs an ack from the x86 maintainers. And...
>
>>   xen/include/xen/sched.h           |  2 +
>
> this one for anyone from THE REST (Stefano and I are part of it). 
> Please use scripts/add_maintainers.pl to automatically add the 
> relevant maintainers in CC.
>
>>   11 files changed, 217 insertions(+), 49 deletions(-)
>>
>> diff --git a/xen/arch/arm/arm32/traps.c b/xen/arch/arm/arm32/traps.c
>> index 9c9790a6d1..70c6238196 100644
>> --- a/xen/arch/arm/arm32/traps.c
>> +++ b/xen/arch/arm/arm32/traps.c
>> @@ -18,9 +18,11 @@
>>     #include <xen/lib.h>
>>   #include <xen/kernel.h>
>> +#include <xen/sched.h>
>>     #include <public/xen.h>
>>   +#include <asm/mmio.h>
>>   #include <asm/processor.h>
>>   #include <asm/traps.h>
>>   @@ -82,6 +84,11 @@ void do_trap_data_abort(struct cpu_user_regs *regs)
>>           do_unexpected_trap("Data Abort", regs);
>>   }
>>   +void post_increment_register(const struct instr_details *instr)
>> +{
>> +    domain_crash(current->domain);
>
>
> Please add a comment explaning why this is resulting to a domain 
> crash. AFAICT, this is because this should not be reachable (yet) for 
> 32-bit.
>
>
>> +}
>> +
>>   /*
>>    * Local variables:
>>    * mode: C
>> diff --git a/xen/arch/arm/arm64/traps.c b/xen/arch/arm/arm64/traps.c
>> index 9113a15c7a..a6766689b3 100644
>> --- a/xen/arch/arm/arm64/traps.c
>> +++ b/xen/arch/arm/arm64/traps.c
>> @@ -23,6 +23,7 @@
>>   #include <asm/processor.h>
>>     #include <public/xen.h>
>> +#include <xen/sched.h>
>
> The headers should ordered so <xen/*.h> are first, then <asm/*.h>, 
> then <public/*.h>. They should then be ordered alphabetically within 
> each of the category.
>
> So, this new header should be included right after <xen/lib.h>
>
> [...]
>
>> diff --git a/xen/arch/arm/include/asm/mmio.h 
>> b/xen/arch/arm/include/asm/mmio.h
>> index 3354d9c635..745130b7fe 100644
>> --- a/xen/arch/arm/include/asm/mmio.h
>> +++ b/xen/arch/arm/include/asm/mmio.h
>> @@ -26,12 +26,22 @@
>>     #define MAX_IO_HANDLER  16
>>   +enum instr_decode_state
>> +{
>> +    INSTR_ERROR,                    /* Error encountered while 
>> decoding instr */
>> +    INSTR_VALID,                    /* ISS is valid, so no need to 
>> decode */
>> +    INSTR_LDR_STR_POSTINDEXING,     /* Instruction is decoded 
>> successfully.
>> +                                       It is ldr/str post indexing */
>
> Coding style: multiple-line comments for Xen should be:
>
> /*
>  * ...
>  * ...
>  */
>
> In this case, I would simply move the comment on top.
>
> [...]
>
>> diff --git a/xen/arch/arm/io.c b/xen/arch/arm/io.c
>> index a289d393f9..203466b869 100644
>> --- a/xen/arch/arm/io.c
>> +++ b/xen/arch/arm/io.c
>> @@ -95,57 +95,87 @@ static const struct mmio_handler 
>> *find_mmio_handler(struct domain *d,
>>       return handler;
>>   }
>>   +void try_decode_instruction(const struct cpu_user_regs *regs,
>> +                            mmio_info_t *info)
>> +{
>> +    int rc;
>> +
>> +    /*
>> +     * Erratum 766422: Thumb store translation fault to Hypervisor may
>> +     * not have correct HSR Rt value.
>> +     */
>> +    if ( check_workaround_766422() && (regs->cpsr & PSR_THUMB) &&
>> +         info->dabt.write )
>> +    {
>> +        rc = decode_instruction(regs, info);
>> +        if ( rc )
>> +        {
>> +            gprintk(XENLOG_DEBUG, "Unable to decode instruction\n");
>> +            info->dabt_instr.state = INSTR_ERROR;
>> +            return;
>> +        }
>> +    }
>
> At the moment, the errata would only be handled when the ISS is valid. 
> Now, you are moving it before we know if it is valid. Can you explain 
> why?
>
> [...]
>
>>   enum io_state try_handle_mmio(struct cpu_user_regs *regs,
>> -                              const union hsr hsr,
>> -                              paddr_t gpa)
>> +                              mmio_info_t *info)
>>   {
>>       struct vcpu *v = current;
>>       const struct mmio_handler *handler = NULL;
>> -    const struct hsr_dabt dabt = hsr.dabt;
>> -    mmio_info_t info = {
>> -        .gpa = gpa,
>> -        .dabt = dabt
>> -    };
>> +    int rc;
>>   -    ASSERT(hsr.ec == HSR_EC_DATA_ABORT_LOWER_EL);
>> +    ASSERT(info->dabt.ec == HSR_EC_DATA_ABORT_LOWER_EL);
>>   -    handler = find_mmio_handler(v->domain, info.gpa);
>> +    handler = find_mmio_handler(v->domain, info->gpa);
>>       if ( !handler )
>>       {
>> -        int rc;
>> -
>> -        rc = try_fwd_ioserv(regs, v, &info);
>> +        rc = try_fwd_ioserv(regs, v, info);
>>           if ( rc == IO_HANDLED )
>>               return handle_ioserv(regs, v);
>>             return rc;
>>       }
>>   -    /* All the instructions used on emulated MMIO region should be 
>> valid */
>> -    if ( !dabt.valid )
>> -        return IO_ABORT;
>> -
>
> AFAIU, the assumption is now try_handle_mmio() and try_fwd_ioserv() 
> will always be called when dabt.valid == 1. I think it would still be 
> good to check that assumption.
>
> So I would move the check at the beginning of try_handle_mmio() and 
> add an ASSERT_UNREACHABLE in the if(). Something like:
>
> if ( !dabt.valid )
> {
>     ASSERT_UNREACHABLE();
>     return IO_ABORT;
> }
>
>>       /*
>> -     * Erratum 766422: Thumb store translation fault to Hypervisor may
>> -     * not have correct HSR Rt value.
>> +     * At this point, we know that the instruction is either valid 
>> or has been
>> +     * decoded successfully. Thus, Xen should be allowed to execute the
>> +     * instruction on the emulated MMIO region.
>>        */
>> -    if ( check_workaround_766422() && (regs->cpsr & PSR_THUMB) &&
>> -         dabt.write )
>> -    {
>> -        int rc;
>> -
>> -        rc = decode_instruction(regs, &info);
>> -        if ( rc )
>> -        {
>> -            gprintk(XENLOG_DEBUG, "Unable to decode instruction\n");
>> -            return IO_ABORT;
>> -        }
>> -    }
>> -
>> -    if ( info.dabt.write )
>> -        return handle_write(handler, v, &info);
>> +    if ( info->dabt.write )
>> +        rc = handle_write(handler, v, info);
>>       else
>> -        return handle_read(handler, v, &info);
>> +        rc = handle_read(handler, v, info);
>> +
>> +    return rc;
>
> It looks like there are some left-over of the previous approach. It is 
> fine to return directly from each branch.
>
>
>>   }
>>     void register_mmio_handler(struct domain *d,
>> diff --git a/xen/arch/arm/ioreq.c b/xen/arch/arm/ioreq.c
>> index 308650b400..3c0a935ccf 100644
>> --- a/xen/arch/arm/ioreq.c
>> +++ b/xen/arch/arm/ioreq.c
>> @@ -47,6 +47,7 @@ enum io_state try_fwd_ioserv(struct cpu_user_regs 
>> *regs,
>>                                struct vcpu *v, mmio_info_t *info)
>>   {
>>       struct vcpu_io *vio = &v->io;
>> +    struct dabt_instr instr = info->dabt_instr;
>>       ioreq_t p = {
>>           .type = IOREQ_TYPE_COPY,
>>           .addr = info->gpa,
>> @@ -76,10 +77,8 @@ enum io_state try_fwd_ioserv(struct cpu_user_regs 
>> *regs,
>>       if ( !s )
>>           return IO_UNHANDLED;
>>   -    if ( !info->dabt.valid )
>> -        return IO_ABORT;
>> -
>
> For this one, I would switch to ASSERT(dabt.valid);
I see that try_fwd_ioserv() is invoked from try_handle_mmio() only. 
Thus, if I follow your suggestion of adding a check for dabt.valid at 
the beginning of try_handle_mmio(), then this ASSERT() is not required. 
Let me know if you agree ?
>
>>       vio->req = p;
>> +    vio->info.dabt_instr = instr;
>>         rc = ioreq_send(s, &p, 0);
>>       if ( rc != IO_RETRY || v->domain->is_shutting_down )
>> @@ -95,6 +94,7 @@ enum io_state try_fwd_ioserv(struct cpu_user_regs 
>> *regs,
>>   bool arch_ioreq_complete_mmio(void)
>>   {
>>       struct vcpu *v = current;
>> +    struct instr_details dabt_instr = v->io.info.dabt_instr;
>>       struct cpu_user_regs *regs = guest_cpu_user_regs();
>>       const union hsr hsr = { .bits = regs->hsr };
>>   @@ -106,6 +106,7 @@ bool arch_ioreq_complete_mmio(void)
>>         if ( handle_ioserv(regs, v) == IO_HANDLED )
>>       {
>> +        post_increment_register(&dabt_instr);
>>           advance_pc(regs, hsr);
>>           return true;
>>       }
>> diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
>> index 9339d12f58..455e51cdbe 100644
>> --- a/xen/arch/arm/traps.c
>> +++ b/xen/arch/arm/traps.c
>> @@ -1893,6 +1893,21 @@ static bool try_map_mmio(gfn_t gfn)
>>       return !map_regions_p2mt(d, gfn, 1, mfn, p2m_mmio_direct_c);
>>   }
>>   +static inline bool check_p2m(bool is_data, paddr_t gpa)
>> +{
>> +    /*
>> +     * First check if the translation fault can be resolved by the 
>> P2M subsystem.
>> +     * If that's the case nothing else to do.
>> +     */
>> +    if ( 
>> p2m_resolve_translation_fault(current->domain,gaddr_to_gfn(gpa)) )
>
> Coding style: missing space before and after the comma.
>
>> +        return true;
>> +
>> +    if ( is_data && try_map_mmio(gaddr_to_gfn(gpa)) )
>> +        return true;
>> +
>> +    return false;
>> +}
>> +
>>   static void do_trap_stage2_abort_guest(struct cpu_user_regs *regs,
>>                                          const union hsr hsr)
>>   {
>> @@ -1906,6 +1921,7 @@ static void do_trap_stage2_abort_guest(struct 
>> cpu_user_regs *regs,
>>       paddr_t gpa;
>>       uint8_t fsc = xabt.fsc & ~FSC_LL_MASK;
>>       bool is_data = (hsr.ec == HSR_EC_DATA_ABORT_LOWER_EL);
>> +    mmio_info_t info;
>>         /*
>>        * If this bit has been set, it means that this stage-2 abort 
>> is caused
>> @@ -1959,6 +1975,25 @@ static void do_trap_stage2_abort_guest(struct 
>> cpu_user_regs *regs,
>>           return;
>>       }
>>       case FSC_FLT_TRANS:
>> +    {
>> +        info.gpa = gpa;
>> +        info.dabt = hsr.dabt;
>> +
>> +        /* Check that the ISS is invalid and it is not data abort. */
>
> This comment looks a bit pointless. You are writing literally what the 
> check is doing. But you don't really explain why. I think you want to 
> move some of the commint with the if here.
>
> However,...
>
>> +        if ( !hsr.dabt.valid && !is_data )
>
> ... this code can be reached by Instruction Abort and Data Abort. So 
> you can't use hsr.dabt. Instead, you should use xabt (or check is_data 
> first).
>
> If you use xabt, you will notice that the 'valid' bit is not existent
> because the instruction syndrome only exists for data abort.
>
> But then, I don't understand why this is only restricted to 
> instruction abort. As I wrote in the previous versions and on IRC, 
> there are valid use cases to trap a data abort with invalid syndrome. 
> Below...
>
>
>> +        {
>> +
>> +            /*
>> +             * Assumption :- Most of the times when we get a 
>> translation fault
>> +             * and the ISS is invalid, the underlying cause is that 
>> the page
>> +             * tables have not been set up correctly.
>> +             */
>> +            if ( check_p2m(is_data, gpa) )
>> +                return;
>> +            else
>> +                goto inject_abt;
>> +        }
>> +
>>           /*
>>            * Attempt first to emulate the MMIO as the data abort will
>>            * likely happen in an emulated region.
>> @@ -1967,13 +2002,45 @@ static void do_trap_stage2_abort_guest(struct 
>> cpu_user_regs *regs,
>>            */
>>           if ( is_data )
>>           {
>> -            enum io_state state = try_handle_mmio(regs, hsr, gpa);
>> +            enum io_state state;
>> +
>> +            try_decode_instruction(regs, &info);
>> +
>> +            /*
>> +             * If Xen could not decode the instruction for any 
>> reason, then it
>> +             * should ask the caller to abort the guest.
>> +             */
>> +            if ( info.dabt_instr.state == INSTR_ERROR )
>> +                goto inject_abt;
>
> ... this will inject a data abort to the guest when we can't decode. 
> This is not what we want. We should check whether this is a P2M 
> translation fault or we need to map an MMIO region.
>
> In pseudo-code, this would look like:
>
> if ( !is_data || hsr.dabt.valid )

I think you mean if ( !is_data || !hsr.dabt.valid )

The reason being if there is an instruction abort or a data abort (with 
ISV == 0), then it should try to configure the page tables.

> {
>     if ( check_p2m() )
>       return;
>
>
>     if ( !is_data )
>        goto inject_dabt;
>
>     decode_instruction();
>     if ( !dabt.invalid )
>       goto inject_dabt;
> }
>
> try_handle_mmio();
>
> if ( instruction was not decoded )
>   check_p2m();

If the instruction was not decoded, then there is no need to configure 
the page tables again. We have already done this before. So, it should 
be safe to abort the guest ?


So my understanding is as follows :-

         /* Check that it is instruction abort or ISS is invalid. */
         if ( !is_data || !info.dabt.valid )
         {
             /*
              * If the instruction was trapped due to access to stage 1 
translation
              * then Xen should try to resolve the page table entry for 
the stage 1
              * translation table with the assumption that the page 
tables are
              * present in the non MMIO region. If it is successful, 
then it should
              * ask the guest to retry the instruction.
              */
             if ( is_data && info.dabt.s1ptw )
             {
                 info.dabt_instr.state = INSTR_RETRY;
                 /* The translation tables are assumed to be in non MMIO 
region. */
                 is_data = false;
             }

             /*
              * Assumption :- Most of the times when we get a 
translation fault
              * and the ISS is invalid, the underlying cause is that the 
page
              * tables have not been set up correctly.
              */
             if ( check_p2m(is_data, gpa) )
                 return;

             /*
              * If the instruction abort or the data abort due to access 
to stage 1
              * translation tables could not be resolved by setting the 
appropriate
              * bits in the translation table, then Xen should abort the 
guest.
              */
             if ( !is_data || (info.dabt_instr.state == INSTR_RETRY) )
                 goto inject_abt;

             try_decode_instruction(regs, &info);

             /* Instruction could not be decoded, then abort the guest */

             if ( info.dabt_instr.state == INSTR_ERROR)
                 goto inject_abt;
         }

         state = try_handle_mmio(regs, &info);

         switch ( state )
         {
             case IO_ABORT:
                 goto inject_abt;
             case IO_HANDLED:
                 /*
                  * If the instruction was decoded and has executed 
successfully
                  * on the MMIO region, then Xen should execute the next 
part of
                  * the instruction. (for eg increment the rn if it is a
                  * post-indexing instruction.
                  */
                 post_increment_register(&info.dabt_instr);
                 advance_pc(regs, hsr);
                 return;
             case IO_RETRY:
                 /* finish later */
                 return;
             case IO_UNHANDLED:
                 /* IO unhandled, try another way to handle it. */
                 break;
         }

         if ( check_p2m(is_data, gpa) )
             return;


Please letme know if I am misunderstanding something.

- Ayan

>
> Cheers,
>
Julien Grall Feb. 21, 2022, 5:57 p.m. UTC | #3
On 21/02/2022 17:05, Ayan Kumar Halder wrote:
> Hi Julien,

Hi,

> On 13/02/2022 12:19, Julien Grall wrote:
>>>   }
>>>     void register_mmio_handler(struct domain *d,
>>> diff --git a/xen/arch/arm/ioreq.c b/xen/arch/arm/ioreq.c
>>> index 308650b400..3c0a935ccf 100644
>>> --- a/xen/arch/arm/ioreq.c
>>> +++ b/xen/arch/arm/ioreq.c
>>> @@ -47,6 +47,7 @@ enum io_state try_fwd_ioserv(struct cpu_user_regs 
>>> *regs,
>>>                                struct vcpu *v, mmio_info_t *info)
>>>   {
>>>       struct vcpu_io *vio = &v->io;
>>> +    struct dabt_instr instr = info->dabt_instr;
>>>       ioreq_t p = {
>>>           .type = IOREQ_TYPE_COPY,
>>>           .addr = info->gpa,
>>> @@ -76,10 +77,8 @@ enum io_state try_fwd_ioserv(struct cpu_user_regs 
>>> *regs,
>>>       if ( !s )
>>>           return IO_UNHANDLED;
>>>   -    if ( !info->dabt.valid )
>>> -        return IO_ABORT;
>>> -
>>
>> For this one, I would switch to ASSERT(dabt.valid);
> I see that try_fwd_ioserv() is invoked from try_handle_mmio() only. 
> Thus, if I follow your suggestion of adding a check for dabt.valid at 
> the beginning of try_handle_mmio(), then this ASSERT() is not required.

I agree that try_handle_mmio() is the only caller today. But we don't 
know how this is going to be used tomorrow.

The goal of this ASSERT() is to catch those new users that would call it 
wrongly.

[...]

>> ... this will inject a data abort to the guest when we can't decode. 
>> This is not what we want. We should check whether this is a P2M 
>> translation fault or we need to map an MMIO region.
>>
>> In pseudo-code, this would look like:
>>
>> if ( !is_data || hsr.dabt.valid )
> 
> I think you mean if ( !is_data || !hsr.dabt.valid )

You are right.

> 
> The reason being if there is an instruction abort or a data abort (with 
> ISV == 0), then it should try to configure the page tables.
> 
>> {
>>     if ( check_p2m() )
>>       return;
>>
>>
>>     if ( !is_data )
>>        goto inject_dabt;
>>
>>     decode_instruction();
>>     if ( !dabt.invalid )
>>       goto inject_dabt;
>> }
>>
>> try_handle_mmio();
>>
>> if ( instruction was not decoded )
>>   check_p2m();
> 
> If the instruction was not decoded, then there is no need to configure 
> the page tables again. We have already done this before.

Hmmmm... I think there are confusing about which sort of decoding I was 
referring to. In this case, I mean if we didn't decode the instruction 
manully, then it is not necessary to call check_p2m().

Do you agree with that?

> So my understanding is as follows :-
> 
>          /* Check that it is instruction abort or ISS is invalid. */

I have had a remark on this line before. Please have a look and address it.

>          if ( !is_data || !info.dabt.valid )
>          {
>              /*
>               * If the instruction was trapped due to access to stage 1 
> translation
>               * then Xen should try to resolve the page table entry for 
> the stage 1
>               * translation table with the assumption that the page 
> tables are
>               * present in the non MMIO region. If it is successful, 
> then it should
>               * ask the guest to retry the instruction.
>               */

I agree that we want to skip the MMIO mapping when s1ptw == 1. However, 
I am not sure this belongs to this patch because this is technically 
already a bug.

>              if ( is_data && info.dabt.s1ptw )
>              {
>                  info.dabt_instr.state = INSTR_RETRY;
>                  /* The translation tables are assumed to be in non MMIO 
> region. */
>                  is_data = false;

is_data is also used to decide which sort of abort we want to send to 
the guest (see after inject_dabt). So I don't think we could force set 
is_data here.

Instead, I would define a new local variable (maybe mmio_access_allowed) 
that will be set for instruction abort or when s1ptw is 1.

>              }
> 
>              /*
>               * Assumption :- Most of the times when we get a 
> translation fault
>               * and the ISS is invalid, the underlying cause is that the 
> page
>               * tables have not been set up correctly.
>               */

I think this comment make more sense on top of "if !is_data || 
!info.dabt.valid".

>              if ( check_p2m(is_data, gpa) )
>                  return;
> 
>              /*
>               * If the instruction abort or the data abort due to access 
> to stage 1
>               * translation tables could not be resolved by setting the 
> appropriate
>               * bits in the translation table, then Xen should abort the 
> guest.

IHMO, "abort the guest" means we are going to crash the guest. However, 
this not the case here. We are telling the guest that we couldn't handle 
the data/instruction request. It is up to the guest to decide whether it 
should panic or handle gracefully the error.

We should also avoid the term guest because it usually only refers to 
any domain but dom0.

Therefore, I would reword it to something like "Xen will forward the 
data/instruction abort to the domain".

>               */
>              if ( !is_data || (info.dabt_instr.state == INSTR_RETRY) )

The second part looks unnecessary.

>                  goto inject_abt;
> 
>              try_decode_instruction(regs, &info);
> 
>              /* Instruction could not be decoded, then abort the guest */
> 
>              if ( info.dabt_instr.state == INSTR_ERROR)
>                  goto inject_abt;
>          }
> 
>          state = try_handle_mmio(regs, &info);
> 
>          switch ( state )
>          {
>              case IO_ABORT:
>                  goto inject_abt;
>              case IO_HANDLED:
>                  /*
>                   * If the instruction was decoded and has executed 
> successfully
>                   * on the MMIO region, then Xen should execute the next 
> part of
>                   * the instruction. (for eg increment the rn if it is a
>                   * post-indexing instruction.
>                   */
>                  post_increment_register(&info.dabt_instr);
>                  advance_pc(regs, hsr);
>                  return;
>              case IO_RETRY:
>                  /* finish later */
>                  return;
>              case IO_UNHANDLED:
>                  /* IO unhandled, try another way to handle it. */
>                  break;
>          }
> 
>          if ( check_p2m(is_data, gpa) )

It is unnecessary to call check_p2M() if we manually decoded the 
instruction (see above why).

Cheers,
Ayan Kumar Halder Feb. 21, 2022, 7:05 p.m. UTC | #4
On 21/02/2022 17:57, Julien Grall wrote:
>
>
> On 21/02/2022 17:05, Ayan Kumar Halder wrote:
>> Hi Julien,
>
> Hi,
Hi Julien,
>
>> On 13/02/2022 12:19, Julien Grall wrote:
>>>>   }
>>>>     void register_mmio_handler(struct domain *d,
>>>> diff --git a/xen/arch/arm/ioreq.c b/xen/arch/arm/ioreq.c
>>>> index 308650b400..3c0a935ccf 100644
>>>> --- a/xen/arch/arm/ioreq.c
>>>> +++ b/xen/arch/arm/ioreq.c
>>>> @@ -47,6 +47,7 @@ enum io_state try_fwd_ioserv(struct cpu_user_regs 
>>>> *regs,
>>>>                                struct vcpu *v, mmio_info_t *info)
>>>>   {
>>>>       struct vcpu_io *vio = &v->io;
>>>> +    struct dabt_instr instr = info->dabt_instr;
>>>>       ioreq_t p = {
>>>>           .type = IOREQ_TYPE_COPY,
>>>>           .addr = info->gpa,
>>>> @@ -76,10 +77,8 @@ enum io_state try_fwd_ioserv(struct 
>>>> cpu_user_regs *regs,
>>>>       if ( !s )
>>>>           return IO_UNHANDLED;
>>>>   -    if ( !info->dabt.valid )
>>>> -        return IO_ABORT;
>>>> -
>>>
>>> For this one, I would switch to ASSERT(dabt.valid);
>> I see that try_fwd_ioserv() is invoked from try_handle_mmio() only. 
>> Thus, if I follow your suggestion of adding a check for dabt.valid at 
>> the beginning of try_handle_mmio(), then this ASSERT() is not required.
>
> I agree that try_handle_mmio() is the only caller today. But we don't 
> know how this is going to be used tomorrow.
>
> The goal of this ASSERT() is to catch those new users that would call 
> it wrongly.
>
> [...]
>
>>> ... this will inject a data abort to the guest when we can't decode. 
>>> This is not what we want. We should check whether this is a P2M 
>>> translation fault or we need to map an MMIO region.
>>>
>>> In pseudo-code, this would look like:
>>>
>>> if ( !is_data || hsr.dabt.valid )
>>
>> I think you mean if ( !is_data || !hsr.dabt.valid )
>
> You are right.
>
>>
>> The reason being if there is an instruction abort or a data abort 
>> (with ISV == 0), then it should try to configure the page tables.
>>
>>> {
>>>     if ( check_p2m() )
>>>       return;
>>>
>>>
>>>     if ( !is_data )
>>>        goto inject_dabt;
>>>
>>>     decode_instruction();
>>>     if ( !dabt.invalid )
>>>       goto inject_dabt;
>>> }
>>>
>>> try_handle_mmio();
>>>
>>> if ( instruction was not decoded )
>>>   check_p2m();
>>
>> If the instruction was not decoded, then there is no need to 
>> configure the page tables again. We have already done this before.
>
> Hmmmm... I think there are confusing about which sort of decoding I 
> was referring to. In this case, I mean if we didn't decode the 
> instruction manully, then it is not necessary to call check_p2m().
>
> Do you agree with that?

If we (ie Xen) didn't decode the instruction manually, then check_p2m() 
has not been invoked yet.  This is because of the following 
(info.dabt.valid == True) :-

         if ( !is_data || !info.dabt.valid )
         {

                 ...

                 if ( check_p2m(is_data, gpa) )
                     return;

                 ...

         }

So, in this scenario ( !info.dabt.valid), it would not be necessary to 
invoke check_p2m() after try_handle_mmio().

However, if we havenot decoded the instruction manually (ie 
info.dabt.valid == True), and try_handle_mmio() returns IO_UNHANDLED, 
then it will be necessary to invoke "check_p2m(is_data, gpa)"

- Ayan

>
>> So my understanding is as follows :-
>>
>>          /* Check that it is instruction abort or ISS is invalid. */
>
> I have had a remark on this line before. Please have a look and 
> address it.
>
>>          if ( !is_data || !info.dabt.valid )
>>          {
>>              /*
>>               * If the instruction was trapped due to access to stage 
>> 1 translation
>>               * then Xen should try to resolve the page table entry 
>> for the stage 1
>>               * translation table with the assumption that the page 
>> tables are
>>               * present in the non MMIO region. If it is successful, 
>> then it should
>>               * ask the guest to retry the instruction.
>>               */
>
> I agree that we want to skip the MMIO mapping when s1ptw == 1. 
> However, I am not sure this belongs to this patch because this is 
> technically already a bug.
>
>>              if ( is_data && info.dabt.s1ptw )
>>              {
>>                  info.dabt_instr.state = INSTR_RETRY;
>>                  /* The translation tables are assumed to be in non 
>> MMIO region. */
>>                  is_data = false;
>
> is_data is also used to decide which sort of abort we want to send to 
> the guest (see after inject_dabt). So I don't think we could force set 
> is_data here.
>
> Instead, I would define a new local variable (maybe 
> mmio_access_allowed) that will be set for instruction abort or when 
> s1ptw is 1.
>
>>              }
>>
>>              /*
>>               * Assumption :- Most of the times when we get a 
>> translation fault
>>               * and the ISS is invalid, the underlying cause is that 
>> the page
>>               * tables have not been set up correctly.
>>               */
>
> I think this comment make more sense on top of "if !is_data || 
> !info.dabt.valid".
>
>>              if ( check_p2m(is_data, gpa) )
>>                  return;
>>
>>              /*
>>               * If the instruction abort or the data abort due to 
>> access to stage 1
>>               * translation tables could not be resolved by setting 
>> the appropriate
>>               * bits in the translation table, then Xen should abort 
>> the guest.
>
> IHMO, "abort the guest" means we are going to crash the guest. 
> However, this not the case here. We are telling the guest that we 
> couldn't handle the data/instruction request. It is up to the guest to 
> decide whether it should panic or handle gracefully the error.
>
> We should also avoid the term guest because it usually only refers to 
> any domain but dom0.
>
> Therefore, I would reword it to something like "Xen will forward the 
> data/instruction abort to the domain".
>
>>               */
>>              if ( !is_data || (info.dabt_instr.state == INSTR_RETRY) )
>
> The second part looks unnecessary.
>
>>                  goto inject_abt;
>>
>>              try_decode_instruction(regs, &info);
>>
>>              /* Instruction could not be decoded, then abort the 
>> guest */
>>
>>              if ( info.dabt_instr.state == INSTR_ERROR)
>>                  goto inject_abt;
>>          }
>>
>>          state = try_handle_mmio(regs, &info);
>>
>>          switch ( state )
>>          {
>>              case IO_ABORT:
>>                  goto inject_abt;
>>              case IO_HANDLED:
>>                  /*
>>                   * If the instruction was decoded and has executed 
>> successfully
>>                   * on the MMIO region, then Xen should execute the 
>> next part of
>>                   * the instruction. (for eg increment the rn if it is a
>>                   * post-indexing instruction.
>>                   */
>>                  post_increment_register(&info.dabt_instr);
>>                  advance_pc(regs, hsr);
>>                  return;
>>              case IO_RETRY:
>>                  /* finish later */
>>                  return;
>>              case IO_UNHANDLED:
>>                  /* IO unhandled, try another way to handle it. */
>>                  break;
>>          }
>>
>>          if ( check_p2m(is_data, gpa) )
>
> It is unnecessary to call check_p2M() if we manually decoded the 
> instruction (see above why).
>
> Cheers,
>
Julien Grall Feb. 21, 2022, 7:13 p.m. UTC | #5
On 21/02/2022 19:05, Ayan Kumar Halder wrote:
> If we (ie Xen) didn't decode the instruction manually, then check_p2m() 
> has not been invoked yet.  This is because of the following 
> (info.dabt.valid == True) :-
> 
>          if ( !is_data || !info.dabt.valid )
>          {
> 
>                  ...
> 
>                  if ( check_p2m(is_data, gpa) )
>                      return;
> 
>                  ...
> 
>          }
> 
> So, in this scenario ( !info.dabt.valid), it would not be necessary to 
> invoke check_p2m() after try_handle_mmio().
> 
> However, if we havenot decoded the instruction manually (ie 
> info.dabt.valid == True), and try_handle_mmio() returns IO_UNHANDLED, 
> then it will be necessary to invoke "check_p2m(is_data, gpa)"

Hmmm you are right. But this doesn't seem to match the code you wrote 
below. What did I miss?

Cheers,
Ayan Kumar Halder Feb. 21, 2022, 9:10 p.m. UTC | #6
Hi Julien,

On 21/02/2022 19:13, Julien Grall wrote:
>
>
> On 21/02/2022 19:05, Ayan Kumar Halder wrote:
>> If we (ie Xen) didn't decode the instruction manually, then 
>> check_p2m() has not been invoked yet.  This is because of the 
>> following (info.dabt.valid == True) :-
>>
>>          if ( !is_data || !info.dabt.valid )
>>          {
>>
>>                  ...
>>
>>                  if ( check_p2m(is_data, gpa) )
>>                      return;
>>
>>                  ...
>>
>>          }
>>
>> So, in this scenario ( !info.dabt.valid), it would not be necessary 
>> to invoke check_p2m() after try_handle_mmio().
>>
>> However, if we havenot decoded the instruction manually (ie 
>> info.dabt.valid == True), and try_handle_mmio() returns IO_UNHANDLED, 
>> then it will be necessary to invoke "check_p2m(is_data, gpa)"
>
> Hmmm you are right. But this doesn't seem to match the code you wrote 
> below. What did I miss?

My code was not correct.  I have rectified it as below. Please let me 
know if it looks sane.

<snip>

     case FSC_FLT_TRANS:
     {
         info.gpa = gpa;
         info.dabt = hsr.dabt;

         /*
          * Assumption :- Most of the times when we get a data abort and 
the ISS
          * is invalid or an instruction abort, the underlying cause is 
that the
          * page tables have not been set up correctly.
          */
         if ( !is_data || !info.dabt.valid )
         {
             if ( check_p2m(is_data, gpa) )
                 return;

             /*
              * If the instruction abort could not be resolved by 
setting the
              * appropriate bits in the translation table, then Xen should
              * forward the abort to the guest.
              */
             if ( !is_data )
                 goto inject_abt;

             try_decode_instruction(regs, &info);

             /*
              * If Xen could not decode the instruction or encountered 
an error
              * while decoding, then it should forward the abort to the 
guest.
              */
             if ( info.dabt_instr.state == INSTR_ERROR )
                 goto inject_abt;
         }

         state = try_handle_mmio(regs, &info);

         switch ( state )
         {
             case IO_ABORT:
                 goto inject_abt;
             case IO_HANDLED:
                 /*
                  * If the instruction was decoded and has executed 
successfully
                  * on the MMIO region, then Xen should execute the next 
part of
                  * the instruction. (for eg increment the rn if it is a
                  * post-indexing instruction.
                  */
                 post_increment_register(&info.dabt_instr);
                 advance_pc(regs, hsr);
                 return;
             case IO_RETRY:
                 /* finish later */
                 return;
             case IO_UNHANDLED:
                 /* IO unhandled, try another way to handle it. */
                 break;
         }

         /*
          * If the instruction was valid but Xen could not emulate the 
instruction
          * then it should configure the page tables to set the correct 
page table
          * entry corresponding to the faulting address. If it was 
successful, it
          * should return to the guest to retry the instruction (hoping 
that the
          * instruction will not be trapped to Xen again).
          * However, if Xen was not successful in setting the page 
tables, then
          * it should forward the abort to the guest.
          */
         if ( info.dabt.valid && check_p2m(is_data, gpa) )
             return;

         break;
     }
     default:

<snip>

- Ayan

>
> Cheers,
>
Julien Grall Feb. 23, 2022, 7:19 p.m. UTC | #7
On 21/02/2022 21:10, Ayan Kumar Halder wrote:
> Hi Julien,

Hi,

> 
> On 21/02/2022 19:13, Julien Grall wrote:
>>
>>
>> On 21/02/2022 19:05, Ayan Kumar Halder wrote:
>>> If we (ie Xen) didn't decode the instruction manually, then 
>>> check_p2m() has not been invoked yet.  This is because of the 
>>> following (info.dabt.valid == True) :-
>>>
>>>          if ( !is_data || !info.dabt.valid )
>>>          {
>>>
>>>                  ...
>>>
>>>                  if ( check_p2m(is_data, gpa) )
>>>                      return;
>>>
>>>                  ...
>>>
>>>          }
>>>
>>> So, in this scenario ( !info.dabt.valid), it would not be necessary 
>>> to invoke check_p2m() after try_handle_mmio().
>>>
>>> However, if we havenot decoded the instruction manually (ie 
>>> info.dabt.valid == True), and try_handle_mmio() returns IO_UNHANDLED, 
>>> then it will be necessary to invoke "check_p2m(is_data, gpa)"
>>
>> Hmmm you are right. But this doesn't seem to match the code you wrote 
>> below. What did I miss?
> 
> My code was not correct.  I have rectified it as below. Please let me 
> know if it looks sane.

This looks good to me with one remark below.

> 
> <snip>
> 
>      case FSC_FLT_TRANS:
>      {
>          info.gpa = gpa;
>          info.dabt = hsr.dabt;
> 
>          /*
>           * Assumption :- Most of the times when we get a data abort and 
> the ISS
>           * is invalid or an instruction abort, the underlying cause is 
> that the
>           * page tables have not been set up correctly.
>           */
>          if ( !is_data || !info.dabt.valid )
>          {
>              if ( check_p2m(is_data, gpa) )
>                  return;
> 
>              /*
>               * If the instruction abort could not be resolved by 
> setting the
>               * appropriate bits in the translation table, then Xen should
>               * forward the abort to the guest.
>               */
>              if ( !is_data )
>                  goto inject_abt;
> 
>              try_decode_instruction(regs, &info);
> 
>              /*
>               * If Xen could not decode the instruction or encountered 
> an error
>               * while decoding, then it should forward the abort to the 
> guest.
>               */
>              if ( info.dabt_instr.state == INSTR_ERROR )
>                  goto inject_abt;
>          }
> 
>          state = try_handle_mmio(regs, &info);
> 
>          switch ( state )
>          {
>              case IO_ABORT:
>                  goto inject_abt;
>              case IO_HANDLED:
>                  /*
>                   * If the instruction was decoded and has executed 
> successfully
>                   * on the MMIO region, then Xen should execute the next 
> part of
>                   * the instruction. (for eg increment the rn if it is a
>                   * post-indexing instruction.
>                   */
>                  post_increment_register(&info.dabt_instr);
>                  advance_pc(regs, hsr);
>                  return;
>              case IO_RETRY:
>                  /* finish later */
>                  return;
>              case IO_UNHANDLED:
>                  /* IO unhandled, try another way to handle it. */
>                  break;
>          }
> 
>          /*
>           * If the instruction was valid but Xen could not emulate the 
> instruction
>           * then it should configure the page tables to set the correct 
> page table
>           * entry corresponding to the faulting address. If it was 
> successful, it
>           * should return to the guest to retry the instruction (hoping 
> that the
>           * instruction will not be trapped to Xen again).
>           * However, if Xen was not successful in setting the page 
> tables, then
>           * it should forward the abort to the guest.
>           */

I would shorten to:

If the instruction syndrome was invalid, then we already checked if this 
was due to a P2M fault. So no point to check again as the result will be 
the same.

Cheers,
diff mbox series

Patch

diff --git a/xen/arch/arm/arm32/traps.c b/xen/arch/arm/arm32/traps.c
index 9c9790a6d1..70c6238196 100644
--- a/xen/arch/arm/arm32/traps.c
+++ b/xen/arch/arm/arm32/traps.c
@@ -18,9 +18,11 @@ 
 
 #include <xen/lib.h>
 #include <xen/kernel.h>
+#include <xen/sched.h>
 
 #include <public/xen.h>
 
+#include <asm/mmio.h>
 #include <asm/processor.h>
 #include <asm/traps.h>
 
@@ -82,6 +84,11 @@  void do_trap_data_abort(struct cpu_user_regs *regs)
         do_unexpected_trap("Data Abort", regs);
 }
 
+void post_increment_register(const struct instr_details *instr)
+{
+    domain_crash(current->domain);
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/arm/arm64/traps.c b/xen/arch/arm/arm64/traps.c
index 9113a15c7a..a6766689b3 100644
--- a/xen/arch/arm/arm64/traps.c
+++ b/xen/arch/arm/arm64/traps.c
@@ -23,6 +23,7 @@ 
 #include <asm/processor.h>
 
 #include <public/xen.h>
+#include <xen/sched.h>
 
 static const char *handler[]= {
         "Synchronous Abort",
@@ -44,6 +45,52 @@  void do_bad_mode(struct cpu_user_regs *regs, int reason)
     panic("bad mode\n");
 }
 
+void post_increment_register(const struct instr_details *instr)
+{
+    struct cpu_user_regs *regs = guest_cpu_user_regs();
+    register_t val = 0;
+
+    /* Currently, we handle only ldr/str post indexing instructions */
+    if ( instr->state != INSTR_LDR_STR_POSTINDEXING )
+        return;
+
+    /*
+     * Handle when rn = SP
+     * Refer ArmV8 ARM DDI 0487G.b, Page - D1-2463 "Stack pointer register
+     * selection"
+     * t = SP_EL0
+     * h = SP_ELx
+     * and M[3:0] (Page - C5-474 "When exception taken from AArch64 state:")
+     */
+    if (instr->rn == 31 )
+    {
+        if ( (regs->cpsr & PSR_MODE_MASK) == PSR_MODE_EL1h )
+            val = regs->sp_el1;
+        else if ( ((regs->cpsr & PSR_MODE_MASK) == PSR_MODE_EL1t) ||
+                    ((regs->cpsr & PSR_MODE_MASK) == PSR_MODE_EL0t) )
+            val = regs->sp_el0;
+        else
+        {
+            domain_crash(current->domain);
+            return;
+        }
+    }
+    else
+        val = get_user_reg(regs, instr->rn);
+
+    val += instr->imm9;
+
+    if ( instr->rn == 31 )
+    {
+        if ( (regs->cpsr & PSR_MODE_MASK) == PSR_MODE_EL1h )
+            regs->sp_el1 = val;
+        else
+            regs->sp_el0 = val;
+    }
+    else
+        set_user_reg(regs, instr->rn, val);
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/arm/decode.c b/xen/arch/arm/decode.c
index 3c3cd703e0..e90f95ecd3 100644
--- a/xen/arch/arm/decode.c
+++ b/xen/arch/arm/decode.c
@@ -146,6 +146,7 @@  static int decode_arm64(register_t pc, mmio_info_t *info)
 
     update_dabt(dabt, opcode.ldr_str.rt, opcode.ldr_str.size, false);
 
+    dabt_instr->state = INSTR_LDR_STR_POSTINDEXING;
     dabt_instr->rn = opcode.ldr_str.rn;
     dabt_instr->imm9 = opcode.ldr_str.imm9;
 
diff --git a/xen/arch/arm/include/asm/domain.h b/xen/arch/arm/include/asm/domain.h
index 9b3647587a..40b9125141 100644
--- a/xen/arch/arm/include/asm/domain.h
+++ b/xen/arch/arm/include/asm/domain.h
@@ -266,6 +266,10 @@  static inline void arch_vcpu_block(struct vcpu *v) {}
 /* vPCI is not available on Arm */
 #define has_vpci(d)    ({ (void)(d); false; })
 
+struct arch_vcpu_io {
+    struct instr_details dabt_instr; /* when the instruction is decoded */
+};
+
 #endif /* __ASM_DOMAIN_H__ */
 
 /*
diff --git a/xen/arch/arm/include/asm/mmio.h b/xen/arch/arm/include/asm/mmio.h
index 3354d9c635..745130b7fe 100644
--- a/xen/arch/arm/include/asm/mmio.h
+++ b/xen/arch/arm/include/asm/mmio.h
@@ -26,12 +26,22 @@ 
 
 #define MAX_IO_HANDLER  16
 
+enum instr_decode_state
+{
+    INSTR_ERROR,                    /* Error encountered while decoding instr */
+    INSTR_VALID,                    /* ISS is valid, so no need to decode */
+    INSTR_LDR_STR_POSTINDEXING,     /* Instruction is decoded successfully.
+                                       It is ldr/str post indexing */
+    INSTR_RETRY                     /* Instruction is to be retried */
+};
+
 typedef struct
 {
     struct hsr_dabt dabt;
     struct instr_details {
         unsigned long rn:5;
         signed int imm9:9;
+        enum instr_decode_state state;
     } dabt_instr;
     paddr_t gpa;
 } mmio_info_t;
@@ -69,14 +79,15 @@  struct vmmio {
 };
 
 enum io_state try_handle_mmio(struct cpu_user_regs *regs,
-                              const union hsr hsr,
-                              paddr_t gpa);
+                              mmio_info_t *info);
 void register_mmio_handler(struct domain *d,
                            const struct mmio_handler_ops *ops,
                            paddr_t addr, paddr_t size, void *priv);
 int domain_io_init(struct domain *d, int max_count);
 void domain_io_free(struct domain *d);
 
+void try_decode_instruction(const struct cpu_user_regs *regs,
+                            mmio_info_t *info);
 
 #endif  /* __ASM_ARM_MMIO_H__ */
 
diff --git a/xen/arch/arm/include/asm/traps.h b/xen/arch/arm/include/asm/traps.h
index 2ed2b85c6f..95c46ad391 100644
--- a/xen/arch/arm/include/asm/traps.h
+++ b/xen/arch/arm/include/asm/traps.h
@@ -109,6 +109,8 @@  static inline register_t sign_extend(const struct hsr_dabt dabt, register_t r)
     return r;
 }
 
+void post_increment_register(const struct instr_details *instr);
+
 #endif /* __ASM_ARM_TRAPS__ */
 /*
  * Local variables:
diff --git a/xen/arch/arm/io.c b/xen/arch/arm/io.c
index a289d393f9..203466b869 100644
--- a/xen/arch/arm/io.c
+++ b/xen/arch/arm/io.c
@@ -95,57 +95,87 @@  static const struct mmio_handler *find_mmio_handler(struct domain *d,
     return handler;
 }
 
+void try_decode_instruction(const struct cpu_user_regs *regs,
+                            mmio_info_t *info)
+{
+    int rc;
+
+    /*
+     * Erratum 766422: Thumb store translation fault to Hypervisor may
+     * not have correct HSR Rt value.
+     */
+    if ( check_workaround_766422() && (regs->cpsr & PSR_THUMB) &&
+         info->dabt.write )
+    {
+        rc = decode_instruction(regs, info);
+        if ( rc )
+        {
+            gprintk(XENLOG_DEBUG, "Unable to decode instruction\n");
+            info->dabt_instr.state = INSTR_ERROR;
+            return;
+        }
+    }
+
+    /* If ISS is valid, then no need to decode the instruction any further */
+    if ( info->dabt.valid )
+    {
+        info->dabt_instr.state = INSTR_VALID;
+        return;
+    }
+
+    /*
+     * Xen should not decode the instruction when it was trapped due to
+     * translation fault.
+     */
+    if ( info->dabt.s1ptw )
+    {
+        info->dabt_instr.state = INSTR_RETRY;
+        return;
+    }
+
+    /*
+     * Armv8 processor does not provide a valid syndrome for decoding some
+     * instructions. So in order to process these instructions, Xen must
+     * decode them.
+     */
+    rc = decode_instruction(regs, info);
+    if ( rc )
+    {
+        gprintk(XENLOG_ERR, "Unable to decode instruction\n");
+        info->dabt_instr.state = INSTR_ERROR;
+    }
+}
+
 enum io_state try_handle_mmio(struct cpu_user_regs *regs,
-                              const union hsr hsr,
-                              paddr_t gpa)
+                              mmio_info_t *info)
 {
     struct vcpu *v = current;
     const struct mmio_handler *handler = NULL;
-    const struct hsr_dabt dabt = hsr.dabt;
-    mmio_info_t info = {
-        .gpa = gpa,
-        .dabt = dabt
-    };
+    int rc;
 
-    ASSERT(hsr.ec == HSR_EC_DATA_ABORT_LOWER_EL);
+    ASSERT(info->dabt.ec == HSR_EC_DATA_ABORT_LOWER_EL);
 
-    handler = find_mmio_handler(v->domain, info.gpa);
+    handler = find_mmio_handler(v->domain, info->gpa);
     if ( !handler )
     {
-        int rc;
-
-        rc = try_fwd_ioserv(regs, v, &info);
+        rc = try_fwd_ioserv(regs, v, info);
         if ( rc == IO_HANDLED )
             return handle_ioserv(regs, v);
 
         return rc;
     }
 
-    /* All the instructions used on emulated MMIO region should be valid */
-    if ( !dabt.valid )
-        return IO_ABORT;
-
     /*
-     * Erratum 766422: Thumb store translation fault to Hypervisor may
-     * not have correct HSR Rt value.
+     * At this point, we know that the instruction is either valid or has been
+     * decoded successfully. Thus, Xen should be allowed to execute the
+     * instruction on the emulated MMIO region.
      */
-    if ( check_workaround_766422() && (regs->cpsr & PSR_THUMB) &&
-         dabt.write )
-    {
-        int rc;
-
-        rc = decode_instruction(regs, &info);
-        if ( rc )
-        {
-            gprintk(XENLOG_DEBUG, "Unable to decode instruction\n");
-            return IO_ABORT;
-        }
-    }
-
-    if ( info.dabt.write )
-        return handle_write(handler, v, &info);
+    if ( info->dabt.write )
+        rc = handle_write(handler, v, info);
     else
-        return handle_read(handler, v, &info);
+        rc = handle_read(handler, v, info);
+
+    return rc;
 }
 
 void register_mmio_handler(struct domain *d,
diff --git a/xen/arch/arm/ioreq.c b/xen/arch/arm/ioreq.c
index 308650b400..3c0a935ccf 100644
--- a/xen/arch/arm/ioreq.c
+++ b/xen/arch/arm/ioreq.c
@@ -47,6 +47,7 @@  enum io_state try_fwd_ioserv(struct cpu_user_regs *regs,
                              struct vcpu *v, mmio_info_t *info)
 {
     struct vcpu_io *vio = &v->io;
+    struct dabt_instr instr = info->dabt_instr;
     ioreq_t p = {
         .type = IOREQ_TYPE_COPY,
         .addr = info->gpa,
@@ -76,10 +77,8 @@  enum io_state try_fwd_ioserv(struct cpu_user_regs *regs,
     if ( !s )
         return IO_UNHANDLED;
 
-    if ( !info->dabt.valid )
-        return IO_ABORT;
-
     vio->req = p;
+    vio->info.dabt_instr = instr;
 
     rc = ioreq_send(s, &p, 0);
     if ( rc != IO_RETRY || v->domain->is_shutting_down )
@@ -95,6 +94,7 @@  enum io_state try_fwd_ioserv(struct cpu_user_regs *regs,
 bool arch_ioreq_complete_mmio(void)
 {
     struct vcpu *v = current;
+    struct instr_details dabt_instr = v->io.info.dabt_instr;
     struct cpu_user_regs *regs = guest_cpu_user_regs();
     const union hsr hsr = { .bits = regs->hsr };
 
@@ -106,6 +106,7 @@  bool arch_ioreq_complete_mmio(void)
 
     if ( handle_ioserv(regs, v) == IO_HANDLED )
     {
+        post_increment_register(&dabt_instr);
         advance_pc(regs, hsr);
         return true;
     }
diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index 9339d12f58..455e51cdbe 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -1893,6 +1893,21 @@  static bool try_map_mmio(gfn_t gfn)
     return !map_regions_p2mt(d, gfn, 1, mfn, p2m_mmio_direct_c);
 }
 
+static inline bool check_p2m(bool is_data, paddr_t gpa)
+{
+    /*
+     * First check if the translation fault can be resolved by the P2M subsystem.
+     * If that's the case nothing else to do.
+     */
+    if ( p2m_resolve_translation_fault(current->domain,gaddr_to_gfn(gpa)) )
+        return true;
+
+    if ( is_data && try_map_mmio(gaddr_to_gfn(gpa)) )
+        return true;
+
+    return false;
+}
+
 static void do_trap_stage2_abort_guest(struct cpu_user_regs *regs,
                                        const union hsr hsr)
 {
@@ -1906,6 +1921,7 @@  static void do_trap_stage2_abort_guest(struct cpu_user_regs *regs,
     paddr_t gpa;
     uint8_t fsc = xabt.fsc & ~FSC_LL_MASK;
     bool is_data = (hsr.ec == HSR_EC_DATA_ABORT_LOWER_EL);
+    mmio_info_t info;
 
     /*
      * If this bit has been set, it means that this stage-2 abort is caused
@@ -1959,6 +1975,25 @@  static void do_trap_stage2_abort_guest(struct cpu_user_regs *regs,
         return;
     }
     case FSC_FLT_TRANS:
+    {
+        info.gpa = gpa;
+        info.dabt = hsr.dabt;
+
+        /* Check that the ISS is invalid and it is not data abort. */
+        if ( !hsr.dabt.valid && !is_data )
+        {
+
+            /*
+             * Assumption :- Most of the times when we get a translation fault
+             * and the ISS is invalid, the underlying cause is that the page
+             * tables have not been set up correctly.
+             */
+            if ( check_p2m(is_data, gpa) )
+                return;
+            else
+                goto inject_abt;
+        }
+
         /*
          * Attempt first to emulate the MMIO as the data abort will
          * likely happen in an emulated region.
@@ -1967,13 +2002,45 @@  static void do_trap_stage2_abort_guest(struct cpu_user_regs *regs,
          */
         if ( is_data )
         {
-            enum io_state state = try_handle_mmio(regs, hsr, gpa);
+            enum io_state state;
+
+            try_decode_instruction(regs, &info);
+
+            /*
+             * If Xen could not decode the instruction for any reason, then it
+             * should ask the caller to abort the guest.
+             */
+            if ( info.dabt_instr.state == INSTR_ERROR )
+                goto inject_abt;
+
+            /*
+             * When the instruction needs to be retried by the guest after
+             * resolving the translation fault.
+             */
+            else if ( info.dabt_instr.state == INSTR_RETRY )
+            {
+                /*
+                 * Try resolving the translation fault for access to the stage1
+                 * translation tables which should be in non MMIO region.
+                 */
+                if ( !check_p2m(false, gpa) )
+                    goto inject_abt;
+            }
+
+            state = try_handle_mmio(regs, &info);
 
             switch ( state )
             {
             case IO_ABORT:
                 goto inject_abt;
             case IO_HANDLED:
+                /*
+                 * If the instruction was decoded and has executed successfully
+                 * on the MMIO region, then Xen should execute the next part of
+                 * the instruction. (for eg increment the rn if it is a
+                 * post-indexing instruction.
+                 */
+                post_increment_register(&info.dabt_instr);
                 advance_pc(regs, hsr);
                 return;
             case IO_RETRY:
@@ -1985,18 +2052,11 @@  static void do_trap_stage2_abort_guest(struct cpu_user_regs *regs,
             }
         }
 
-        /*
-         * First check if the translation fault can be resolved by the
-         * P2M subsystem. If that's the case nothing else to do.
-         */
-        if ( p2m_resolve_translation_fault(current->domain,
-                                           gaddr_to_gfn(gpa)) )
-            return;
-
-        if ( is_data && try_map_mmio(gaddr_to_gfn(gpa)) )
+        if ( check_p2m(is_data, gpa) )
             return;
 
         break;
+    }
     default:
         gprintk(XENLOG_WARNING,
                 "Unsupported FSC: HSR=%#"PRIregister" DFSC=%#x\n",
diff --git a/xen/arch/x86/include/asm/ioreq.h b/xen/arch/x86/include/asm/ioreq.h
index d06ce9a6ea..ecfe7f9fdb 100644
--- a/xen/arch/x86/include/asm/ioreq.h
+++ b/xen/arch/x86/include/asm/ioreq.h
@@ -26,6 +26,9 @@ 
 #include <asm/hvm/ioreq.h>
 #endif
 
+struct arch_vcpu_io {
+};
+
 #endif /* __ASM_X86_IOREQ_H__ */
 
 /*
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 37f78cc4c4..afe5508be8 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -160,6 +160,8 @@  struct vcpu_io {
     /* I/O request in flight to device model. */
     enum vio_completion  completion;
     ioreq_t              req;
+    /* Arch specific info pertaining to the io request */
+    struct arch_vcpu_io  info;
 };
 
 struct vcpu