[v2,0/2] Drop ignore_memory_transaction_failures for xilink_zynq

Message ID cover.1727425255.git.chao.liu@yeah.net (mailing list archive)

Message

Chao Liu Sept. 27, 2024, 8:51 a.m. UTC
Hi, thank you for your prompt reply, it's a great encouragement to me!

Based on your review suggestions, I have improved the v1 patch.

By calling create_unimplemented_device() during the initialization phase,
I add a "znyq.umip" device early on, which covers the entire 32-bit guest
physical address (GPA) space. This serves as a better replacement for the
effect of the ignore_memory_transaction_failures flag.

Since create_unimplemented_device() sets the priority of the
memory region (mr) to -100, normally created devices will override the address
segments corresponding to the unimplemented devices.
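
To illustrate the override behaviour described above, here is a minimal,
self-contained sketch in plain C. It is not QEMU code: the Region type, the
resolve() helper, the region names, and the addresses are all hypothetical,
and the priority value is simply the one quoted in this message. It only
models the idea that, where two regions overlap, the one with the higher
priority handles the access.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a memory region (not QEMU's type). */
typedef struct {
    const char *name;
    uint64_t base;
    uint64_t size;
    int priority;   /* where regions overlap, the higher priority wins */
} Region;

/* Negative priority as described in this message; the number is illustrative. */
#define BACKGROUND_PRIORITY (-100)

/* Return the highest-priority region containing addr, or NULL if none does. */
static const Region *resolve(const Region *map, size_t n, uint64_t addr)
{
    const Region *best = NULL;

    assert(map != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        const Region *r = &map[i];

        if (addr >= r->base && addr - r->base < r->size) {
            if (best == NULL || r->priority > best->priority) {
                best = r;
            }
        }
    }
    return best;
}

/* Illustrative map: a 4 GiB background placeholder plus one modelled device. */
static const Region example_map[] = {
    { "background-unimp", 0x00000000ULL, 0x100000000ULL, BACKGROUND_PRIORITY },
    { "uart",             0xE0000000ULL, 0x1000ULL,      0 },
};
```

With this map, an address inside the "uart" range resolves to the modelled
device, while any other address below 4 GiB falls through to the background
placeholder.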

Even if our test set is not sufficiently comprehensive, we can create an
unimp_device for the maximum address space allowed by the board. This prevents
the guest system from triggering unexpected exceptions when accessing
unimplemented devices or regions.

Additionally, I still use create_unimplemented_device() for other
unimplemented devices. This makes it easier to debug when these devices
are added later.

Finally, here are my testing steps:

Step 1: Following the Xilinx Wiki, I compiled a Linux kernel binary image
to make testing more convenient. You can obtain it directly via:

git clone https://github.com/zevorn/QEMU_CPUFreq_Zynq.git

Step 2: Use the following command to run QEMU:

./qemu/build/qemu-system-arm -M xilinx-zynq-a9 \
-serial /dev/null \
-serial mon:stdio \
-display none \
-kernel QEMU_CPUFreq_Zynq/Prebuilt_functional/kernel_standard_linux/uImage \
-dtb QEMU_CPUFreq_Zynq/Prebuilt_functional/my_devicetree.dtb \
--initrd QEMU_CPUFreq_Zynq/Prebuilt_functional/umy_ramdisk.image.gz

If there are no issues during execution, the guest boots successfully
to a login prompt, for example:

...

PetaLinux 2016.4 zedboard-zynq7 /dev/ttyPS0
zedboard-zynq7 login: 
root
root@zedboard-zynq7:~#


Chao Liu (2):
  xilink_zynq: Add various missing unimplemented devices
  xilink-zynq-devcfg: Fix up for memory address range size not set
    correctly

 hw/arm/xilinx_zynq.c      | 46 ++++++++++++++++++++++++++++++++++++++-
 hw/dma/xlnx-zynq-devcfg.c |  2 +-
 2 files changed, 46 insertions(+), 2 deletions(-)

Comments

Peter Maydell Sept. 27, 2024, 12:18 p.m. UTC | #1
On Fri, 27 Sept 2024 at 09:52, Chao Liu <chao.liu@yeah.net> wrote:
>
> Hi, thank you for your prompt reply, it's a great encouragement to me!
>
> Based on your review suggestions, I have improved the v1 patch.
>
> By using create_unimplemented_device() during the initialization phase,
> I added a "znyq.umip" device early on, which covers the 32-bit address space
> of GPA. This can better serve as a replacement for the effect of the
> ignore_memory_transaction_failures flag.
>
> Since create_unimplemented_device() sets the priority of the
> memory region (mr) to -100, normally created devices will override the address
> segments corresponding to the unimplemented devices.
>
> Even if our test set is not sufficiently comprehensive, we can create an
> unimp_device for the maximum address space allowed by the board. This prevents
> the guest system from triggering unexpected exceptions when accessing
> unimplemented devices or regions.

What would be the benefit of doing that? If we're going to
say "we'll make accesses to regions without devices not
generate faults", the simplest way to do that is to
leave the ignore_memory_transaction_failures flag set
the way it is.

thanks
-- PMM
Chao Liu Sept. 27, 2024, 2:03 p.m. UTC | #2
On 2024/9/27 20:18, Peter Maydell wrote:

> On Fri, 27 Sept 2024 at 09:52, Chao Liu <chao.liu@yeah.net> wrote:
>> Hi, thank you for your prompt reply, it's a great encouragement to me!
>>
>> Based on your review suggestions, I have improved the v1 patch.
>>
>> By using create_unimplemented_device() during the initialization phase,
>> I added a "znyq.umip" device early on, which covers the 32-bit address space
>> of GPA. This can better serve as a replacement for the effect of the
>> ignore_memory_transaction_failures flag.
>>
>> Since create_unimplemented_device() sets the priority of the
>> memory region (mr) to -100, normally created devices will override the address
>> segments corresponding to the unimplemented devices.
>>
>> Even if our test set is not sufficiently comprehensive, we can create an
>> unimp_device for the maximum address space allowed by the board. This prevents
>> the guest system from triggering unexpected exceptions when accessing
>> unimplemented devices or regions.
> What would be the benefit of doing that? If we're going to
> say "we'll make accesses to regions without devices not
> generate faults", the simplest way to do that is to
> leave the ignore_memory_transaction_failures flag set
> the way it is.
>
> thanks
> -- PMM

I noticed that the `ignore_memory_transaction_failures` flag
was introduced in ed860129ac ("boards.h: Define new flag
ignore_memory_transaction_failures").

This approach was wise given the circumstances at the time.

Initially, this flag was added to ensure compatibility with the
RAZ/WI behavior in the ARM legacy board model.

Currently, only the ARM legacy board model uses this flag.

Introducing this flag provides a straightforward way to suppress
memory access exceptions by checking if the flag is enabled after
a CPU memory access failure; however, its primary purpose is to
ensure compatibility.

The purpose was to ensure that the ARM legacy board model behaves
as expected under conditions where thorough testing was not feasible.

Since we can designate unimplemented device memory ranges with
"unimplemented-device," this represents a more standard approach in QEMU
for managing RAZ/WI behavior.

However, this approach requires some effort.

Consequently, I have prioritized the removal of the
ignore_memory_transaction_failures flag on the Xilinx Zynq board
and aim to replace it with a more general solution to enhance design
simplicity and consistency.

If my approach is approved, I would be glad to systematically remove the
ignore_memory_transaction_failures flag from other ARM legacy boards and
ultimately eliminate it from the MachineClass.

This is my first attempt at contributing patches to the QEMU community,
and I still have much to learn. Thank you for your patience and effort!

Best regards,
Chao Liu
Peter Maydell Sept. 27, 2024, 2:20 p.m. UTC | #3
On Fri, 27 Sept 2024 at 15:03, Chao Liu <chao.liu@yeah.net> wrote:
> On 2024/9/27 20:18, Peter Maydell wrote:
>> On Fri, 27 Sept 2024 at 09:52, Chao Liu <chao.liu@yeah.net> wrote:
>>> Even if our test set is not sufficiently comprehensive, we can create an
>>> unimp_device for the maximum address space allowed by the board. This prevents
>>> the guest system from triggering unexpected exceptions when accessing
>>> unimplemented devices or regions.
>>
>> What would be the benefit of doing that? If we're going to
>> say "we'll make accesses to regions without devices not
>> generate faults", the simplest way to do that is to
>> leave the ignore_memory_transaction_failures flag set
>> the way it is.
>
> Introducing this flag provides a straightforward way to suppress
> memory access exceptions by checking if the flag is enabled after
> a CPU memory access failure; however, its primary purpose is to
> ensure compatibility.

> Since we can designate unimplemented device memory ranges with
> "unimplemented-device," this represents a more standard approach in QEMU
> for managing RAZ/WI behavior.

I don't think that using a 4GB unimplemented-device is
a "more standard" way to do this. We have a standard way for
the board model to say "we don't know whether there might
be existing guest code out there that relies on being able
to make accesses to addresses where there should be a device
but we haven't modeled it". That way is to set the
ignore_memory_transaction_failures flag.

There are two things we can do:

(1) We can leave the ignore_memory_transaction_failures
flag set. This is safe (no behaviour change) but not the
right (matching the hardware) behaviour. The main reason
to do this is if we don't feel we have enough access to
a range of guest code to test the other approach.

(2) We can clear the flag. This is preferable (it matches the
hardware). But the requirement to do this is that
 (a) we must make the best effort we can to be sure we've
     put unimplemented-device placeholders for specific
     devices we don't yet model (by checking e.g. the
     hardware documentation for the SoC and board model,
     the device tree, etc)
 (b) we do the most wide-ranging testing of guest code that
     we can. This checks that we didn't miss anything in (a).

I don't mind which of these we do. What I was asking in my
comments on version one of your patch was for how we were
doing on requirement 2b.

thanks
-- PMM
Chao Liu Sept. 27, 2024, 2:43 p.m. UTC | #4
On 2024/9/27 22:20, Peter Maydell wrote:

> On Fri, 27 Sept 2024 at 15:03, Chao Liu <chao.liu@yeah.net> wrote:
>> On 2024/9/27 20:18, Peter Maydell wrote:
>>> On Fri, 27 Sept 2024 at 09:52, Chao Liu <chao.liu@yeah.net> wrote:
>>>> Even if our test set is not sufficiently comprehensive, we can create an
>>>> unimp_device for the maximum address space allowed by the board. This prevents
>>>> the guest system from triggering unexpected exceptions when accessing
>>>> unimplemented devices or regions.
>>> What would be the benefit of doing that? If we're going to
>>> say "we'll make accesses to regions without devices not
>>> generate faults", the simplest way to do that is to
>>> leave the ignore_memory_transaction_failures flag set
>>> the way it is.
>> Introducing this flag provides a straightforward way to suppress
>> memory access exceptions by checking if the flag is enabled after
>> a CPU memory access failure; however, its primary purpose is to
>> ensure compatibility.
>> Since we can designate unimplemented device memory ranges with
>> "unimplemented-device," this represents a more standard approach in QEMU
>> for managing RAZ/WI behavior.
> I don't think that using a 4GB unimplemented-device is
> a "more standard" way to do this. We have a standard way for
> the board model to say "we don't know whether there might
> be existing guest code out there that relies on being able
> to make accesses to addresses where there should be a device
> but we haven't modeled it". That way is to set the
> ignore_memory_transaction_failures flag.
>
> There are two things we can do:
>
> (1) We can leave the ignore_memory_transaction_failures
> flag set. This is safe (no behaviour change) but not the
> right (matching the hardware) behaviour. The main reason
> to do this is if we don't feel we have enough access to
> a range of guest code to test the other approach.
>
> (2) We can clear the flag. This is preferable (it matches the
> hardware). But the requirement to do this is that
>   (a) we must make the best effort we can to be sure we've
>       put unimplemented-device placeholders for specific
>       devices we don't yet model (by checking e.g. the
>       hardware documentation for the SoC and board model,
>       the device tree, etc)
>   (b) we do the most wide-ranging testing of guest code that
>       we can. This checks that we didn't miss anything in (a).
>
> I don't mind which of these we do. What I was asking in my
> comments on version one of your patch was for how we were
> doing on requirement 2b.
>
> thanks
> -- PMM

I understand! I will provide more comprehensive testing methods
and results as soon as possible and will get back to you.

Best regards,
Chao Liu