diff mbox series

mips64el-softmmu: Enable MTTCG

Message ID 20240511-mips_mttcg-v1-1-1b71d9b85234@flygoat.com (mailing list archive)
State New, archived
Headers show
Series mips64el-softmmu: Enable MTTCG | expand

Commit Message

Jiaxun Yang May 11, 2024, 8:26 p.m. UTC
MTTCG was disabled in a092a9554771 ("configure: disable MTTCG
for MIPS guests") due to test case instability.

I was able to reproduce this issue with in latest QEMU and look
into reason behind that.

What actually happend is kernel's CP0 timer synchronisation
mechanism assumed a consistent latency in memory access between
cores, which TCG can't guarantee. Thus there is a huge drift in
count value between cores, and in early kernel versions CP0 timer
is always used as sched_clock.

sched_clock drift back on some cores triggered RCU watchdog in
some extreme cases.

This can be resolved by setting clocksource to MIPS, which allows
clocksource to drift together with sched_clock. However this will
leed to other problems after boot.

Another option would beupdating kernel to later version, which
will use GIC as sched_clock.

In non-MTTCG build, the execution is slow enough so kernel won't
observe back drifts.

Test results:

With clocksource=MIPS
```
 ~/tmp/retry/retry.py -n 100 -c -- ./qemu-system-mips64el \
    -display none -vga none -serial mon:stdio \
    -machine malta -kernel ./vmlinux-4.7.0-rc1.I6400 \
    -cpu I6400 -smp 8 -vga std \
    -append "printk.time=0 clocksource=MIPS console=tty0 console=ttyS0 panic=-1" \
    --no-reboot

100, 0, PASS, 5.258126, 100, 100, -
Results summary:
0: 100 times (100.00%), avg time 6.508 (55.53 varience/7.45 deviation)
Ran command 100 times, 100 passes
```

With linux-next:
```
 ~/tmp/retry/retry.py -n 100 -c -- ./qemu-system-mips64el \
    -display none -vga none -serial mon:stdio \
    -machine malta -kernel ~/linux-next/vmlinux \
    -cpu I6400 -smp 8 -vga std \
    -append "printk.time=0 console=tty0 console=ttyS0 panic=-1" \
    --no-reboot

100, 0, PASS, 4.507921, 100, 100, -
Results summary:
0: 100 times (100.00%), avg time 4.233 (0.04 varience/0.21 deviation)
Ran command 100 times, 100 passes
```

Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
---
I'll leave the test case alone as it's already marked as QEMU_TEST_FLAKY_TESTS
---
 configs/targets/mips64el-softmmu.mak | 1 +
 1 file changed, 1 insertion(+)


---
base-commit: 248f6f62df073a3b4158fd0093863ab885feabb5
change-id: 20240511-mips_mttcg-47a6b19074b3

Best regards,

Comments

Philippe Mathieu-Daudé June 3, 2024, 3:26 p.m. UTC | #1
+Alex, Pierrick & Anton

On 11/5/24 22:26, Jiaxun Yang wrote:
> MTTCG was disabled in a092a9554771 ("configure: disable MTTCG
> for MIPS guests") due to test case instability.
> 
> I was able to reproduce this issue with in latest QEMU and look
> into reason behind that.
> 
> What actually happend is kernel's CP0 timer synchronisation
> mechanism assumed a consistent latency in memory access between
> cores, which TCG can't guarantee. Thus there is a huge drift in
> count value between cores, and in early kernel versions CP0 timer
> is always used as sched_clock.
> 
> sched_clock drift back on some cores triggered RCU watchdog in
> some extreme cases.
> 
> This can be resolved by setting clocksource to MIPS, which allows
> clocksource to drift together with sched_clock. However this will
> leed to other problems after boot.
> 
> Another option would beupdating kernel to later version, which
> will use GIC as sched_clock.
> 
> In non-MTTCG build, the execution is slow enough so kernel won't
> observe back drifts.
> 
> Test results:
> 
> With clocksource=MIPS
> ```
>   ~/tmp/retry/retry.py -n 100 -c -- ./qemu-system-mips64el \
>      -display none -vga none -serial mon:stdio \
>      -machine malta -kernel ./vmlinux-4.7.0-rc1.I6400 \
>      -cpu I6400 -smp 8 -vga std \
>      -append "printk.time=0 clocksource=MIPS console=tty0 console=ttyS0 panic=-1" \
>      --no-reboot
> 
> 100, 0, PASS, 5.258126, 100, 100, -
> Results summary:
> 0: 100 times (100.00%), avg time 6.508 (55.53 varience/7.45 deviation)
> Ran command 100 times, 100 passes
> ```
> 
> With linux-next:
> ```
>   ~/tmp/retry/retry.py -n 100 -c -- ./qemu-system-mips64el \
>      -display none -vga none -serial mon:stdio \
>      -machine malta -kernel ~/linux-next/vmlinux \
>      -cpu I6400 -smp 8 -vga std \
>      -append "printk.time=0 console=tty0 console=ttyS0 panic=-1" \
>      --no-reboot
> 
> 100, 0, PASS, 4.507921, 100, 100, -
> Results summary:
> 0: 100 times (100.00%), avg time 4.233 (0.04 varience/0.21 deviation)
> Ran command 100 times, 100 passes
> ```
> 
> Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
> ---
> I'll leave the test case alone as it's already marked as QEMU_TEST_FLAKY_TESTS
> ---
>   configs/targets/mips64el-softmmu.mak | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/configs/targets/mips64el-softmmu.mak b/configs/targets/mips64el-softmmu.mak
> index 8d9ab3ddc4b1..199b1d909a7d 100644
> --- a/configs/targets/mips64el-softmmu.mak
> +++ b/configs/targets/mips64el-softmmu.mak
> @@ -1,3 +1,4 @@
>   TARGET_ARCH=mips64
>   TARGET_BASE_ARCH=mips
> +TARGET_SUPPORTS_MTTCG=y
>   TARGET_NEED_FDT=y
> 
> ---
> base-commit: 248f6f62df073a3b4158fd0093863ab885feabb5
> change-id: 20240511-mips_mttcg-47a6b19074b3
> 
> Best regards,
Pierrick Bouvier June 3, 2024, 6:15 p.m. UTC | #2
On 5/11/24 13:26, Jiaxun Yang wrote:
> MTTCG was disabled in a092a9554771 ("configure: disable MTTCG
> for MIPS guests") due to test case instability.
> 
> I was able to reproduce this issue with in latest QEMU and look
> into reason behind that.
> 
> What actually happend is kernel's CP0 timer synchronisation
> mechanism assumed a consistent latency in memory access between
> cores, which TCG can't guarantee. Thus there is a huge drift in
> count value between cores, and in early kernel versions CP0 timer
> is always used as sched_clock.
> 
> sched_clock drift back on some cores triggered RCU watchdog in
> some extreme cases.
> 
> This can be resolved by setting clocksource to MIPS, which allows
> clocksource to drift together with sched_clock. However this will
> leed to other problems after boot.
> 
> Another option would beupdating kernel to later version, which
> will use GIC as sched_clock.
> 
> In non-MTTCG build, the execution is slow enough so kernel won't
> observe back drifts.
> 
> Test results:
> 
> With clocksource=MIPS
> ```
>   ~/tmp/retry/retry.py -n 100 -c -- ./qemu-system-mips64el \
>      -display none -vga none -serial mon:stdio \
>      -machine malta -kernel ./vmlinux-4.7.0-rc1.I6400 \
>      -cpu I6400 -smp 8 -vga std \
>      -append "printk.time=0 clocksource=MIPS console=tty0 console=ttyS0 panic=-1" \
>      --no-reboot
> 
> 100, 0, PASS, 5.258126, 100, 100, -
> Results summary:
> 0: 100 times (100.00%), avg time 6.508 (55.53 varience/7.45 deviation)
> Ran command 100 times, 100 passes
> ```
> 
> With linux-next:
> ```
>   ~/tmp/retry/retry.py -n 100 -c -- ./qemu-system-mips64el \
>      -display none -vga none -serial mon:stdio \
>      -machine malta -kernel ~/linux-next/vmlinux \
>      -cpu I6400 -smp 8 -vga std \
>      -append "printk.time=0 console=tty0 console=ttyS0 panic=-1" \
>      --no-reboot
> 
> 100, 0, PASS, 4.507921, 100, 100, -
> Results summary:
> 0: 100 times (100.00%), avg time 4.233 (0.04 varience/0.21 deviation)
> Ran command 100 times, 100 passes
> ```
> 
> Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
> ---
> I'll leave the test case alone as it's already marked as QEMU_TEST_FLAKY_TESTS
> ---
>   configs/targets/mips64el-softmmu.mak | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/configs/targets/mips64el-softmmu.mak b/configs/targets/mips64el-softmmu.mak
> index 8d9ab3ddc4b1..199b1d909a7d 100644
> --- a/configs/targets/mips64el-softmmu.mak
> +++ b/configs/targets/mips64el-softmmu.mak
> @@ -1,3 +1,4 @@
>   TARGET_ARCH=mips64
>   TARGET_BASE_ARCH=mips
> +TARGET_SUPPORTS_MTTCG=y
>   TARGET_NEED_FDT=y
> 
> ---
> base-commit: 248f6f62df073a3b4158fd0093863ab885feabb5
> change-id: 20240511-mips_mttcg-47a6b19074b3
> 
> Best regards,

Hi Jiaxun,
Thanks for your analysis!

We should see to update concerned test in another series.
I'm not sure which way is preferred between updating kernel used or 
changing current command line.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Jiaxun Yang June 4, 2024, 10:18 a.m. UTC | #3
在2024年6月3日六月 下午7:15,Pierrick Bouvier写道:
> On 5/11/24 13:26, Jiaxun Yang wrote:
>> MTTCG was disabled in a092a9554771 ("configure: disable MTTCG
>> for MIPS guests") due to test case instability.
>> 
>> I was able to reproduce this issue with in latest QEMU and look
>> into reason behind that.
>> 
>> What actually happend is kernel's CP0 timer synchronisation
>> mechanism assumed a consistent latency in memory access between
>> cores, which TCG can't guarantee. Thus there is a huge drift in
>> count value between cores, and in early kernel versions CP0 timer
>> is always used as sched_clock.
>> 
>> sched_clock drift back on some cores triggered RCU watchdog in
>> some extreme cases.
>> 
>> This can be resolved by setting clocksource to MIPS, which allows
>> clocksource to drift together with sched_clock. However this will
>> leed to other problems after boot.
>> 
>> Another option would beupdating kernel to later version, which
>> will use GIC as sched_clock.
>> 
>> In non-MTTCG build, the execution is slow enough so kernel won't
>> observe back drifts.
>> 
>> Test results:
>> 
>> With clocksource=MIPS
>> ```
>>   ~/tmp/retry/retry.py -n 100 -c -- ./qemu-system-mips64el \
>>      -display none -vga none -serial mon:stdio \
>>      -machine malta -kernel ./vmlinux-4.7.0-rc1.I6400 \
>>      -cpu I6400 -smp 8 -vga std \
>>      -append "printk.time=0 clocksource=MIPS console=tty0 console=ttyS0 panic=-1" \
>>      --no-reboot
>> 
>> 100, 0, PASS, 5.258126, 100, 100, -
>> Results summary:
>> 0: 100 times (100.00%), avg time 6.508 (55.53 varience/7.45 deviation)
>> Ran command 100 times, 100 passes
>> ```
>> 
>> With linux-next:
>> ```
>>   ~/tmp/retry/retry.py -n 100 -c -- ./qemu-system-mips64el \
>>      -display none -vga none -serial mon:stdio \
>>      -machine malta -kernel ~/linux-next/vmlinux \
>>      -cpu I6400 -smp 8 -vga std \
>>      -append "printk.time=0 console=tty0 console=ttyS0 panic=-1" \
>>      --no-reboot
>> 
>> 100, 0, PASS, 4.507921, 100, 100, -
>> Results summary:
>> 0: 100 times (100.00%), avg time 4.233 (0.04 varience/0.21 deviation)
>> Ran command 100 times, 100 passes
>> ```
>> 
>> Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
>> ---
>> I'll leave the test case alone as it's already marked as QEMU_TEST_FLAKY_TESTS
>> ---
>>   configs/targets/mips64el-softmmu.mak | 1 +
>>   1 file changed, 1 insertion(+)
>> 
>> diff --git a/configs/targets/mips64el-softmmu.mak b/configs/targets/mips64el-softmmu.mak
>> index 8d9ab3ddc4b1..199b1d909a7d 100644
>> --- a/configs/targets/mips64el-softmmu.mak
>> +++ b/configs/targets/mips64el-softmmu.mak
>> @@ -1,3 +1,4 @@
>>   TARGET_ARCH=mips64
>>   TARGET_BASE_ARCH=mips
>> +TARGET_SUPPORTS_MTTCG=y
>>   TARGET_NEED_FDT=y
>> 
>> ---
>> base-commit: 248f6f62df073a3b4158fd0093863ab885feabb5
>> change-id: 20240511-mips_mttcg-47a6b19074b3
>> 
>> Best regards,
>
> Hi Jiaxun,
> Thanks for your analysis!
>
> We should see to update concerned test in another series.
> I'm not sure which way is preferred between updating kernel used or 
> changing current command line.
>
> Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>

I think updating kernel can prevent further problems.
Will test with latest Debian kernel.

Thanks!
Alex Bennée June 4, 2024, 12:43 p.m. UTC | #4
Jiaxun Yang <jiaxun.yang@flygoat.com> writes:

> MTTCG was disabled in a092a9554771 ("configure: disable MTTCG
> for MIPS guests") due to test case instability.
>
> I was able to reproduce this issue with in latest QEMU and look
> into reason behind that.
>
> What actually happend is kernel's CP0 timer synchronisation
> mechanism assumed a consistent latency in memory access between
> cores, which TCG can't guarantee. Thus there is a huge drift in
> count value between cores, and in early kernel versions CP0 timer
> is always used as sched_clock.
>
> sched_clock drift back on some cores triggered RCU watchdog in
> some extreme cases.
>
> This can be resolved by setting clocksource to MIPS, which allows
> clocksource to drift together with sched_clock. However this will
> leed to other problems after boot.
>
> Another option would beupdating kernel to later version, which
> will use GIC as sched_clock.

I'm a little uncomfortable about making the success of MTTCG depend on
the user options passed to the kernel.

How come a non-icount run depends on how many instructions are being run
on each core? Why is time not based on the wallclock time and by
definition synced between all cores?

> In non-MTTCG build, the execution is slow enough so kernel won't
> observe back drifts.
>
> Test results:
>
> With clocksource=MIPS
> ```
>  ~/tmp/retry/retry.py -n 100 -c -- ./qemu-system-mips64el \
>     -display none -vga none -serial mon:stdio \
>     -machine malta -kernel ./vmlinux-4.7.0-rc1.I6400 \
>     -cpu I6400 -smp 8 -vga std \
>     -append "printk.time=0 clocksource=MIPS console=tty0 console=ttyS0 panic=-1" \
>     --no-reboot
>
> 100, 0, PASS, 5.258126, 100, 100, -
> Results summary:
> 0: 100 times (100.00%), avg time 6.508 (55.53 varience/7.45 deviation)
> Ran command 100 times, 100 passes
> ```
>
> With linux-next:
> ```
>  ~/tmp/retry/retry.py -n 100 -c -- ./qemu-system-mips64el \
>     -display none -vga none -serial mon:stdio \
>     -machine malta -kernel ~/linux-next/vmlinux \
>     -cpu I6400 -smp 8 -vga std \
>     -append "printk.time=0 console=tty0 console=ttyS0 panic=-1" \
>     --no-reboot
>
> 100, 0, PASS, 4.507921, 100, 100, -
> Results summary:
> 0: 100 times (100.00%), avg time 4.233 (0.04 varience/0.21 deviation)
> Ran command 100 times, 100 passes
> ```
>
> Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
> ---
> I'll leave the test case alone as it's already marked as
> QEMU_TEST_FLAKY_TESTS

We should ensure we have some testing enabled if we switch this on. Was
the FLAKY test marked for MTTCG issues or something else?

> ---
>  configs/targets/mips64el-softmmu.mak | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/configs/targets/mips64el-softmmu.mak b/configs/targets/mips64el-softmmu.mak
> index 8d9ab3ddc4b1..199b1d909a7d 100644
> --- a/configs/targets/mips64el-softmmu.mak
> +++ b/configs/targets/mips64el-softmmu.mak
> @@ -1,3 +1,4 @@
>  TARGET_ARCH=mips64
>  TARGET_BASE_ARCH=mips
> +TARGET_SUPPORTS_MTTCG=y
>  TARGET_NEED_FDT=y
>
> ---
> base-commit: 248f6f62df073a3b4158fd0093863ab885feabb5
> change-id: 20240511-mips_mttcg-47a6b19074b3
>
> Best regards,
Jiaxun Yang June 5, 2024, 12:47 p.m. UTC | #5
在2024年6月4日六月 下午1:43,Alex Bennée写道:
[...]
>
> I'm a little uncomfortable about making the success of MTTCG depend on
> the user options passed to the kernel.

Yes, I agree, it sounds awkward.

>
> How come a non-icount run depends on how many instructions are being run
> on each core? Why is time not based on the wallclock time and by
> definition synced between all cores?

That's the problem of Guest Kernel, the synchronisation mechanism is poorly
written that relies on instruction latency too much. I have patch pending to
improve it at guest side, but we can't fix it in existing binary.

This clock, CP0_Counter, as per ISA spec, is independent between cores and
needs to be synchronized by software. We can emulate it as a monotonic clock
source and Linux should be happy with it, but that is violating ISA spec.

>
>> In non-MTTCG build, the execution is slow enough so kernel won't
>> observe back drifts.
>>
>> Test results:
>>
>> With clocksource=MIPS
>> ```
>>  ~/tmp/retry/retry.py -n 100 -c -- ./qemu-system-mips64el \
>>     -display none -vga none -serial mon:stdio \
>>     -machine malta -kernel ./vmlinux-4.7.0-rc1.I6400 \
>>     -cpu I6400 -smp 8 -vga std \
>>     -append "printk.time=0 clocksource=MIPS console=tty0 console=ttyS0 panic=-1" \
>>     --no-reboot
>>
>> 100, 0, PASS, 5.258126, 100, 100, -
>> Results summary:
>> 0: 100 times (100.00%), avg time 6.508 (55.53 varience/7.45 deviation)
>> Ran command 100 times, 100 passes
>> ```
>>
>> With linux-next:
>> ```
>>  ~/tmp/retry/retry.py -n 100 -c -- ./qemu-system-mips64el \
>>     -display none -vga none -serial mon:stdio \
>>     -machine malta -kernel ~/linux-next/vmlinux \
>>     -cpu I6400 -smp 8 -vga std \
>>     -append "printk.time=0 console=tty0 console=ttyS0 panic=-1" \
>>     --no-reboot
>>
>> 100, 0, PASS, 4.507921, 100, 100, -
>> Results summary:
>> 0: 100 times (100.00%), avg time 4.233 (0.04 varience/0.21 deviation)
>> Ran command 100 times, 100 passes
>> ```
>>
>> Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
>> ---
>> I'll leave the test case alone as it's already marked as
>> QEMU_TEST_FLAKY_TESTS
>
> We should ensure we have some testing enabled if we switch this on. Was
> the FLAKY test marked for MTTCG issues or something else?

Tracking history, I think it was because this issue is still popping up after
disabling MTTCG.

I'll try to upgrade kernel used in this test.

Thanks
- Jiaxun
>
>> ---
>>  configs/targets/mips64el-softmmu.mak | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/configs/targets/mips64el-softmmu.mak b/configs/targets/mips64el-softmmu.mak
>> index 8d9ab3ddc4b1..199b1d909a7d 100644
>> --- a/configs/targets/mips64el-softmmu.mak
>> +++ b/configs/targets/mips64el-softmmu.mak
>> @@ -1,3 +1,4 @@
>>  TARGET_ARCH=mips64
>>  TARGET_BASE_ARCH=mips
>> +TARGET_SUPPORTS_MTTCG=y
>>  TARGET_NEED_FDT=y
>>
>> ---
>> base-commit: 248f6f62df073a3b4158fd0093863ab885feabb5
>> change-id: 20240511-mips_mttcg-47a6b19074b3
>>
>> Best regards,
>
> -- 
> Alex Bennée
> Virtualisation Tech Lead @ Linaro
Jiaxun Yang June 7, 2024, 10:49 a.m. UTC | #6
在2024年6月5日六月 下午1:47,Jiaxun Yang写道:
[...]
> I'll try to upgrade kernel used in this test.

Unfortunately, we don't have kernel build from reliable source
for that test.

I requested Debian project to build kernel for us[1].

[1]: https://salsa.debian.org/kernel-team/linux/-/merge_requests/1074

Thanks
>
> Thanks
> - Jiaxun
>>
[...]
>
> -- 
> - Jiaxun
diff mbox series

Patch

diff --git a/configs/targets/mips64el-softmmu.mak b/configs/targets/mips64el-softmmu.mak
index 8d9ab3ddc4b1..199b1d909a7d 100644
--- a/configs/targets/mips64el-softmmu.mak
+++ b/configs/targets/mips64el-softmmu.mak
@@ -1,3 +1,4 @@ 
 TARGET_ARCH=mips64
 TARGET_BASE_ARCH=mips
+TARGET_SUPPORTS_MTTCG=y
 TARGET_NEED_FDT=y