[0/3] irqchip/gic-v4: Fix VMAPP/VMOVP races

Message ID 20240705093155.871070-1-maz@kernel.org (mailing list archive)

Message

Marc Zyngier July 5, 2024, 9:31 a.m. UTC
In 20240701072305.4129823-1-tangnianyao@huawei.com, Nianyao reports
a number of possible races that can trigger on GICv4 implementations
using the ITSList feature.

These races involve concurrent VMOVP and VMAPP, the former happening
on vcpu load, while the latter is triggered on the first device being
MAPTI'd on a given ITS for this particular guest.

The main issue is that we want to establish the affinity at VMAPP time,
while vcpu_load wants to set the affinity where the vcpu actually runs.
Lock ordering constraints mean that we can't lock the VPE on VMAPP,
so things are modified without any lock. What could possibly go wrong?

The fix is a bit involved, and relies on three things:

- Making sure that the initial affinity of a VPE is fixed at activate
  time, which is done early in the life of the vcpu, before it can run.

- Add a per-VM lock that can be taken instead of the global vmovp_lock,
  paving the way for a more manageable lock order.

- Take the per-VPE lock whenever modifying the VPE affinity, as expected
  everywhere else in the code.

With that, VMAPP and VMOVP can now run concurrently and still lead to
sensible results.
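
For reference, the resulting shape is roughly the following. This is a
simplified sketch rather than the patches themselves: the vmapp_lock field
name, the example helper and the exact lock nesting are illustrative.

/*
 * Sketch only. struct its_vpe, col_idx and the raw spinlock API are as in
 * mainline; the vmapp_lock field and the helper below are illustrative.
 * A per-VM lock serialises VMAPP against VMOVP without touching the
 * global vmovp_lock, while the per-VPE lock covers every affinity update.
 */
struct its_vm {
	/* ... existing fields ... */
	raw_spinlock_t		vmapp_lock;	/* new: per-VM serialisation */
};

static void example_vpe_set_affinity(struct its_vm *vm, struct its_vpe *vpe,
				     unsigned int cpu)
{
	unsigned long flags;

	/* VM lock first, then the VPE lock (nesting shown is illustrative) */
	raw_spin_lock_irqsave(&vm->vmapp_lock, flags);
	raw_spin_lock(&vpe->vpe_lock);

	vpe->col_idx = cpu;	/* affinity now only changes under both locks */

	raw_spin_unlock(&vpe->vpe_lock);
	raw_spin_unlock_irqrestore(&vm->vmapp_lock, flags);
}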

Marc Zyngier (3):
  irqchip/gic-v4: Always configure affinity on VPE activation
  irqchip/gic-v4: Substitute vmovp_lock for a per-VM lock
  irqchip/gic-v4: Make sure a VPE is locked when VMAPP is issued

 drivers/irqchip/irq-gic-v3-its.c   | 48 ++++++++++++++----------------
 include/linux/irqchip/arm-gic-v4.h |  8 +++++
 2 files changed, 30 insertions(+), 26 deletions(-)

Comments

Tangnianyao July 8, 2024, 2:02 a.m. UTC | #1
On 7/5/2024 17:31, Marc Zyngier wrote:
> In 20240701072305.4129823-1-tangnianyao@huawei.com, Nianyao reports
> a number of possible races that can trigger on GICv4 implementations
> using the ITSList feature.
>
> These races involve concurrent VMOVP and VMAPP, the former happening
> on vcpu load, while the latter is triggered on the first device being
> MAPTI'd on a given ITS for this particular guest.
>
> The main issue is that we want to establish the affinity at VMAPP time,
> while vcpu_load wants to set the affinity where the vcpu actually runs.
> Lock ordering constraints mean that we can't lock the VPE on VMAPP,
> so things are modified without any lock. What could possibly go wrong?
>
> The fix is a bit involved, and relies on three things:
>
> - Making sure that the initial affinity of a VPE is fixed at activate
>   time, which is done early in the life of the vcpu, before it can run.
>
> - Add a per-VM lock that can be taken instead of the global vmovp_lock,
>   paving the way for a more manageable lock order.
>
> - Take the per-VPE lock whenever modifying the VPE affinity, as expected
>   everywhere else in the code.
>
> With that, VMAPP and VMOVP can now run concurrently and still lead to
> sensible results.
>
> Marc Zyngier (3):
>   irqchip/gic-v4: Always configure affinity on VPE activation
>   irqchip/gic-v4: Substitute vmovp_lock for a per-VM lock
>   irqchip/gic-v4: Make sure a VPE is locked when VMAPP is issued
>
>  drivers/irqchip/irq-gic-v3-its.c   | 48 ++++++++++++++----------------
>  include/linux/irqchip/arm-gic-v4.h |  8 +++++
>  2 files changed, 30 insertions(+), 26 deletions(-)
>

I've tested this patch series. It works.

Tested-by: Nianyao Tang <tangnianyao@huawei.com>
Tangnianyao July 17, 2024, 8:41 a.m. UTC | #2
Hi, Marc

I ran into another problem while backporting this fix to kernel 5.10.

Kernel 5.10 does not support guard(), so we replaced it with explicit
raw_spin_lock/unlock calls. When the guest insmods its NIC drivers, this
somehow triggers a host CPU power-off. The same test case runs fine on
kernel 6.6 (both host and guest).
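
Roughly, the translation we did looks like this (a simplified example,
not our exact backport; the lock shown is only for illustration):

/*
 * Upstream: scope-based lock from <linux/cleanup.h>, released
 * automatically on every exit from the scope.
 */
guard(raw_spinlock_irqsave)(&vpe->vpe_lock);

/*
 * 5.10 backport: open-coded equivalent. Every early return between the
 * lock and the unlock has to be handled by hand, and the nesting against
 * any other lock has to match the upstream code exactly.
 */
unsigned long flags;

raw_spin_lock_irqsave(&vpe->vpe_lock, flags);
/* ... critical section ... */
raw_spin_unlock_irqrestore(&vpe->vpe_lock, flags);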

Would you fix this on long-term kernel 5.10? Or do you have any suggestions?


Thanks for your help.

Nianyao Tang



On 7/8/2024 10:02, Tangnianyao wrote:
>
> On 7/5/2024 17:31, Marc Zyngier wrote:
>> In 20240701072305.4129823-1-tangnianyao@huawei.com, Nianyao reports
>> a number of possible races that can trigger on GICv4 implementations
>> using the ITSList feature.
>>
>> These races involve concurrent VMOVP and VMAPP, the former happening
>> on vcpu load, while the latter is triggered on the first device being
>> MAPTI'd on a given ITS for this particular guest.
>>
>> The main issue is that we want to establish the affinity at VMAPP time,
>> while vcpu_load wants to set the affinity where the vcpu actually runs.
>> Lock ordering constraints mean that we can't lock the VPE on VMAPP,
>> so things are modified without any lock. What could possibly go wrong?
>>
>> The fix is a bit involved, and relies on three things:
>>
>> - Making sure that the initial affinity of a VPE is fixed at activate
>>   time, which is done early in the life of the vcpu, before it can run.
>>
>> - Add a per-VM lock that can be taken instead of the global vmovp_lock,
>>   paving the way for a more manageable lock order.
>>
>> - Take the per-VPE lock whenever modifying the VPE affinity, as expected
>>   everywhere else in the code.
>>
>> With that, VMAPP and VMOVP can now run concurrently and still lead to
>> sensible results.
>>
>> Marc Zyngier (3):
>>   irqchip/gic-v4: Always configure affinity on VPE activation
>>   irqchip/gic-v4: Substitute vmovp_lock for a per-VM lock
>>   irqchip/gic-v4: Make sure a VPE is locked when VMAPP is issued
>>
>>  drivers/irqchip/irq-gic-v3-its.c   | 48 ++++++++++++++----------------
>>  include/linux/irqchip/arm-gic-v4.h |  8 +++++
>>  2 files changed, 30 insertions(+), 26 deletions(-)
>>
> I've tested this patch series. It works.
>
> Tested-by: Nianyao Tang <tangnianyao@huawei.com>
>
> .
>
Marc Zyngier July 17, 2024, 9:21 a.m. UTC | #3
On Wed, 17 Jul 2024 09:41:30 +0100,
Tangnianyao <tangnianyao@huawei.com> wrote:
> 
> Hi, Marc
> 
> I ran into another problem while backporting this fix to kernel 5.10.
> 
> Kernel 5.10 does not support guard(), so we replaced it with explicit
> raw_spin_lock/unlock calls. When the guest insmods its NIC drivers, this
> somehow triggers a host CPU power-off. The same test case runs fine on
> kernel 6.6 (both host and guest).

This suggests that you got the locking order wrong, that one or
several CPUs deadlock, and that a watchdog fires.
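
The classic shape of such a bug is an ABBA inversion between the VMOVP and
VMAPP paths, along these lines (purely illustrative, not the backported
code; the lock names are just placeholders):

static void vmovp_path(struct its_vm *vm, struct its_vpe *vpe)
{
	raw_spin_lock(&vpe->vpe_lock);		/* takes A, then B */
	raw_spin_lock(&vm->vmapp_lock);
	/* ... */
	raw_spin_unlock(&vm->vmapp_lock);
	raw_spin_unlock(&vpe->vpe_lock);
}

static void vmapp_path(struct its_vm *vm, struct its_vpe *vpe)
{
	raw_spin_lock(&vm->vmapp_lock);		/* takes B, then A: ABBA */
	raw_spin_lock(&vpe->vpe_lock);
	/* ... */
	raw_spin_unlock(&vpe->vpe_lock);
	raw_spin_unlock(&vm->vmapp_lock);
}

Run the two concurrently and each CPU ends up spinning on the lock the
other one holds, with interrupts off, until a watchdog steps in.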

> Would you fix this on long-term kernel 5.10? Or do you have any suggestions?

Sorry, but I have no intention to maintain such an ancient kernel,
which is why I did not Cc stable on these patches. I have no bandwidth
for this, nor any interest in it.

	M.