diff mbox series

[16/20] xen/arm: mm: Protect Xen page-table update with a spinlock

Message ID 20190422164937.21350-17-julien.grall@arm.com (mailing list archive)
State Superseded
Headers show
Series xen/arm: Clean-up & fixes in boot/mm code | expand

Commit Message

Julien Grall April 22, 2019, 4:49 p.m. UTC
The function create_xen_entries may be concurrently called. So we need
to protect with a spinlock to avoid corruption the page-tables.

Signed-off-by: Julien Grall <julien.grall@arm.com>
---
 xen/arch/arm/mm.c | 6 ++++++
 1 file changed, 6 insertions(+)

Comments

Andrii Anisov May 3, 2019, 3:59 p.m. UTC | #1
Hello Julien,

On 22.04.19 19:49, Julien Grall wrote:
> The function create_xen_entries may be concurrently called. So we need
> to protect with a spinlock to avoid corruption the page-tables.

The question from me is why didn't we step into this issue before?

> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>
> ---
>   xen/arch/arm/mm.c | 6 ++++++
>   1 file changed, 6 insertions(+)
> 
> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> index fa0f41bd07..ecde4e34df 100644
> --- a/xen/arch/arm/mm.c
> +++ b/xen/arch/arm/mm.c
> @@ -969,6 +969,8 @@ enum xenmap_operation {
>       RESERVE
>   };
>   
> +static DEFINE_SPINLOCK(xen_pt_lock);
> +
>   static int create_xen_entries(enum xenmap_operation op,
>                                 unsigned long virt,
>                                 mfn_t mfn,
> @@ -980,6 +982,8 @@ static int create_xen_entries(enum xenmap_operation op,
>       lpae_t pte, *entry;
>       lpae_t *third = NULL;
>   
> +    spin_lock(&xen_pt_lock);
> +
>       for(; addr < addr_end; addr += PAGE_SIZE, mfn = mfn_add(mfn, 1))
>       {
>           entry = &xen_second[second_linear_offset(addr)];
> @@ -1054,6 +1058,8 @@ out:
>        */
>       flush_xen_tlb_range_va(virt, PAGE_SIZE * nr_mfns);
>   
> +    spin_unlock(&xen_pt_lock);
> +
>       return rc;
>   }
>   
>
Julien Grall May 3, 2019, 5:19 p.m. UTC | #2
On 03/05/2019 16:59, Andrii Anisov wrote:
> Hello Julien,

Hi,

> 
> On 22.04.19 19:49, Julien Grall wrote:
>> The function create_xen_entries may be concurrently called. So we need
>> to protect with a spinlock to avoid corruption the page-tables.
> 
> The question from me is why didn't we step into this issue before?

TLDR; because xen page-tables are not that often modified after boot. Yet it is 
still possible to race.

At the moment, create_xen_entries() can only modify the VA range 0 - 2GB. In 
that range, we can modify at runtime the VMAP area. One potential issue is
a vmap issued at the same time.

While the range allocation is protected by a lock (see vm_alloc), the mapping is 
not. So it would be possible to end up modifying the page-table at the same. 
That could blow up if for instance, the second-level entry is invalid as we 
would need to allocate memory (only one can win that race).

In general, it is a saner approach to try to serialize the modifications in the 
page-tables. So you can safely read an entry, check it and then update it.

Cheers,
Andrii Anisov May 6, 2019, 8:20 a.m. UTC | #3
Hello Julien,

On 03.05.19 20:19, Julien Grall wrote:
> TLDR; because xen page-tables are not that often modified after boot. Yet it is still possible to race.
> 
> At the moment, create_xen_entries() can only modify the VA range 0 - 2GB. In that range, we can modify at runtime the VMAP area. One potential issue is
> a vmap issued at the same time.
> 
> While the range allocation is protected by a lock (see vm_alloc), the mapping is not. So it would be possible to end up modifying the page-table at the same. That could blow up if for instance, the second-level entry is invalid as we would need to allocate memory (only one can win that race).

I understand the potential race, but still wondering why didn't we see those issues. Maybe we are too lucky.

> In general, it is a saner approach to try to serialize the modifications in the page-tables. So you can safely read an entry, check it and then update it.

Yet, I think we would stick at these locks for now.
Andrii Anisov May 6, 2019, 8:20 a.m. UTC | #4
On 22.04.19 19:49, Julien Grall wrote:
> The function create_xen_entries may be concurrently called. So we need
> to protect with a spinlock to avoid corruption the page-tables.
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>

Reviewed-by: Andrii Anisov <andrii_anisov@epam.com>
Julien Grall May 6, 2019, 4:54 p.m. UTC | #5
On 5/6/19 9:20 AM, Andrii Anisov wrote:
> Hello Julien,
> 
> On 03.05.19 20:19, Julien Grall wrote:
>> TLDR; because xen page-tables are not that often modified after boot. 
>> Yet it is still possible to race.
>>
>> At the moment, create_xen_entries() can only modify the VA range 0 - 
>> 2GB. In that range, we can modify at runtime the VMAP area. One 
>> potential issue is
>> a vmap issued at the same time.
>>
>> While the range allocation is protected by a lock (see vm_alloc), the 
>> mapping is not. So it would be possible to end up modifying the 
>> page-table at the same. That could blow up if for instance, the 
>> second-level entry is invalid as we would need to allocate memory 
>> (only one can win that race).
> 
> I understand the potential race, but still wondering why didn't we see 
> those issues. Maybe we are too lucky.

As I pointed out above, the VMAP area is not often updated after boot. 
Furthermore, to hit the race, you need to insert 2 entries covered by 
the same unallocated 3rd-level table at the same time. As the 3rd-level 
table are only allocated once and never released, you probably have more 
chance to win at the lottery over hitting the race.

Cheers,
diff mbox series

Patch

diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index fa0f41bd07..ecde4e34df 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -969,6 +969,8 @@  enum xenmap_operation {
     RESERVE
 };
 
+static DEFINE_SPINLOCK(xen_pt_lock);
+
 static int create_xen_entries(enum xenmap_operation op,
                               unsigned long virt,
                               mfn_t mfn,
@@ -980,6 +982,8 @@  static int create_xen_entries(enum xenmap_operation op,
     lpae_t pte, *entry;
     lpae_t *third = NULL;
 
+    spin_lock(&xen_pt_lock);
+
     for(; addr < addr_end; addr += PAGE_SIZE, mfn = mfn_add(mfn, 1))
     {
         entry = &xen_second[second_linear_offset(addr)];
@@ -1054,6 +1058,8 @@  out:
      */
     flush_xen_tlb_range_va(virt, PAGE_SIZE * nr_mfns);
 
+    spin_unlock(&xen_pt_lock);
+
     return rc;
 }