Message ID: 20170302225515.GG23726@toto (mailing list archive)
State: New, archived
On Thu, 2 Mar 2017, Edgar E. Iglesias wrote:
> On Thu, Mar 02, 2017 at 02:39:55PM -0800, Stefano Stabellini wrote:
> > On Thu, 2 Mar 2017, Julien Grall wrote:
> > > Hi Stefano,
> > >
> > > On 02/03/17 19:12, Stefano Stabellini wrote:
> > > > On Thu, 2 Mar 2017, Julien Grall wrote:
> > > > > On 02/03/17 08:53, Edgar E. Iglesias wrote:
> > > > > > On Thu, Mar 02, 2017 at 09:38:37AM +0100, Edgar E. Iglesias wrote:
> > > > > > > On Wed, Mar 01, 2017 at 05:05:21PM -0800, Stefano Stabellini wrote:
> > > > Julien, from looking at the two diffs, this is simpler and nicer, but if
> > > > you look at xen/include/asm-arm/page.h, my patch made
> > > > clean_dcache_va_range consistent with invalidate_dcache_va_range. For
> > > > consistency, I would prefer to deal with the two functions the same way.
> > > > Although it is not a spec requirement, I also think that it is a good
> > > > idea to issue cache flushes from cacheline aligned addresses, like
> > > > invalidate_dcache_va_range does and Linux does, to make more obvious
> > > > what is going on.
> > >
> > > invalidate_dcache_va_range is split because the cache instruction differs for the
> > > start and end if unaligned. For them you want to use clean & invalidate rather
> > > than invalidate.
> > >
> > > If you look at the implementation of other cache helpers in Linux (see
> > > dcache_by_line_op in arch/arm64/include/asm/assembler.h), they will only align
> > > start & end.
> >
> > I don't think so, unless I am reading dcache_by_line_op wrong.
> >
> >
> > > Also, invalidate_dcache_va_range is using modulo, which I would rather avoid.
> > > The modulo in this case will not be optimized by the compiler because
> > > cacheline_bytes is not a constant.
> >
> > That is a good point. What if I replace the modulo op with
> >
> > p & (cacheline_bytes - 1)
> >
> > in invalidate_dcache_va_range, then add the similar code to
> > clean_dcache_va_range and clean_and_invalidate_dcache_va_range?
>
>
> Yeah, if there was some kind of generic ALIGN or ROUND_DOWN macro we could do:
>
> --- a/xen/include/asm-arm/page.h
> +++ b/xen/include/asm-arm/page.h
> @@ -325,7 +325,9 @@ static inline int clean_dcache_va_range(const void *p, unsigned long size)
>  {
>      const void *end;
>      dsb(sy);           /* So the CPU issues all writes to the range */
> -    for ( end = p + size; p < end; p += cacheline_bytes )
> +
> +    p = (void *)ALIGN((uintptr_t)p, cacheline_bytes);
> +    end = (void *)ROUNDUP((uintptr_t)p + size, cacheline_bytes);

Even simpler:

end = p + size;
p = (void *)ALIGN((uintptr_t)p, cacheline_bytes);

> +    for ( ; p < end; p += cacheline_bytes )
>          asm volatile (__clean_dcache_one(0) : : "r" (p));
>      dsb(sy);           /* So we know the flushes happen before continuing */
>      /* ARM callers assume that dcache_* functions cannot fail. */
>
> I think that would achieve the same result as your patch Stefano?

Yes, indeed, that's better.
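For readers following the arithmetic: the replacement Stefano proposes relies on the cache line size being a power of two, in which case the modulo and the bit mask select the same intra-line offset. A minimal standalone sketch (plain C, not Xen code; the 64-byte line size below is only an example, the real value is probed from the hardware at boot):

/* standalone illustration, not Xen code: for a power-of-two line size,
 * the modulo and the bit mask compute the same intra-line offset, so
 * "p % cacheline_bytes" can be replaced by "p & (cacheline_bytes - 1)".
 */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    const uintptr_t cacheline_bytes = 64;   /* example only; not the probed value */

    for ( uintptr_t p = 0x1000; p < 0x1100; p++ )
        assert((p % cacheline_bytes) == (p & (cacheline_bytes - 1)));

    printf("modulo and mask agree for all tested addresses\n");
    return 0;
}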
On 02/03/2017 23:07, Stefano Stabellini wrote:
> On Thu, 2 Mar 2017, Edgar E. Iglesias wrote:
>> On Thu, Mar 02, 2017 at 02:39:55PM -0800, Stefano Stabellini wrote:
>>> On Thu, 2 Mar 2017, Julien Grall wrote:
>>>> Hi Stefano,
>>>>
>>>> On 02/03/17 19:12, Stefano Stabellini wrote:
>>>>> On Thu, 2 Mar 2017, Julien Grall wrote:
>>>>>> On 02/03/17 08:53, Edgar E. Iglesias wrote:
>>>>>>> On Thu, Mar 02, 2017 at 09:38:37AM +0100, Edgar E. Iglesias wrote:
>>>>>>>> On Wed, Mar 01, 2017 at 05:05:21PM -0800, Stefano Stabellini wrote:
>>>>> Julien, from looking at the two diffs, this is simpler and nicer, but if
>>>>> you look at xen/include/asm-arm/page.h, my patch made
>>>>> clean_dcache_va_range consistent with invalidate_dcache_va_range. For
>>>>> consistency, I would prefer to deal with the two functions the same way.
>>>>> Although it is not a spec requirement, I also think that it is a good
>>>>> idea to issue cache flushes from cacheline aligned addresses, like
>>>>> invalidate_dcache_va_range does and Linux does, to make more obvious
>>>>> what is going on.
>>>>
>>>> invalidate_dcache_va_range is split because the cache instruction differs for the
>>>> start and end if unaligned. For them you want to use clean & invalidate rather
>>>> than invalidate.
>>>>
>>>> If you look at the implementation of other cache helpers in Linux (see
>>>> dcache_by_line_op in arch/arm64/include/asm/assembler.h), they will only align
>>>> start & end.
>>>
>>> I don't think so, unless I am reading dcache_by_line_op wrong.
>>>
>>>
>>>> Also, invalidate_dcache_va_range is using modulo, which I would rather avoid.
>>>> The modulo in this case will not be optimized by the compiler because
>>>> cacheline_bytes is not a constant.
>>>
>>> That is a good point. What if I replace the modulo op with
>>>
>>> p & (cacheline_bytes - 1)
>>>
>>> in invalidate_dcache_va_range, then add the similar code to
>>> clean_dcache_va_range and clean_and_invalidate_dcache_va_range?
>>
>>
>> Yeah, if there was some kind of generic ALIGN or ROUND_DOWN macro we could do:
>>
>> --- a/xen/include/asm-arm/page.h
>> +++ b/xen/include/asm-arm/page.h
>> @@ -325,7 +325,9 @@ static inline int clean_dcache_va_range(const void *p, unsigned long size)
>>  {
>>      const void *end;
>>      dsb(sy);           /* So the CPU issues all writes to the range */
>> -    for ( end = p + size; p < end; p += cacheline_bytes )
>> +
>> +    p = (void *)ALIGN((uintptr_t)p, cacheline_bytes);
>> +    end = (void *)ROUNDUP((uintptr_t)p + size, cacheline_bytes);
>
> Even simpler:
>
> end = p + size;
> p = (void *)ALIGN((uintptr_t)p, cacheline_bytes);

We don't have any ALIGN macro in Xen, and the way we use the term "align" in
Xen is very similar to ROUNDUP. However, a simple

    p = (void *)((uintptr_t)p & ~(cacheline_bytes - 1));

should work here.

Cheers,
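To make the effect of Julien's round-down concrete, here is a hedged sketch of the resulting loop shape. It is standalone C working on integer addresses, not the Xen function: clean_line() is a made-up stand-in for the cache-maintenance instruction issued by __clean_dcache_one(), and the 64-byte line size is again just an example.

/* standalone sketch of the loop shape under discussion; the real
 * function issues one cache-maintenance instruction per line instead
 * of printing.  clean_line() is an illustrative stand-in.
 */
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

static uintptr_t cacheline_bytes = 64;      /* example value only */

static void clean_line(uintptr_t va)        /* stand-in for __clean_dcache_one() */
{
    printf("clean line at 0x%" PRIxPTR "\n", va);
}

static void clean_dcache_va_range_sketch(uintptr_t p, unsigned long size)
{
    uintptr_t end = p + size;

    /* Round the start down to a cache-line boundary, as suggested above.
     * The end needs no rounding: with an aligned p and the p < end test,
     * the last (possibly partial) line is still reached. */
    p &= ~(cacheline_bytes - 1);

    for ( ; p < end; p += cacheline_bytes )
        clean_line(p);
}

int main(void)
{
    /* An unaligned 100-byte range starting at 0x100030: every cache line
     * overlapping [0x100030, 0x100094) is visited exactly once. */
    clean_dcache_va_range_sketch(0x100030, 100);
    return 0;
}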
--- a/xen/include/asm-arm/page.h
+++ b/xen/include/asm-arm/page.h
@@ -325,7 +325,9 @@ static inline int clean_dcache_va_range(const void *p, unsigned long size)
 {
     const void *end;
     dsb(sy);           /* So the CPU issues all writes to the range */
-    for ( end = p + size; p < end; p += cacheline_bytes )
+
+    p = (void *)ALIGN((uintptr_t)p, cacheline_bytes);
+    end = (void *)ROUNDUP((uintptr_t)p + size, cacheline_bytes);
+    for ( ; p < end; p += cacheline_bytes )
         asm volatile (__clean_dcache_one(0) : : "r" (p));
     dsb(sy);           /* So we know the flushes happen before continuing */
     /* ARM callers assume that dcache_* functions cannot fail. */
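The diff above leans on ALIGN (round down) and ROUNDUP (round up) helpers; as Julien notes, Xen has no ALIGN macro. For a power-of-two alignment both operations reduce to the usual masking idioms, sketched below with illustrative names (these are not definitions taken from the Xen tree):

/* Typical power-of-two round-down / round-up helpers, shown only to make
 * the intent of the ALIGN/ROUNDUP lines in the diff explicit.  The macro
 * names are illustrative, not existing Xen definitions.
 */
#include <assert.h>
#include <stdint.h>

#define ROUND_DOWN_POW2(x, a)  ((x) & ~((uintptr_t)(a) - 1))
#define ROUND_UP_POW2(x, a)    (((x) + ((uintptr_t)(a) - 1)) & ~((uintptr_t)(a) - 1))

int main(void)
{
    const uintptr_t line = 64;

    assert(ROUND_DOWN_POW2((uintptr_t)0x1003, line) == 0x1000);
    assert(ROUND_UP_POW2((uintptr_t)0x1003, line)   == 0x1040);
    assert(ROUND_UP_POW2((uintptr_t)0x1040, line)   == 0x1040); /* already aligned */
    return 0;
}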