diff mbox

xen: fix ring resize of /dev/evtchn

Message ID 572A0F0C02000078000E88B6@prv-mh.provo.novell.com (mailing list archive)
State New, archived
Headers show

Commit Message

Jan Beulich May 4, 2016, 1:02 p.m. UTC
The copying of ring data was wrong for two cases: For a full ring
nothing got copied at all (as in that case the canonicalized producer
and consumer indexes are identical). And in case one or both of the
canonicalized (after the resize) indexes would point into the second
half of the buffer, the copied data ended up in the wrong (free) part
of the new buffer. In both cases uninitialized data would get passed
back to the caller.

Fix this by simply copying the old ring contents twice: Once to the
low half of the new buffer, and a second time to the high half.

This addresses the inability to boot a HVM guest with 64 or more
vCPU-s, which was reported by Konrad Rzeszutek Wilk
<konrad.wilk@oracle.com>.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
 drivers/xen/evtchn.c |   22 ++++++----------------
 1 file changed, 6 insertions(+), 16 deletions(-)

Comments

David Vrabel May 4, 2016, 1:30 p.m. UTC | #1
On 04/05/16 14:02, Jan Beulich wrote:
> The copying of ring data was wrong for two cases: For a full ring
> nothing got copied at all (as in that case the canonicalized producer
> and consumer indexes are identical). And in case one or both of the
> canonicalized (after the resize) indexes would point into the second
> half of the buffer, the copied data ended up in the wrong (free) part
> of the new buffer. In both cases uninitialized data would get passed
> back to the caller.
> 
> Fix this by simply copying the old ring contents twice: Once to the
> low half of the new buffer, and a second time to the high half.
> 
> This addresses the inability to boot a HVM guest with 64 or more
> vCPU-s, which was reported by Konrad Rzeszutek Wilk
> <konrad.wilk@oracle.com>.
[...]

Can you include the commit that introduced this regression and which
kernel versions it affects as this is a stable candidate.

> @@ -344,22 +343,13 @@ static int evtchn_resize_ring(struct per
>  	spin_lock_irq(&u->ring_prod_lock);
>  
>  	/*
> -	 * Copy the old ring contents to the new ring.
> -	 *
> -	 * If the ring contents crosses the end of the current ring,
> -	 * it needs to be copied in two chunks.
> -	 *
> -	 * +---------+    +------------------+
> -	 * |34567  12| -> |       1234567    |
> -	 * +-----p-c-+    +------------------+
> +	 * Copy the old ring contents to the new ring. To take care of
> +	 * wrapping, a full ring, and the new canonicalized index pointing
> +	 * into the second half, simply copy the old contents twice.

Could you keep the ascii art?

e.g.,

 * +---------+    +------------------+
 * |34567  12| -> |34567  1234567  12|
 * +-----p-c-+    +-------c------p---+

So it is obvious that the double copy does the right thing.

Thanks.

David
David Vrabel May 4, 2016, 3:34 p.m. UTC | #2
On 04/05/16 14:30, David Vrabel wrote:
> On 04/05/16 14:02, Jan Beulich wrote:
>> The copying of ring data was wrong for two cases: For a full ring
>> nothing got copied at all (as in that case the canonicalized producer
>> and consumer indexes are identical). And in case one or both of the
>> canonicalized (after the resize) indexes would point into the second
>> half of the buffer, the copied data ended up in the wrong (free) part
>> of the new buffer. In both cases uninitialized data would get passed
>> back to the caller.
>>
>> Fix this by simply copying the old ring contents twice: Once to the
>> low half of the new buffer, and a second time to the high half.
>>
>> This addresses the inability to boot a HVM guest with 64 or more
>> vCPU-s, which was reported by Konrad Rzeszutek Wilk
>> <konrad.wilk@oracle.com>.
> [...]
> 
> Can you include the commit that introduced this regression and which
> kernel versions it affects as this is a stable candidate.
> 
>> @@ -344,22 +343,13 @@ static int evtchn_resize_ring(struct per
>>  	spin_lock_irq(&u->ring_prod_lock);
>>  
>>  	/*
>> -	 * Copy the old ring contents to the new ring.
>> -	 *
>> -	 * If the ring contents crosses the end of the current ring,
>> -	 * it needs to be copied in two chunks.
>> -	 *
>> -	 * +---------+    +------------------+
>> -	 * |34567  12| -> |       1234567    |
>> -	 * +-----p-c-+    +------------------+
>> +	 * Copy the old ring contents to the new ring. To take care of
>> +	 * wrapping, a full ring, and the new canonicalized index pointing
>> +	 * into the second half, simply copy the old contents twice.
> 
> Could you keep the ascii art?
> 
> e.g.,
> 
>  * +---------+    +------------------+
>  * |34567  12| -> |34567  1234567  12|
>  * +-----p-c-+    +-------c------p---+
> 
> So it is obvious that the double copy does the right thing.

Never mind, I wanted to send a pull request so I've fixed this up myself.

David
Jan Beulich May 4, 2016, 3:42 p.m. UTC | #3
>>> On 04.05.16 at 17:34, <david.vrabel@citrix.com> wrote:
> On 04/05/16 14:30, David Vrabel wrote:
>> On 04/05/16 14:02, Jan Beulich wrote:
>>> The copying of ring data was wrong for two cases: For a full ring
>>> nothing got copied at all (as in that case the canonicalized producer
>>> and consumer indexes are identical). And in case one or both of the
>>> canonicalized (after the resize) indexes would point into the second
>>> half of the buffer, the copied data ended up in the wrong (free) part
>>> of the new buffer. In both cases uninitialized data would get passed
>>> back to the caller.
>>>
>>> Fix this by simply copying the old ring contents twice: Once to the
>>> low half of the new buffer, and a second time to the high half.
>>>
>>> This addresses the inability to boot a HVM guest with 64 or more
>>> vCPU-s, which was reported by Konrad Rzeszutek Wilk
>>> <konrad.wilk@oracle.com>.
>> [...]
>> 
>> Can you include the commit that introduced this regression and which
>> kernel versions it affects as this is a stable candidate.
>> 
>>> @@ -344,22 +343,13 @@ static int evtchn_resize_ring(struct per
>>>  	spin_lock_irq(&u->ring_prod_lock);
>>>  
>>>  	/*
>>> -	 * Copy the old ring contents to the new ring.
>>> -	 *
>>> -	 * If the ring contents crosses the end of the current ring,
>>> -	 * it needs to be copied in two chunks.
>>> -	 *
>>> -	 * +---------+    +------------------+
>>> -	 * |34567  12| -> |       1234567    |
>>> -	 * +-----p-c-+    +------------------+
>>> +	 * Copy the old ring contents to the new ring. To take care of
>>> +	 * wrapping, a full ring, and the new canonicalized index pointing
>>> +	 * into the second half, simply copy the old contents twice.
>> 
>> Could you keep the ascii art?
>> 
>> e.g.,
>> 
>>  * +---------+    +------------------+
>>  * |34567  12| -> |34567  1234567  12|
>>  * +-----p-c-+    +-------c------p---+
>> 
>> So it is obvious that the double copy does the right thing.
> 
> Never mind, I wanted to send a pull request so I've fixed this up myself.

Oh, sorry, I had it ready but didn't want to send a v2 a few minutes
after the v1. Thanks for taking care of it!

Jan
diff mbox

Patch

--- 4.6-rc6/drivers/xen/evtchn.c
+++ 4.6-rc6-xen-dev-evtchn-resize/drivers/xen/evtchn.c
@@ -316,7 +316,6 @@  static int evtchn_resize_ring(struct per
 {
 	unsigned int new_size;
 	evtchn_port_t *new_ring, *old_ring;
-	unsigned int p, c;
 
 	/*
 	 * Ensure the ring is large enough to capture all possible
@@ -344,22 +343,13 @@  static int evtchn_resize_ring(struct per
 	spin_lock_irq(&u->ring_prod_lock);
 
 	/*
-	 * Copy the old ring contents to the new ring.
-	 *
-	 * If the ring contents crosses the end of the current ring,
-	 * it needs to be copied in two chunks.
-	 *
-	 * +---------+    +------------------+
-	 * |34567  12| -> |       1234567    |
-	 * +-----p-c-+    +------------------+
+	 * Copy the old ring contents to the new ring. To take care of
+	 * wrapping, a full ring, and the new canonicalized index pointing
+	 * into the second half, simply copy the old contents twice.
 	 */
-	p = evtchn_ring_offset(u, u->ring_prod);
-	c = evtchn_ring_offset(u, u->ring_cons);
-	if (p < c) {
-		memcpy(new_ring + c, u->ring + c, (u->ring_size - c) * sizeof(*u->ring));
-		memcpy(new_ring + u->ring_size, u->ring, p * sizeof(*u->ring));
-	} else
-		memcpy(new_ring + c, u->ring + c, (p - c) * sizeof(*u->ring));
+	memcpy(new_ring, old_ring, u->ring_size * sizeof(*u->ring));
+	memcpy(new_ring + u->ring_size, old_ring,
+	       u->ring_size * sizeof(*u->ring));
 
 	u->ring = new_ring;
 	u->ring_size = new_size;