diff mbox

[BUG/RFC] vhost: net: big endian viring access despite virtio 1

Message ID a08f491a-1320-607e-f450-d32ce2c9e2f5@linux.vnet.ibm.com (mailing list archive)
State New, archived
Headers show

Commit Message

Halil Pasic Jan. 26, 2017, 5:39 p.m. UTC
Hi!

Recently I have been investigating some strange migration problems on
s390x.

It turned out under certain circumstances vhost_net corrupts avail.idx by
using wrong endianness.

I managed to track the problem down (I'm pretty sure). It boils down to
the following.

When stopping vhost userspace (QEMU) calls vhost_net_set_backend with
the fd argument set to -1, this leads to is_le being reset in
vhost_vq_init_access.  On a BE system resetting to legacy means resetting
to BE. Usually this is not a problem, but in the case when oldubufs is not
zero (which is not likely if no network stress applied) it is a problem.
That code path needs to write avail.idx, and ends up using wrong
endianness when doing that (but only on a BE system).

That is the story in prose, now let's see the corresponding code annotated
with some comments.

from drivers/vhost/net.c:
static long vhost_net_set_backend(struct vhost_net *n, unsigned index, int fd)
{
/* [..] some not too interesting stuff  */
        sock = get_socket(fd);
/* fd == -1 --> sock == NULL */
        if (IS_ERR(sock)) {
                r = PTR_ERR(sock);
                goto err_vq;
        }

        /* start polling new socket */
        oldsock = vq->private_data;
        if (sock != oldsock) {
                ubufs = vhost_net_ubuf_alloc(vq,
                                             sock && vhost_sock_zcopy(sock));
                if (IS_ERR(ubufs)) {
                        r = PTR_ERR(ubufs);
                        goto err_ubufs;
                }

                vhost_net_disable_vq(n, vq);
==>             vq->private_data = sock; 
/* now vq->private_data is NULL */
==>             r = vhost_vq_init_access(vq);
                if (r)
                        goto err_used;
/* vq endianness has been reset to BE on s390 */
                r = vhost_net_enable_vq(n, vq);
                if (r)
                        goto err_used;

==>             oldubufs = nvq->ubufs;
/* here oldubufs might become != 0 */               
                nvq->ubufs = ubufs;

                n->tx_packets = 0;
                n->tx_zcopy_err = 0;
                n->tx_flush = false;
        }
        mutex_unlock(&vq->mutex);

        if (oldubufs) {
                vhost_net_ubuf_put_wait_and_free(oldubufs);
                mutex_lock(&vq->mutex);
==>             vhost_zerocopy_signal_used(n, vq);
/* tries to update virtqueue structures; endianness is BE on s390
 * now (but should be LE for virtio-1) */
                mutex_unlock(&vq->mutex);
        }
/*[..] rest of the function */
}


from drivers/vhost/vhost.c:

int vhost_vq_init_access(struct vhost_virtqueue *vq)
{
        __virtio16 last_used_idx;
        int r;
        bool is_le = vq->is_le;

        if (!vq->private_data) {
==>             vhost_reset_is_le(vq);
/* resets to native endianness and returns */
                return 0;
        }

==>      vhost_init_is_le(vq);
/* here we init is_le */

        r = vhost_update_used_flags(vq);
        if (r)
                goto err;
        vq->signalled_used_valid = false;
        if (!vq->iotlb &&
            !access_ok(VERIFY_READ, &vq->used->idx, sizeof vq->used->idx)) {
                r = -EFAULT;
                goto err;
        }
        r = vhost_get_user(vq, last_used_idx, &vq->used->idx);
        if (r) {
                vq_err(vq, "Can't access used idx at %p\n",
                       &vq->used->idx);
                goto err;
        }
        vq->last_used_idx = vhost16_to_cpu(vq, last_used_idx);
        return 0;

err:
        vq->is_le = is_le;
        return r;
}

AFAIU this can be fixed very simply by omitting the reset. Below the
patch. I'm not sure though, the endianness handling ain't simple in
vhost. Am I going in the right direction?

We have a test (on s390x only) running for a couple of hours now and so
far so good but it's a bit early to announce it is tested  for s390x.
If the patch is reasonable, I'm intend to do a version with proper
attribution and stuff.

By the way the reset was first introduced by 
https://lkml.org/lkml/2015/4/10/226 (dug it up in the hope that reasons
for the reset were discussed -- but no enlightenment for me).

Finally I would like to credit Dave Gilbert for hinting that the
unreasonable avail.idx may be due to an endianness problem and Michael A.
Tebolt for reporting the bug and testing.

-------------------------8<--------------
From b26e2bbdc03832a0204ee2b42967a1b49a277dc8 Mon Sep 17 00:00:00 2001
From: Halil Pasic <pasic@linux.vnet.ibm.com>
Date: Thu, 26 Jan 2017 00:06:15 +0100
Subject: [PATCH] vhost: remove useless/dangerous reset of is_le

The reset of is_le does no good, but it contributes its fair share to a
bug in vhost_net, which occurs if we have some oldubufs when stopping and
setting a fd = -1 as a backend. Instead of doing something convoluted in
vhost_net, let's just get rid of the reset.

Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Fixes: commit 2751c9882b94 
---
 drivers/vhost/vhost.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

Comments

Michael S. Tsirkin Jan. 26, 2017, 7:20 p.m. UTC | #1
On Thu, Jan 26, 2017 at 06:39:14PM +0100, Halil Pasic wrote:
> 
> Hi!
> 
> Recently I have been investigating some strange migration problems on
> s390x.
> 
> It turned out under certain circumstances vhost_net corrupts avail.idx by
> using wrong endianness.
> 
> I managed to track the problem down (I'm pretty sure). It boils down to
> the following.
> 
> When stopping vhost userspace (QEMU) calls vhost_net_set_backend with
> the fd argument set to -1, this leads to is_le being reset in
> vhost_vq_init_access.  On a BE system resetting to legacy means resetting
> to BE. Usually this is not a problem, but in the case when oldubufs is not
> zero (which is not likely if no network stress applied) it is a problem.
> That code path needs to write avail.idx, and ends up using wrong
> endianness when doing that (but only on a BE system).
> 
> That is the story in prose, now let's see the corresponding code annotated
> with some comments.
> 
> from drivers/vhost/net.c:
> static long vhost_net_set_backend(struct vhost_net *n, unsigned index, int fd)
> {
> /* [..] some not too interesting stuff  */
>         sock = get_socket(fd);
> /* fd == -1 --> sock == NULL */
>         if (IS_ERR(sock)) {
>                 r = PTR_ERR(sock);
>                 goto err_vq;
>         }
> 
>         /* start polling new socket */
>         oldsock = vq->private_data;
>         if (sock != oldsock) {
>                 ubufs = vhost_net_ubuf_alloc(vq,
>                                              sock && vhost_sock_zcopy(sock));
>                 if (IS_ERR(ubufs)) {
>                         r = PTR_ERR(ubufs);
>                         goto err_ubufs;
>                 }
> 
>                 vhost_net_disable_vq(n, vq);
> ==>             vq->private_data = sock; 
> /* now vq->private_data is NULL */
> ==>             r = vhost_vq_init_access(vq);
>                 if (r)
>                         goto err_used;
> /* vq endianness has been reset to BE on s390 */
>                 r = vhost_net_enable_vq(n, vq);
>                 if (r)
>                         goto err_used;
> 
> ==>             oldubufs = nvq->ubufs;
> /* here oldubufs might become != 0 */               
>                 nvq->ubufs = ubufs;
> 
>                 n->tx_packets = 0;
>                 n->tx_zcopy_err = 0;
>                 n->tx_flush = false;
>         }
>         mutex_unlock(&vq->mutex);
> 
>         if (oldubufs) {
>                 vhost_net_ubuf_put_wait_and_free(oldubufs);
>                 mutex_lock(&vq->mutex);
> ==>             vhost_zerocopy_signal_used(n, vq);
> /* tries to update virtqueue structures; endianness is BE on s390
>  * now (but should be LE for virtio-1) */
>                 mutex_unlock(&vq->mutex);
>         }
> /*[..] rest of the function */
> }

> 
> from drivers/vhost/vhost.c:
> 
> int vhost_vq_init_access(struct vhost_virtqueue *vq)
> {
>         __virtio16 last_used_idx;
>         int r;
>         bool is_le = vq->is_le;
> 
>         if (!vq->private_data) {
> ==>             vhost_reset_is_le(vq);
> /* resets to native endianness and returns */
>                 return 0;
>         }
> 
> ==>      vhost_init_is_le(vq);
> /* here we init is_le */
> 
>         r = vhost_update_used_flags(vq);
>         if (r)
>                 goto err;
>         vq->signalled_used_valid = false;
>         if (!vq->iotlb &&
>             !access_ok(VERIFY_READ, &vq->used->idx, sizeof vq->used->idx)) {
>                 r = -EFAULT;
>                 goto err;
>         }
>         r = vhost_get_user(vq, last_used_idx, &vq->used->idx);
>         if (r) {
>                 vq_err(vq, "Can't access used idx at %p\n",
>                        &vq->used->idx);
>                 goto err;
>         }
>         vq->last_used_idx = vhost16_to_cpu(vq, last_used_idx);
>         return 0;
> 
> err:
>         vq->is_le = is_le;
>         return r;
> }
> 
> AFAIU this can be fixed very simply by omitting the reset. Below the
> patch. I'm not sure though, the endianness handling ain't simple in
> vhost. Am I going in the right direction?
> 
> We have a test (on s390x only) running for a couple of hours now and so
> far so good but it's a bit early to announce it is tested  for s390x.
> If the patch is reasonable, I'm intend to do a version with proper
> attribution and stuff.
> 
> By the way the reset was first introduced by 
> https://lkml.org/lkml/2015/4/10/226 (dug it up in the hope that reasons
> for the reset were discussed -- but no enlightenment for me).
> 
> Finally I would like to credit Dave Gilbert for hinting that the
> unreasonable avail.idx may be due to an endianness problem and Michael A.
> Tebolt for reporting the bug and testing.
> 
> -------------------------8<--------------
> >From b26e2bbdc03832a0204ee2b42967a1b49a277dc8 Mon Sep 17 00:00:00 2001
> From: Halil Pasic <pasic@linux.vnet.ibm.com>
> Date: Thu, 26 Jan 2017 00:06:15 +0100
> Subject: [PATCH] vhost: remove useless/dangerous reset of is_le
> 
> The reset of is_le does no good, but it contributes its fair share to a
> bug in vhost_net, which occurs if we have some oldubufs when stopping and
> setting a fd = -1 as a backend. Instead of doing something convoluted in
> vhost_net, let's just get rid of the reset.
> 
> Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
> Fixes: commit 2751c9882b94 
> ---
>  drivers/vhost/vhost.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index d643260..08072a2 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -1714,10 +1714,8 @@ int vhost_vq_init_access(struct vhost_virtqueue *vq)
>         int r;
>         bool is_le = vq->is_le;
> 
> -       if (!vq->private_data) {
> -               vhost_reset_is_le(vq);
> +       if (!vq->private_data)
>                 return 0;
> -       }
> 
>         vhost_init_is_le(vq);


I think you do need to reset it, just maybe within vhost_init_is_le.

        if (vhost_has_feature(vq, VIRTIO_F_VERSION_1))
                vq->is_le = true;
	else
		vhost_reset_is_le(vq);



> -- 
> 2.8.4
Halil Pasic Jan. 27, 2017, 12:24 p.m. UTC | #2
On 01/26/2017 08:20 PM, Michael S. Tsirkin wrote:
> On Thu, Jan 26, 2017 at 06:39:14PM +0100, Halil Pasic wrote:
>>
>> Hi!
>>
>> Recently I have been investigating some strange migration problems on
>> s390x.
>>
>> It turned out under certain circumstances vhost_net corrupts avail.idx by
>> using wrong endianness.

[..]

>> -------------------------8<--------------
>> >From b26e2bbdc03832a0204ee2b42967a1b49a277dc8 Mon Sep 17 00:00:00 2001
>> From: Halil Pasic <pasic@linux.vnet.ibm.com>
>> Date: Thu, 26 Jan 2017 00:06:15 +0100
>> Subject: [PATCH] vhost: remove useless/dangerous reset of is_le
>>
>> The reset of is_le does no good, but it contributes its fair share to a
>> bug in vhost_net, which occurs if we have some oldubufs when stopping and
>> setting a fd = -1 as a backend. Instead of doing something convoluted in
>> vhost_net, let's just get rid of the reset.
>>
>> Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
>> Fixes: commit 2751c9882b94 
>> ---
>>  drivers/vhost/vhost.c | 4 +---
>>  1 file changed, 1 insertion(+), 3 deletions(-)
>>
>> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
>> index d643260..08072a2 100644
>> --- a/drivers/vhost/vhost.c
>> +++ b/drivers/vhost/vhost.c
>> @@ -1714,10 +1714,8 @@ int vhost_vq_init_access(struct vhost_virtqueue *vq)
>>         int r;
>>         bool is_le = vq->is_le;
>>
>> -       if (!vq->private_data) {
>> -               vhost_reset_is_le(vq);
>> +       if (!vq->private_data)
>>                 return 0;
>> -       }
>>
>>         vhost_init_is_le(vq);
> 
> 
> I think you do need to reset it, just maybe within vhost_init_is_le.
> 
>         if (vhost_has_feature(vq, VIRTIO_F_VERSION_1))
>                 vq->is_le = true;
> 	else
> 		vhost_reset_is_le(vq);
> 
> 

That is a very good point! I have overlooked that while the 
CONFIG_VHOST_CROSS_ENDIAN_LEGACY variant

static void vhost_init_is_le(struct vhost_virtqueue *vq)
{
        /* Note for legacy virtio: user_be is initialized at reset time
         * according to the host endianness. If userspace does not set an
         * explicit endianness, the default behavior is native endian, as
         * expected by legacy virtio.
         */
        vq->is_le = vhost_has_feature(vq, VIRTIO_F_VERSION_1) || !vq->user_be;
}

is fine the other variant 

static void vhost_init_is_le(struct vhost_virtqueue *vq)
{
        if (vhost_has_feature(vq, VIRTIO_F_VERSION_1))
                vq->is_le = true;
}
is a very strange initializer (makes assumptions about the state
to be initialized).

I agree, setting native endianness there sounds very reasonable.

I have a question regarding readability. IMHO the relationship
of reset_is_le and int_is_le is a bit confusing, and I'm afraid
it could become even more confusing with using reset in one of
the init_is_le's.

How about we do the following?

static void vhost_init_is_le(struct vhost_virtqueue *vq)
{
        if (vhost_has_feature(vq, VIRTIO_F_VERSION_1))
                vq->is_le = true;
+       else
+               vq->is_le = virtio_legacy_is_little_endian();

}

static void vhost_reset_is_le(struct vhost_virtqueue *vq)
{
-       vq->is_le = virtio_legacy_is_little_endian();
+       vhost_init_is_le(vq);
}

That way we would have correct endianness both after reset
and after init, I think :).

Thank you very much!

Halil
Greg Kurz Jan. 27, 2017, 1:37 p.m. UTC | #3
On Fri, 27 Jan 2017 13:24:13 +0100
Halil Pasic <pasic@linux.vnet.ibm.com> wrote:

> On 01/26/2017 08:20 PM, Michael S. Tsirkin wrote:
> > On Thu, Jan 26, 2017 at 06:39:14PM +0100, Halil Pasic wrote:  
> >>
> >> Hi!
> >>
> >> Recently I have been investigating some strange migration problems on
> >> s390x.
> >>
> >> It turned out under certain circumstances vhost_net corrupts avail.idx by
> >> using wrong endianness.  
> 
> [..]
> 
> >> -------------------------8<--------------  
> >> >From b26e2bbdc03832a0204ee2b42967a1b49a277dc8 Mon Sep 17 00:00:00 2001  
> >> From: Halil Pasic <pasic@linux.vnet.ibm.com>
> >> Date: Thu, 26 Jan 2017 00:06:15 +0100
> >> Subject: [PATCH] vhost: remove useless/dangerous reset of is_le
> >>
> >> The reset of is_le does no good, but it contributes its fair share to a
> >> bug in vhost_net, which occurs if we have some oldubufs when stopping and
> >> setting a fd = -1 as a backend. Instead of doing something convoluted in
> >> vhost_net, let's just get rid of the reset.
> >>
> >> Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
> >> Fixes: commit 2751c9882b94 
> >> ---
> >>  drivers/vhost/vhost.c | 4 +---
> >>  1 file changed, 1 insertion(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> >> index d643260..08072a2 100644
> >> --- a/drivers/vhost/vhost.c
> >> +++ b/drivers/vhost/vhost.c
> >> @@ -1714,10 +1714,8 @@ int vhost_vq_init_access(struct vhost_virtqueue *vq)
> >>         int r;
> >>         bool is_le = vq->is_le;
> >>
> >> -       if (!vq->private_data) {
> >> -               vhost_reset_is_le(vq);
> >> +       if (!vq->private_data)
> >>                 return 0;
> >> -       }
> >>
> >>         vhost_init_is_le(vq);  
> > 
> > 
> > I think you do need to reset it, just maybe within vhost_init_is_le.
> > 
> >         if (vhost_has_feature(vq, VIRTIO_F_VERSION_1))
> >                 vq->is_le = true;
> > 	else
> > 		vhost_reset_is_le(vq);
> > 
> >   
> 
> That is a very good point! I have overlooked that while the 
> CONFIG_VHOST_CROSS_ENDIAN_LEGACY variant
> 
> static void vhost_init_is_le(struct vhost_virtqueue *vq)
> {
>         /* Note for legacy virtio: user_be is initialized at reset time
>          * according to the host endianness. If userspace does not set an
>          * explicit endianness, the default behavior is native endian, as
>          * expected by legacy virtio.
>          */
>         vq->is_le = vhost_has_feature(vq, VIRTIO_F_VERSION_1) || !vq->user_be;
> }
> 
> is fine the other variant 
> 
> static void vhost_init_is_le(struct vhost_virtqueue *vq)
> {
>         if (vhost_has_feature(vq, VIRTIO_F_VERSION_1))
>                 vq->is_le = true;
> }
> is a very strange initializer (makes assumptions about the state
> to be initialized).
> 
> I agree, setting native endianness there sounds very reasonable.
> 
> I have a question regarding readability. IMHO the relationship
> of reset_is_le and int_is_le is a bit confusing, and I'm afraid
> it could become even more confusing with using reset in one of
> the init_is_le's.
> 
> How about we do the following?
> 
> static void vhost_init_is_le(struct vhost_virtqueue *vq)
> {
>         if (vhost_has_feature(vq, VIRTIO_F_VERSION_1))
>                 vq->is_le = true;
> +       else
> +               vq->is_le = virtio_legacy_is_little_endian();
> 
> }
> 
> static void vhost_reset_is_le(struct vhost_virtqueue *vq)
> {
> -       vq->is_le = virtio_legacy_is_little_endian();
> +       vhost_init_is_le(vq);
> }
> 
> That way we would have correct endianness both after reset
> and after init, I think :).
> 

Yes, I think this is what we need.

Cheers.

--
Greg

> Thank you very much!
> 
> Halil
> 
>
Michael S. Tsirkin Jan. 29, 2017, 12:35 a.m. UTC | #4
On Fri, Jan 27, 2017 at 02:37:47PM +0100, Greg Kurz wrote:
> On Fri, 27 Jan 2017 13:24:13 +0100
> Halil Pasic <pasic@linux.vnet.ibm.com> wrote:
> 
> > On 01/26/2017 08:20 PM, Michael S. Tsirkin wrote:
> > > On Thu, Jan 26, 2017 at 06:39:14PM +0100, Halil Pasic wrote:  
> > >>
> > >> Hi!
> > >>
> > >> Recently I have been investigating some strange migration problems on
> > >> s390x.
> > >>
> > >> It turned out under certain circumstances vhost_net corrupts avail.idx by
> > >> using wrong endianness.  
> > 
> > [..]
> > 
> > >> -------------------------8<--------------  
> > >> >From b26e2bbdc03832a0204ee2b42967a1b49a277dc8 Mon Sep 17 00:00:00 2001  
> > >> From: Halil Pasic <pasic@linux.vnet.ibm.com>
> > >> Date: Thu, 26 Jan 2017 00:06:15 +0100
> > >> Subject: [PATCH] vhost: remove useless/dangerous reset of is_le
> > >>
> > >> The reset of is_le does no good, but it contributes its fair share to a
> > >> bug in vhost_net, which occurs if we have some oldubufs when stopping and
> > >> setting a fd = -1 as a backend. Instead of doing something convoluted in
> > >> vhost_net, let's just get rid of the reset.
> > >>
> > >> Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
> > >> Fixes: commit 2751c9882b94 
> > >> ---
> > >>  drivers/vhost/vhost.c | 4 +---
> > >>  1 file changed, 1 insertion(+), 3 deletions(-)
> > >>
> > >> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> > >> index d643260..08072a2 100644
> > >> --- a/drivers/vhost/vhost.c
> > >> +++ b/drivers/vhost/vhost.c
> > >> @@ -1714,10 +1714,8 @@ int vhost_vq_init_access(struct vhost_virtqueue *vq)
> > >>         int r;
> > >>         bool is_le = vq->is_le;
> > >>
> > >> -       if (!vq->private_data) {
> > >> -               vhost_reset_is_le(vq);
> > >> +       if (!vq->private_data)
> > >>                 return 0;
> > >> -       }
> > >>
> > >>         vhost_init_is_le(vq);  
> > > 
> > > 
> > > I think you do need to reset it, just maybe within vhost_init_is_le.
> > > 
> > >         if (vhost_has_feature(vq, VIRTIO_F_VERSION_1))
> > >                 vq->is_le = true;
> > > 	else
> > > 		vhost_reset_is_le(vq);
> > > 
> > >   
> > 
> > That is a very good point! I have overlooked that while the 
> > CONFIG_VHOST_CROSS_ENDIAN_LEGACY variant
> > 
> > static void vhost_init_is_le(struct vhost_virtqueue *vq)
> > {
> >         /* Note for legacy virtio: user_be is initialized at reset time
> >          * according to the host endianness. If userspace does not set an
> >          * explicit endianness, the default behavior is native endian, as
> >          * expected by legacy virtio.
> >          */
> >         vq->is_le = vhost_has_feature(vq, VIRTIO_F_VERSION_1) || !vq->user_be;
> > }
> > 
> > is fine the other variant 
> > 
> > static void vhost_init_is_le(struct vhost_virtqueue *vq)
> > {
> >         if (vhost_has_feature(vq, VIRTIO_F_VERSION_1))
> >                 vq->is_le = true;
> > }
> > is a very strange initializer (makes assumptions about the state
> > to be initialized).
> > 
> > I agree, setting native endianness there sounds very reasonable.
> > 
> > I have a question regarding readability. IMHO the relationship
> > of reset_is_le and int_is_le is a bit confusing, and I'm afraid
> > it could become even more confusing with using reset in one of
> > the init_is_le's.
> > 
> > How about we do the following?
> > 
> > static void vhost_init_is_le(struct vhost_virtqueue *vq)
> > {
> >         if (vhost_has_feature(vq, VIRTIO_F_VERSION_1))
> >                 vq->is_le = true;
> > +       else
> > +               vq->is_le = virtio_legacy_is_little_endian();
> > 
> > }
> > 
> > static void vhost_reset_is_le(struct vhost_virtqueue *vq)
> > {
> > -       vq->is_le = virtio_legacy_is_little_endian();
> > +       vhost_init_is_le(vq);
> > }
> > 
> > That way we would have correct endianness both after reset
> > and after init, I think :).
> > 
> 
> Yes, I think this is what we need.
> 
> Cheers.

OK, pls test this patch.

> --
> Greg
> 
> > Thank you very much!
> > 
> > Halil
> > 
> >
Halil Pasic Jan. 30, 2017, 11:25 a.m. UTC | #5
On 01/29/2017 01:35 AM, Michael S. Tsirkin wrote:
> On Fri, Jan 27, 2017 at 02:37:47PM +0100, Greg Kurz wrote:
>> On Fri, 27 Jan 2017 13:24:13 +0100
>> Halil Pasic <pasic@linux.vnet.ibm.com> wrote:
>>
>>> On 01/26/2017 08:20 PM, Michael S. Tsirkin wrote:
>>>> On Thu, Jan 26, 2017 at 06:39:14PM +0100, Halil Pasic wrote:  
>>>>>
>>>>> Hi!
>>>>>
>>>>> Recently I have been investigating some strange migration problems on
>>>>> s390x.
>>>>>
>>>>> It turned out under certain circumstances vhost_net corrupts avail.idx by
>>>>> using wrong endianness.  
>>>
>>> [..]
>>>
>>>>> -------------------------8<--------------  
>>>>> >From b26e2bbdc03832a0204ee2b42967a1b49a277dc8 Mon Sep 17 00:00:00 2001  
>>>>> From: Halil Pasic <pasic@linux.vnet.ibm.com>
>>>>> Date: Thu, 26 Jan 2017 00:06:15 +0100
>>>>> Subject: [PATCH] vhost: remove useless/dangerous reset of is_le
>>>>>
>>>>> The reset of is_le does no good, but it contributes its fair share to a
>>>>> bug in vhost_net, which occurs if we have some oldubufs when stopping and
>>>>> setting a fd = -1 as a backend. Instead of doing something convoluted in
>>>>> vhost_net, let's just get rid of the reset.
>>>>>
>>>>> Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
>>>>> Fixes: commit 2751c9882b94 
>>>>> ---
>>>>>  drivers/vhost/vhost.c | 4 +---
>>>>>  1 file changed, 1 insertion(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
>>>>> index d643260..08072a2 100644
>>>>> --- a/drivers/vhost/vhost.c
>>>>> +++ b/drivers/vhost/vhost.c
>>>>> @@ -1714,10 +1714,8 @@ int vhost_vq_init_access(struct vhost_virtqueue *vq)
>>>>>         int r;
>>>>>         bool is_le = vq->is_le;
>>>>>
>>>>> -       if (!vq->private_data) {
>>>>> -               vhost_reset_is_le(vq);
>>>>> +       if (!vq->private_data)
>>>>>                 return 0;
>>>>> -       }
>>>>>
>>>>>         vhost_init_is_le(vq);  
>>>>
>>>>
>>>> I think you do need to reset it, just maybe within vhost_init_is_le.
>>>>
>>>>         if (vhost_has_feature(vq, VIRTIO_F_VERSION_1))
>>>>                 vq->is_le = true;
>>>> 	else
>>>> 		vhost_reset_is_le(vq);
>>>>
>>>>   
>>>
>>> That is a very good point! I have overlooked that while the 
>>> CONFIG_VHOST_CROSS_ENDIAN_LEGACY variant
>>>
>>> static void vhost_init_is_le(struct vhost_virtqueue *vq)
>>> {
>>>         /* Note for legacy virtio: user_be is initialized at reset time
>>>          * according to the host endianness. If userspace does not set an
>>>          * explicit endianness, the default behavior is native endian, as
>>>          * expected by legacy virtio.
>>>          */
>>>         vq->is_le = vhost_has_feature(vq, VIRTIO_F_VERSION_1) || !vq->user_be;
>>> }
>>>
>>> is fine the other variant 
>>>
>>> static void vhost_init_is_le(struct vhost_virtqueue *vq)
>>> {
>>>         if (vhost_has_feature(vq, VIRTIO_F_VERSION_1))
>>>                 vq->is_le = true;
>>> }
>>> is a very strange initializer (makes assumptions about the state
>>> to be initialized).
>>>
>>> I agree, setting native endianness there sounds very reasonable.
>>>
>>> I have a question regarding readability. IMHO the relationship
>>> of reset_is_le and int_is_le is a bit confusing, and I'm afraid
>>> it could become even more confusing with using reset in one of
>>> the init_is_le's.
>>>
>>> How about we do the following?
>>>
>>> static void vhost_init_is_le(struct vhost_virtqueue *vq)
>>> {
>>>         if (vhost_has_feature(vq, VIRTIO_F_VERSION_1))
>>>                 vq->is_le = true;
>>> +       else
>>> +               vq->is_le = virtio_legacy_is_little_endian();
>>>
>>> }
>>>
>>> static void vhost_reset_is_le(struct vhost_virtqueue *vq)
>>> {
>>> -       vq->is_le = virtio_legacy_is_little_endian();
>>> +       vhost_init_is_le(vq);
>>> }
>>>
>>> That way we would have correct endianness both after reset
>>> and after init, I think :).
>>>
>>
>> Yes, I think this is what we need.
>>
>> Cheers.
> 
> OK, pls test this patch.
> 

Have submitted a slightly modified version of the above patch
with the subject "[PATCH] vhost: fix initialization for vq->is_le".

I wrote to our tester guys to give it some good testing. Will
post an update on the result of the testing there ASAP (or the
someone from testing will).

Regards,
Halil


>> --
>> Greg
>>
>>> Thank you very much!
>>>
>>> Halil
>>>
>>>
>
diff mbox

Patch

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index d643260..08072a2 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -1714,10 +1714,8 @@  int vhost_vq_init_access(struct vhost_virtqueue *vq)
        int r;
        bool is_le = vq->is_le;

-       if (!vq->private_data) {
-               vhost_reset_is_le(vq);
+       if (!vq->private_data)
                return 0;
-       }

        vhost_init_is_le(vq);