diff mbox series

[v3,1/2] io/channel: fix crash when qio_channel_readv_all return 0

Message ID 20200508051441.8143-2-fengli@smartx.com (mailing list archive)
State New, archived
Headers show
Series [v3,1/2] io/channel: fix crash when qio_channel_readv_all return 0 | expand

Commit Message

Li Feng May 8, 2020, 5:14 a.m. UTC
Root cause:
From `man recvmsg`, the RETURN VALUE says:
These  calls return the number of bytes received, or -1 if an error occurred.
In the event of an error, errno is set to indicate the error.
The return value will be 0 when the peer has performed an orderly shutdown.

When an error happens, the socket will be closed, and recvmsg return 0,
then error_setg will trigger a crash.

This unit test could reproduce this issue:
tests/test-char -p /char/socket/client/reconnect-error/unix

The core file backtrace is :

(gdb) bt
    #0  0x00007ffff5ac3277 in raise () from /lib64/libc.so.6
    #1  0x00007ffff5ac4968 in abort () from /lib64/libc.so.6
    #2  0x00005555555aaa94 in error_handle_fatal (errp=<optimized out>, err=0x7fffec0012d0) at util/error.c:40
    #3  0x00005555555aab6d in error_setv (errp=0x555555802a08 <error_abort>, src=0x5555555c4280 "io/channel.c", line=148,
        func=0x5555555c4580 <__func__.17489> "qio_channel_readv_all", err_class=ERROR_CLASS_GENERIC_ERROR,
        fmt=<optimized out>, ap=0x7ffff423bae0, suffix=0x0) at util/error.c:73
    #4  0x00005555555aacf0 in error_setg_internal (errp=errp@entry=0x555555802a08 <error_abort>,
        src=src@entry=0x5555555c4280 "io/channel.c", line=line@entry=148,
        func=func@entry=0x5555555c4580 <__func__.17489> "qio_channel_readv_all",
        fmt=fmt@entry=0x5555555c43a0 "Unexpected end-of-file before all bytes were read") at util/error.c:97
    #5  0x000055555556c25c in qio_channel_readv_all (ioc=<optimized out>, iov=<optimized out>, niov=<optimized out>,
        errp=0x555555802a08 <error_abort>) at io/channel.c:147
    #6  0x000055555556c29a in qio_channel_read_all (ioc=<optimized out>, buf=<optimized out>, buflen=<optimized out>,
        errp=<optimized out>) at io/channel.c:247
    #7  0x000055555556ad22 in char_socket_ping_pong (ioc=0x7fffec0008c0) at tests/test-char.c:732
    #8  0x000055555556ae12 in char_socket_client_server_thread (data=data@entry=0x55555582e350) at tests/test-char.c:891
    #9  0x00005555555a95b6 in qemu_thread_start (args=<optimized out>) at util/qemu-thread-posix.c:519
    #10 0x00007ffff5e61e25 in start_thread () from /lib64/libpthread.so.0
    #11 0x00007ffff5b8bbad in clone () from /lib64/libc.so.6

Signed-off-by: Li Feng <fengli@smartx.com>
---
 io/channel.c | 2 --
 1 file changed, 2 deletions(-)

Comments

Marc-André Lureau May 8, 2020, 12:32 p.m. UTC | #1
Hi

On Fri, May 8, 2020 at 7:14 AM Li Feng <fengli@smartx.com> wrote:
>
> Root cause:
> From `man recvmsg`, the RETURN VALUE says:
> These  calls return the number of bytes received, or -1 if an error occurred.
> In the event of an error, errno is set to indicate the error.
> The return value will be 0 when the peer has performed an orderly shutdown.
>
> When an error happens, the socket will be closed, and recvmsg return 0,
> then error_setg will trigger a crash.
>
> This unit test could reproduce this issue:
> tests/test-char -p /char/socket/client/reconnect-error/unix

Current master doesn't trigger the backtrace, it's only after your patch 2.

>
> The core file backtrace is :
>
> (gdb) bt
>     #0  0x00007ffff5ac3277 in raise () from /lib64/libc.so.6
>     #1  0x00007ffff5ac4968 in abort () from /lib64/libc.so.6
>     #2  0x00005555555aaa94 in error_handle_fatal (errp=<optimized out>, err=0x7fffec0012d0) at util/error.c:40
>     #3  0x00005555555aab6d in error_setv (errp=0x555555802a08 <error_abort>, src=0x5555555c4280 "io/channel.c", line=148,
>         func=0x5555555c4580 <__func__.17489> "qio_channel_readv_all", err_class=ERROR_CLASS_GENERIC_ERROR,
>         fmt=<optimized out>, ap=0x7ffff423bae0, suffix=0x0) at util/error.c:73
>     #4  0x00005555555aacf0 in error_setg_internal (errp=errp@entry=0x555555802a08 <error_abort>,
>         src=src@entry=0x5555555c4280 "io/channel.c", line=line@entry=148,
>         func=func@entry=0x5555555c4580 <__func__.17489> "qio_channel_readv_all",
>         fmt=fmt@entry=0x5555555c43a0 "Unexpected end-of-file before all bytes were read") at util/error.c:97
>     #5  0x000055555556c25c in qio_channel_readv_all (ioc=<optimized out>, iov=<optimized out>, niov=<optimized out>,
>         errp=0x555555802a08 <error_abort>) at io/channel.c:147
>     #6  0x000055555556c29a in qio_channel_read_all (ioc=<optimized out>, buf=<optimized out>, buflen=<optimized out>,
>         errp=<optimized out>) at io/channel.c:247
>     #7  0x000055555556ad22 in char_socket_ping_pong (ioc=0x7fffec0008c0) at tests/test-char.c:732
>     #8  0x000055555556ae12 in char_socket_client_server_thread (data=data@entry=0x55555582e350) at tests/test-char.c:891
>     #9  0x00005555555a95b6 in qemu_thread_start (args=<optimized out>) at util/qemu-thread-posix.c:519
>     #10 0x00007ffff5e61e25 in start_thread () from /lib64/libpthread.so.0
>     #11 0x00007ffff5b8bbad in clone () from /lib64/libc.so.6
>
> Signed-off-by: Li Feng <fengli@smartx.com>
> ---
>  io/channel.c | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/io/channel.c b/io/channel.c
> index e4376eb0bc..1a4a505f01 100644
> --- a/io/channel.c
> +++ b/io/channel.c
> @@ -144,8 +144,6 @@ int qio_channel_readv_all(QIOChannel *ioc,
>
>      if (ret == 0) {
>          ret = -1;
> -        error_setg(errp,
> -                   "Unexpected end-of-file before all bytes were read");

Nack, this code is fine.

The problem is that the test case doesn't expect a disconnect in
char_socket_ping_pong().
Li Feng May 8, 2020, 12:42 p.m. UTC | #2
Marc-André Lureau <marcandre.lureau@gmail.com> 于2020年5月8日周五 下午8:32写道:
>
> Hi
>
> On Fri, May 8, 2020 at 7:14 AM Li Feng <fengli@smartx.com> wrote:
> >
> > Root cause:
> > From `man recvmsg`, the RETURN VALUE says:
> > These  calls return the number of bytes received, or -1 if an error occurred.
> > In the event of an error, errno is set to indicate the error.
> > The return value will be 0 when the peer has performed an orderly shutdown.
> >
> > When an error happens, the socket will be closed, and recvmsg return 0,
> > then error_setg will trigger a crash.
> >
> > This unit test could reproduce this issue:
> > tests/test-char -p /char/socket/client/reconnect-error/unix
>
> Current master doesn't trigger the backtrace, it's only after your patch 2.
Yes. However, the issue did exist in the master code base.
The test case in patch 2 revealed this issue.

>
> >
> > The core file backtrace is :
> >
> > (gdb) bt
> >     #0  0x00007ffff5ac3277 in raise () from /lib64/libc.so.6
> >     #1  0x00007ffff5ac4968 in abort () from /lib64/libc.so.6
> >     #2  0x00005555555aaa94 in error_handle_fatal (errp=<optimized out>, err=0x7fffec0012d0) at util/error.c:40
> >     #3  0x00005555555aab6d in error_setv (errp=0x555555802a08 <error_abort>, src=0x5555555c4280 "io/channel.c", line=148,
> >         func=0x5555555c4580 <__func__.17489> "qio_channel_readv_all", err_class=ERROR_CLASS_GENERIC_ERROR,
> >         fmt=<optimized out>, ap=0x7ffff423bae0, suffix=0x0) at util/error.c:73
> >     #4  0x00005555555aacf0 in error_setg_internal (errp=errp@entry=0x555555802a08 <error_abort>,
> >         src=src@entry=0x5555555c4280 "io/channel.c", line=line@entry=148,
> >         func=func@entry=0x5555555c4580 <__func__.17489> "qio_channel_readv_all",
> >         fmt=fmt@entry=0x5555555c43a0 "Unexpected end-of-file before all bytes were read") at util/error.c:97
> >     #5  0x000055555556c25c in qio_channel_readv_all (ioc=<optimized out>, iov=<optimized out>, niov=<optimized out>,
> >         errp=0x555555802a08 <error_abort>) at io/channel.c:147
> >     #6  0x000055555556c29a in qio_channel_read_all (ioc=<optimized out>, buf=<optimized out>, buflen=<optimized out>,
> >         errp=<optimized out>) at io/channel.c:247
> >     #7  0x000055555556ad22 in char_socket_ping_pong (ioc=0x7fffec0008c0) at tests/test-char.c:732
> >     #8  0x000055555556ae12 in char_socket_client_server_thread (data=data@entry=0x55555582e350) at tests/test-char.c:891
> >     #9  0x00005555555a95b6 in qemu_thread_start (args=<optimized out>) at util/qemu-thread-posix.c:519
> >     #10 0x00007ffff5e61e25 in start_thread () from /lib64/libpthread.so.0
> >     #11 0x00007ffff5b8bbad in clone () from /lib64/libc.so.6
> >
> > Signed-off-by: Li Feng <fengli@smartx.com>
> > ---
> >  io/channel.c | 2 --
> >  1 file changed, 2 deletions(-)
> >
> > diff --git a/io/channel.c b/io/channel.c
> > index e4376eb0bc..1a4a505f01 100644
> > --- a/io/channel.c
> > +++ b/io/channel.c
> > @@ -144,8 +144,6 @@ int qio_channel_readv_all(QIOChannel *ioc,
> >
> >      if (ret == 0) {
> >          ret = -1;
> > -        error_setg(errp,
> > -                   "Unexpected end-of-file before all bytes were read");
>
> Nack, this code is fine.
>
> The problem is that the test case doesn't expect a disconnect in
> char_socket_ping_pong().
Yes, in the test case I try to inject an error in char_socket_ping_pong.
Any concerns about these two patches?
>
> --
> Marc-André Lureau
Daniel P. Berrangé May 11, 2020, 10:02 a.m. UTC | #3
On Fri, May 08, 2020 at 08:42:22PM +0800, Li Feng wrote:
> Marc-André Lureau <marcandre.lureau@gmail.com> 于2020年5月8日周五 下午8:32写道:
> >
> > Hi
> >
> > On Fri, May 8, 2020 at 7:14 AM Li Feng <fengli@smartx.com> wrote:
> > >
> > > Root cause:
> > > From `man recvmsg`, the RETURN VALUE says:
> > > These  calls return the number of bytes received, or -1 if an error occurred.
> > > In the event of an error, errno is set to indicate the error.
> > > The return value will be 0 when the peer has performed an orderly shutdown.
> > >
> > > When an error happens, the socket will be closed, and recvmsg return 0,
> > > then error_setg will trigger a crash.
> > >
> > > This unit test could reproduce this issue:
> > > tests/test-char -p /char/socket/client/reconnect-error/unix
> >
> > Current master doesn't trigger the backtrace, it's only after your patch 2.
> Yes. However, the issue did exist in the master code base.
> The test case in patch 2 revealed this issue.
> 
> >
> > >
> > > The core file backtrace is :
> > >
> > > (gdb) bt
> > >     #0  0x00007ffff5ac3277 in raise () from /lib64/libc.so.6
> > >     #1  0x00007ffff5ac4968 in abort () from /lib64/libc.so.6
> > >     #2  0x00005555555aaa94 in error_handle_fatal (errp=<optimized out>, err=0x7fffec0012d0) at util/error.c:40
> > >     #3  0x00005555555aab6d in error_setv (errp=0x555555802a08 <error_abort>, src=0x5555555c4280 "io/channel.c", line=148,
> > >         func=0x5555555c4580 <__func__.17489> "qio_channel_readv_all", err_class=ERROR_CLASS_GENERIC_ERROR,
> > >         fmt=<optimized out>, ap=0x7ffff423bae0, suffix=0x0) at util/error.c:73
> > >     #4  0x00005555555aacf0 in error_setg_internal (errp=errp@entry=0x555555802a08 <error_abort>,
> > >         src=src@entry=0x5555555c4280 "io/channel.c", line=line@entry=148,
> > >         func=func@entry=0x5555555c4580 <__func__.17489> "qio_channel_readv_all",
> > >         fmt=fmt@entry=0x5555555c43a0 "Unexpected end-of-file before all bytes were read") at util/error.c:97
> > >     #5  0x000055555556c25c in qio_channel_readv_all (ioc=<optimized out>, iov=<optimized out>, niov=<optimized out>,
> > >         errp=0x555555802a08 <error_abort>) at io/channel.c:147
> > >     #6  0x000055555556c29a in qio_channel_read_all (ioc=<optimized out>, buf=<optimized out>, buflen=<optimized out>,
> > >         errp=<optimized out>) at io/channel.c:247
> > >     #7  0x000055555556ad22 in char_socket_ping_pong (ioc=0x7fffec0008c0) at tests/test-char.c:732
> > >     #8  0x000055555556ae12 in char_socket_client_server_thread (data=data@entry=0x55555582e350) at tests/test-char.c:891
> > >     #9  0x00005555555a95b6 in qemu_thread_start (args=<optimized out>) at util/qemu-thread-posix.c:519
> > >     #10 0x00007ffff5e61e25 in start_thread () from /lib64/libpthread.so.0
> > >     #11 0x00007ffff5b8bbad in clone () from /lib64/libc.so.6
> > >
> > > Signed-off-by: Li Feng <fengli@smartx.com>
> > > ---
> > >  io/channel.c | 2 --
> > >  1 file changed, 2 deletions(-)
> > >
> > > diff --git a/io/channel.c b/io/channel.c
> > > index e4376eb0bc..1a4a505f01 100644
> > > --- a/io/channel.c
> > > +++ b/io/channel.c
> > > @@ -144,8 +144,6 @@ int qio_channel_readv_all(QIOChannel *ioc,
> > >
> > >      if (ret == 0) {
> > >          ret = -1;
> > > -        error_setg(errp,
> > > -                   "Unexpected end-of-file before all bytes were read");
> >
> > Nack, this code is fine.
> >
> > The problem is that the test case doesn't expect a disconnect in
> > char_socket_ping_pong().
> Yes, in the test case I try to inject an error in char_socket_ping_pong.
> Any concerns about these two patches?

I agree with Marc-André - this patch is wrong. qio_channel_readv_all
*MUST* always set 'errp' if the return value is -1. To not set 'errp'
is violating the calling convention.

The bug here is in your test case - it is passing '&error_abort' to the
'qio_channel_readv_all' call.  If you don't want the test to crash, then
don't pass &error_abort

Regards,
Daniel
Li Feng May 11, 2020, 12:09 p.m. UTC | #4
Thank you,  Daniel and Marc-André.
I understand now. The error_abort is used to abort the program.
I should rewrite the char_socket_ping_pong.
I will post a new version without this wrong patch.

Thanks,
Feng Li

Daniel P. Berrangé <berrange@redhat.com> 于2020年5月11日周一 下午6:03写道:
>
> On Fri, May 08, 2020 at 08:42:22PM +0800, Li Feng wrote:
> > Marc-André Lureau <marcandre.lureau@gmail.com> 于2020年5月8日周五 下午8:32写道:
> > >
> > > Hi
> > >
> > > On Fri, May 8, 2020 at 7:14 AM Li Feng <fengli@smartx.com> wrote:
> > > >
> > > > Root cause:
> > > > From `man recvmsg`, the RETURN VALUE says:
> > > > These  calls return the number of bytes received, or -1 if an error occurred.
> > > > In the event of an error, errno is set to indicate the error.
> > > > The return value will be 0 when the peer has performed an orderly shutdown.
> > > >
> > > > When an error happens, the socket will be closed, and recvmsg return 0,
> > > > then error_setg will trigger a crash.
> > > >
> > > > This unit test could reproduce this issue:
> > > > tests/test-char -p /char/socket/client/reconnect-error/unix
> > >
> > > Current master doesn't trigger the backtrace, it's only after your patch 2.
> > Yes. However, the issue did exist in the master code base.
> > The test case in patch 2 revealed this issue.
> >
> > >
> > > >
> > > > The core file backtrace is :
> > > >
> > > > (gdb) bt
> > > >     #0  0x00007ffff5ac3277 in raise () from /lib64/libc.so.6
> > > >     #1  0x00007ffff5ac4968 in abort () from /lib64/libc.so.6
> > > >     #2  0x00005555555aaa94 in error_handle_fatal (errp=<optimized out>, err=0x7fffec0012d0) at util/error.c:40
> > > >     #3  0x00005555555aab6d in error_setv (errp=0x555555802a08 <error_abort>, src=0x5555555c4280 "io/channel.c", line=148,
> > > >         func=0x5555555c4580 <__func__.17489> "qio_channel_readv_all", err_class=ERROR_CLASS_GENERIC_ERROR,
> > > >         fmt=<optimized out>, ap=0x7ffff423bae0, suffix=0x0) at util/error.c:73
> > > >     #4  0x00005555555aacf0 in error_setg_internal (errp=errp@entry=0x555555802a08 <error_abort>,
> > > >         src=src@entry=0x5555555c4280 "io/channel.c", line=line@entry=148,
> > > >         func=func@entry=0x5555555c4580 <__func__.17489> "qio_channel_readv_all",
> > > >         fmt=fmt@entry=0x5555555c43a0 "Unexpected end-of-file before all bytes were read") at util/error.c:97
> > > >     #5  0x000055555556c25c in qio_channel_readv_all (ioc=<optimized out>, iov=<optimized out>, niov=<optimized out>,
> > > >         errp=0x555555802a08 <error_abort>) at io/channel.c:147
> > > >     #6  0x000055555556c29a in qio_channel_read_all (ioc=<optimized out>, buf=<optimized out>, buflen=<optimized out>,
> > > >         errp=<optimized out>) at io/channel.c:247
> > > >     #7  0x000055555556ad22 in char_socket_ping_pong (ioc=0x7fffec0008c0) at tests/test-char.c:732
> > > >     #8  0x000055555556ae12 in char_socket_client_server_thread (data=data@entry=0x55555582e350) at tests/test-char.c:891
> > > >     #9  0x00005555555a95b6 in qemu_thread_start (args=<optimized out>) at util/qemu-thread-posix.c:519
> > > >     #10 0x00007ffff5e61e25 in start_thread () from /lib64/libpthread.so.0
> > > >     #11 0x00007ffff5b8bbad in clone () from /lib64/libc.so.6
> > > >
> > > > Signed-off-by: Li Feng <fengli@smartx.com>
> > > > ---
> > > >  io/channel.c | 2 --
> > > >  1 file changed, 2 deletions(-)
> > > >
> > > > diff --git a/io/channel.c b/io/channel.c
> > > > index e4376eb0bc..1a4a505f01 100644
> > > > --- a/io/channel.c
> > > > +++ b/io/channel.c
> > > > @@ -144,8 +144,6 @@ int qio_channel_readv_all(QIOChannel *ioc,
> > > >
> > > >      if (ret == 0) {
> > > >          ret = -1;
> > > > -        error_setg(errp,
> > > > -                   "Unexpected end-of-file before all bytes were read");
> > >
> > > Nack, this code is fine.
> > >
> > > The problem is that the test case doesn't expect a disconnect in
> > > char_socket_ping_pong().
> > Yes, in the test case I try to inject an error in char_socket_ping_pong.
> > Any concerns about these two patches?
>
> I agree with Marc-André - this patch is wrong. qio_channel_readv_all
> *MUST* always set 'errp' if the return value is -1. To not set 'errp'
> is violating the calling convention.
>
> The bug here is in your test case - it is passing '&error_abort' to the
> 'qio_channel_readv_all' call.  If you don't want the test to crash, then
> don't pass &error_abort
>
> Regards,
> Daniel
> --
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
>
diff mbox series

Patch

diff --git a/io/channel.c b/io/channel.c
index e4376eb0bc..1a4a505f01 100644
--- a/io/channel.c
+++ b/io/channel.c
@@ -144,8 +144,6 @@  int qio_channel_readv_all(QIOChannel *ioc,
 
     if (ret == 0) {
         ret = -1;
-        error_setg(errp,
-                   "Unexpected end-of-file before all bytes were read");
     } else if (ret == 1) {
         ret = 0;
     }