mbox series

[XEN,for-4.13,v2,0/6] Fix: libxl workaround, multiple connection to single QMP socket

Message ID 20191030180704.261320-1-anthony.perard@citrix.com (mailing list archive)
Headers show
Series Fix: libxl workaround, multiple connection to single QMP socket | expand

Message

Anthony PERARD Oct. 30, 2019, 6:06 p.m. UTC
Patch series available in this git branch:
https://xenbits.xen.org/git-http/people/aperard/xen-unstable.git br.fix-ev_qmp-multi-connect-v2

Hi,

QEMU's QMP socket doesn't allow multiple concurrent connection. Also, it
listen() on the socket with a `backlog' of only 1. On Linux at least, once that
backlog is filled connect() will return EAGAIN if the socket fd is
non-blocking. libxl may attempt many concurrent connect() attempt if for
example a guest is started with several PCI passthrough devices, and a
connect() failure lead to a failure to start the guest.

Since we can't change the listen()'s `backlog' that QEMU use, we need other
ways to workaround the issue. This patch series introduce a lock to acquire
before attempting to connect() to the QMP socket. Since the lock might be held
for to long, the series also introduce a way to cancel the acquisition of the
lock, this means killing the process that tries to get the lock.

See thread[1] for discussed alternative.
[1] https://lists.xenproject.org/archives/html/xen-devel/2019-10/msg01815.html

Cheers,

Anthony PERARD (6):
  libxl: Introduce libxl__ev_child_kill_deregister
  libxl: Move libxl__ev_devlock declaration
  libxl: Rename ev_devlock to ev_slowlock
  libxl: Introduce libxl__ev_slowlock_dispose
  libxl: libxl__ev_qmp_send now takes an egc
  libxl_qmp: Have a lock for QMP socket access

 tools/libxl/libxl_disk.c        |  16 ++--
 tools/libxl/libxl_dm.c          |   8 +-
 tools/libxl/libxl_dom_save.c    |   2 +-
 tools/libxl/libxl_dom_suspend.c |   2 +-
 tools/libxl/libxl_domain.c      |  18 ++---
 tools/libxl/libxl_event.c       |   6 +-
 tools/libxl/libxl_fork.c        |  48 ++++++++++++
 tools/libxl/libxl_internal.c    |  41 +++++++---
 tools/libxl/libxl_internal.h    | 130 +++++++++++++++++++-------------
 tools/libxl/libxl_pci.c         |   8 +-
 tools/libxl/libxl_qmp.c         | 119 ++++++++++++++++++++++++-----
 tools/libxl/libxl_usb.c         |  28 ++++---
 12 files changed, 301 insertions(+), 125 deletions(-)

Comments

Sander Eikelenboom Oct. 30, 2019, 10:27 p.m. UTC | #1
On 30/10/2019 19:06, Anthony PERARD wrote:
> Patch series available in this git branch:
> https://xenbits.xen.org/git-http/people/aperard/xen-unstable.git br.fix-ev_qmp-multi-connect-v2
> 
> Hi,
> 
> QEMU's QMP socket doesn't allow multiple concurrent connection. Also, it
> listen() on the socket with a `backlog' of only 1. On Linux at least, once that
> backlog is filled connect() will return EAGAIN if the socket fd is
> non-blocking. libxl may attempt many concurrent connect() attempt if for
> example a guest is started with several PCI passthrough devices, and a
> connect() failure lead to a failure to start the guest.
> 
> Since we can't change the listen()'s `backlog' that QEMU use, we need other
> ways to workaround the issue. This patch series introduce a lock to acquire
> before attempting to connect() to the QMP socket. Since the lock might be held
> for to long, the series also introduce a way to cancel the acquisition of the
> lock, this means killing the process that tries to get the lock.
> 
> See thread[1] for discussed alternative.
> [1] https://lists.xenproject.org/archives/html/xen-devel/2019-10/msg01815.html
> 
> Cheers,
> 
> Anthony PERARD (6):

Hi Anthony,

Re-tested, especially the pci-pt part, still works for me.
Thanks again (and thanks for providing a git branch)

--
Sander
Jürgen Groß Nov. 8, 2019, 6:06 a.m. UTC | #2
On 30.10.19 19:06, Anthony PERARD wrote:
> Patch series available in this git branch:
> https://xenbits.xen.org/git-http/people/aperard/xen-unstable.git br.fix-ev_qmp-multi-connect-v2
> 
> Hi,
> 
> QEMU's QMP socket doesn't allow multiple concurrent connection. Also, it
> listen() on the socket with a `backlog' of only 1. On Linux at least, once that
> backlog is filled connect() will return EAGAIN if the socket fd is
> non-blocking. libxl may attempt many concurrent connect() attempt if for
> example a guest is started with several PCI passthrough devices, and a
> connect() failure lead to a failure to start the guest.
> 
> Since we can't change the listen()'s `backlog' that QEMU use, we need other
> ways to workaround the issue. This patch series introduce a lock to acquire
> before attempting to connect() to the QMP socket. Since the lock might be held
> for to long, the series also introduce a way to cancel the acquisition of the
> lock, this means killing the process that tries to get the lock.
> 
> See thread[1] for discussed alternative.
> [1] https://lists.xenproject.org/archives/html/xen-devel/2019-10/msg01815.html
> 
> Cheers,
> 
> Anthony PERARD (6):
>    libxl: Introduce libxl__ev_child_kill_deregister
>    libxl: Move libxl__ev_devlock declaration
>    libxl: Rename ev_devlock to ev_slowlock
>    libxl: Introduce libxl__ev_slowlock_dispose
>    libxl: libxl__ev_qmp_send now takes an egc
>    libxl_qmp: Have a lock for QMP socket access
> 
>   tools/libxl/libxl_disk.c        |  16 ++--
>   tools/libxl/libxl_dm.c          |   8 +-
>   tools/libxl/libxl_dom_save.c    |   2 +-
>   tools/libxl/libxl_dom_suspend.c |   2 +-
>   tools/libxl/libxl_domain.c      |  18 ++---
>   tools/libxl/libxl_event.c       |   6 +-
>   tools/libxl/libxl_fork.c        |  48 ++++++++++++
>   tools/libxl/libxl_internal.c    |  41 +++++++---
>   tools/libxl/libxl_internal.h    | 130 +++++++++++++++++++-------------
>   tools/libxl/libxl_pci.c         |   8 +-
>   tools/libxl/libxl_qmp.c         | 119 ++++++++++++++++++++++++-----
>   tools/libxl/libxl_usb.c         |  28 ++++---
>   12 files changed, 301 insertions(+), 125 deletions(-)
> 

For the series:

Release-acked-by: Juergen Gross <jgross@suse.com>


Juergen
Sander Eikelenboom Nov. 15, 2019, 1:51 p.m. UTC | #3
On 08/11/2019 07:06, Jürgen Groß wrote:
> On 30.10.19 19:06, Anthony PERARD wrote:
>> Patch series available in this git branch:
>> https://xenbits.xen.org/git-http/people/aperard/xen-unstable.git br.fix-ev_qmp-multi-connect-v2
>>
>> Hi,
>>
>> QEMU's QMP socket doesn't allow multiple concurrent connection. Also, it
>> listen() on the socket with a `backlog' of only 1. On Linux at least, once that
>> backlog is filled connect() will return EAGAIN if the socket fd is
>> non-blocking. libxl may attempt many concurrent connect() attempt if for
>> example a guest is started with several PCI passthrough devices, and a
>> connect() failure lead to a failure to start the guest.
>>
>> Since we can't change the listen()'s `backlog' that QEMU use, we need other
>> ways to workaround the issue. This patch series introduce a lock to acquire
>> before attempting to connect() to the QMP socket. Since the lock might be held
>> for to long, the series also introduce a way to cancel the acquisition of the
>> lock, this means killing the process that tries to get the lock.
>>
>> See thread[1] for discussed alternative.
>> [1] https://lists.xenproject.org/archives/html/xen-devel/2019-10/msg01815.html
>>
>> Cheers,
>>
>> Anthony PERARD (6):
>>    libxl: Introduce libxl__ev_child_kill_deregister
>>    libxl: Move libxl__ev_devlock declaration
>>    libxl: Rename ev_devlock to ev_slowlock
>>    libxl: Introduce libxl__ev_slowlock_dispose
>>    libxl: libxl__ev_qmp_send now takes an egc
>>    libxl_qmp: Have a lock for QMP socket access
>>
>>   tools/libxl/libxl_disk.c        |  16 ++--
>>   tools/libxl/libxl_dm.c          |   8 +-
>>   tools/libxl/libxl_dom_save.c    |   2 +-
>>   tools/libxl/libxl_dom_suspend.c |   2 +-
>>   tools/libxl/libxl_domain.c      |  18 ++---
>>   tools/libxl/libxl_event.c       |   6 +-
>>   tools/libxl/libxl_fork.c        |  48 ++++++++++++
>>   tools/libxl/libxl_internal.c    |  41 +++++++---
>>   tools/libxl/libxl_internal.h    | 130 +++++++++++++++++++-------------
>>   tools/libxl/libxl_pci.c         |   8 +-
>>   tools/libxl/libxl_qmp.c         | 119 ++++++++++++++++++++++++-----
>>   tools/libxl/libxl_usb.c         |  28 ++++---
>>   12 files changed, 301 insertions(+), 125 deletions(-)
>>
> 
> For the series:
> 
> Release-acked-by: Juergen Gross <jgross@suse.com>
> 
> 
> Juergen

Hi Juergen,

Since a lot more recent patches have been committed, but these don't seem to.
I was wondering if these fell through the cracks.

--
Sander
Jürgen Groß Nov. 15, 2019, 1:56 p.m. UTC | #4
On 15.11.19 14:51, Sander Eikelenboom wrote:
> On 08/11/2019 07:06, Jürgen Groß wrote:
>> On 30.10.19 19:06, Anthony PERARD wrote:
>>> Patch series available in this git branch:
>>> https://xenbits.xen.org/git-http/people/aperard/xen-unstable.git br.fix-ev_qmp-multi-connect-v2
>>>
>>> Hi,
>>>
>>> QEMU's QMP socket doesn't allow multiple concurrent connection. Also, it
>>> listen() on the socket with a `backlog' of only 1. On Linux at least, once that
>>> backlog is filled connect() will return EAGAIN if the socket fd is
>>> non-blocking. libxl may attempt many concurrent connect() attempt if for
>>> example a guest is started with several PCI passthrough devices, and a
>>> connect() failure lead to a failure to start the guest.
>>>
>>> Since we can't change the listen()'s `backlog' that QEMU use, we need other
>>> ways to workaround the issue. This patch series introduce a lock to acquire
>>> before attempting to connect() to the QMP socket. Since the lock might be held
>>> for to long, the series also introduce a way to cancel the acquisition of the
>>> lock, this means killing the process that tries to get the lock.
>>>
>>> See thread[1] for discussed alternative.
>>> [1] https://lists.xenproject.org/archives/html/xen-devel/2019-10/msg01815.html
>>>
>>> Cheers,
>>>
>>> Anthony PERARD (6):
>>>     libxl: Introduce libxl__ev_child_kill_deregister
>>>     libxl: Move libxl__ev_devlock declaration
>>>     libxl: Rename ev_devlock to ev_slowlock
>>>     libxl: Introduce libxl__ev_slowlock_dispose
>>>     libxl: libxl__ev_qmp_send now takes an egc
>>>     libxl_qmp: Have a lock for QMP socket access
>>>
>>>    tools/libxl/libxl_disk.c        |  16 ++--
>>>    tools/libxl/libxl_dm.c          |   8 +-
>>>    tools/libxl/libxl_dom_save.c    |   2 +-
>>>    tools/libxl/libxl_dom_suspend.c |   2 +-
>>>    tools/libxl/libxl_domain.c      |  18 ++---
>>>    tools/libxl/libxl_event.c       |   6 +-
>>>    tools/libxl/libxl_fork.c        |  48 ++++++++++++
>>>    tools/libxl/libxl_internal.c    |  41 +++++++---
>>>    tools/libxl/libxl_internal.h    | 130 +++++++++++++++++++-------------
>>>    tools/libxl/libxl_pci.c         |   8 +-
>>>    tools/libxl/libxl_qmp.c         | 119 ++++++++++++++++++++++++-----
>>>    tools/libxl/libxl_usb.c         |  28 ++++---
>>>    12 files changed, 301 insertions(+), 125 deletions(-)
>>>
>>
>> For the series:
>>
>> Release-acked-by: Juergen Gross <jgross@suse.com>
>>
>>
>> Juergen
> 
> Hi Juergen,
> 
> Since a lot more recent patches have been committed, but these don't seem to.
> I was wondering if these fell through the cracks.

Hi Sander,

I'm no committer, "just" the one who allows the committers to take a
patch or series at that phase of the release...

Juergen
Wei Liu Nov. 15, 2019, 2:01 p.m. UTC | #5
On Fri, Nov 15, 2019 at 02:56:48PM +0100, Jürgen Groß wrote:
> > > 
> > > Juergen
> > 
> > Hi Juergen,
> > 
> > Since a lot more recent patches have been committed, but these don't seem to.
> > I was wondering if these fell through the cracks.
> 
> Hi Sander,
> 
> I'm no committer, "just" the one who allows the committers to take a
> patch or series at that phase of the release...

This series hasn't been reviewed yet.

Unfortunately, -ETIME for me at the moment.

Wei.

> 
> Juergen