net: vhost-user: add QAPI events to report connection state

Message ID 20250214072629.1033314-1-lvivier@redhat.com (mailing list archive)
State New

Commit Message

Laurent Vivier Feb. 14, 2025, 7:26 a.m. UTC
The netdev reports NETDEV_VHOST_USER_CONNECTED event when
the chardev is connected, and NETDEV_VHOST_USER_DISCONNECTED
when it is disconnected.

The NETDEV_VHOST_USER_CONNECTED event includes the ChardevInfo
(label, filename and frontend_open).

This allows a system manager like libvirt to detect when the server
fails.

For instance with passt:

{ 'execute': 'qmp_capabilities' }
{ "return": { } }

[killing passt here]

{ "timestamp": { "seconds": 1739517243, "microseconds": 115081 },
  "event": "NETDEV_VHOST_USER_DISCONNECTED",
  "data": { "netdev-id": "netdev0" } }

[automatic reconnection with reconnect-ms]

{ "timestamp": { "seconds": 1739517290, "microseconds": 343777 },
  "event": "NETDEV_VHOST_USER_CONNECTED",
  "data": { "netdev-id": "netdev0",
            "info": { "frontend-open": true,
                      "filename": "unix:",
                      "label": "chr0" } } }

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
---
 qapi/char.json   | 43 +++++++++++++++++++++++++++++++++++++++++++
 net/vhost-user.c |  8 ++++++++
 2 files changed, 51 insertions(+)

Comments

Stefano Brivio Feb. 14, 2025, 8:53 a.m. UTC | #1
On Fri, 14 Feb 2025 08:26:25 +0100
Laurent Vivier <lvivier@redhat.com> wrote:

> The netdev reports NETDEV_VHOST_USER_CONNECTED event when
> the chardev is connected, and NETDEV_VHOST_USER_DISCONNECTED
> when it is disconnected.
> 
> The NETDEV_VHOST_USER_CONNECTED event includes the ChardevInfo
> (label, filename and frontend_open).
> 
> This allows a system manager like libvirt to detect when the server
> fails.
> 
> For instance with passt:
> 
> { 'execute': 'qmp_capabilities' }
> { "return": { } }
> 
> [killing passt here]
> 
> { "timestamp": { "seconds": 1739517243, "microseconds": 115081 },
>   "event": "NETDEV_VHOST_USER_DISCONNECTED",
>   "data": { "netdev-id": "netdev0" } }
> 
> [automatic reconnection with reconnect-ms]
> 
> { "timestamp": { "seconds": 1739517290, "microseconds": 343777 },
>   "event": "NETDEV_VHOST_USER_CONNECTED",
>   "data": { "netdev-id": "netdev0",
>             "info": { "frontend-open": true,
>                       "filename": "unix:",
>                       "label": "chr0" } } }
> 
> Signed-off-by: Laurent Vivier <lvivier@redhat.com>

Tested-by: Stefano Brivio <sbrivio@redhat.com>

...with libvirt's branch from:

  https://gitlab.com/lainestump/libvirt/-/tree/network-passt+vhostuser

simply wiring NETDEV_VHOST_USER_DISCONNECTED to the event handler
qemuMonitorJSONHandleNetdevStreamDisconnected(), that is, the handler
for NETDEV_STREAM_DISCONNECTED. I terminate passt, and it restarts.
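
A minimal sketch of that kind of wiring, assuming a manager that routes
QMP event names through a lookup table. The names event_handler,
dispatch_event() and handle_netdev_disconnected() are hypothetical and
are not libvirt's API; the only point is that both event names can
share one handler because both carry a "netdev-id".

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical handler type: receives the "netdev-id" from the event. */
typedef void (*event_handler)(const char *netdev_id);

/* Counts how often the (hypothetical) restart path was requested. */
static int restarts_requested;

/* Stand-in for the manager's "backend went away, restart it" logic. */
static void handle_netdev_disconnected(const char *netdev_id)
{
    (void)netdev_id;
    restarts_requested++;
}

struct event_route {
    const char *name;
    event_handler handler;
};

/* Both disconnect events are routed to the same handler. */
static const struct event_route routes[] = {
    { "NETDEV_STREAM_DISCONNECTED",     handle_netdev_disconnected },
    { "NETDEV_VHOST_USER_DISCONNECTED", handle_netdev_disconnected },
};

/* Returns 1 if a handler ran for this event name, 0 if it is unknown. */
static int dispatch_event(const char *event, const char *netdev_id)
{
    for (size_t i = 0; i < sizeof(routes) / sizeof(routes[0]); i++) {
        if (strcmp(routes[i].name, event) == 0) {
            routes[i].handler(netdev_id);
            return 1;
        }
    }
    return 0;
}
```

Events without a route, such as NETDEV_VHOST_USER_CONNECTED in this
sketch, are simply ignored.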
Markus Armbruster Feb. 14, 2025, 10:06 a.m. UTC | #2
Laurent Vivier <lvivier@redhat.com> writes:

> The netdev reports NETDEV_VHOST_USER_CONNECTED event when
> the chardev is connected, and NETDEV_VHOST_USER_DISCONNECTED
> when it is disconnected.
>
> The NETDEV_VHOST_USER_CONNECTED event includes the ChardevInfo
> (label, filename and frontend_open).
>
> This allows a system manager like libvirt to detect when the server
> fails.
>
> For instance with passt:
>
> { 'execute': 'qmp_capabilities' }
> { "return": { } }
>
> [killing passt here]
>
> { "timestamp": { "seconds": 1739517243, "microseconds": 115081 },
>   "event": "NETDEV_VHOST_USER_DISCONNECTED",
>   "data": { "netdev-id": "netdev0" } }
>
> [automatic reconnection with reconnect-ms]
>
> { "timestamp": { "seconds": 1739517290, "microseconds": 343777 },
>   "event": "NETDEV_VHOST_USER_CONNECTED",
>   "data": { "netdev-id": "netdev0",
>             "info": { "frontend-open": true,
>                       "filename": "unix:",
>                       "label": "chr0" } } }
>
> Signed-off-by: Laurent Vivier <lvivier@redhat.com>

Standard question for events: if a management application misses an
event, say because it restarts and reconnects, is there a way to obtain
the missed information with a query command?
Laurent Vivier Feb. 14, 2025, 10:18 a.m. UTC | #3
On 14/02/2025 11:06, Markus Armbruster wrote:
> Laurent Vivier <lvivier@redhat.com> writes:
> 
>> The netdev reports NETDEV_VHOST_USER_CONNECTED event when
>> the chardev is connected, and NETDEV_VHOST_USER_DISCONNECTED
>> when it is disconnected.
>>
>> The NETDEV_VHOST_USER_CONNECTED event includes the ChardevInfo
>> (label, filename and frontend_open).
>>
>> This allows a system manager like libvirt to detect when the server
>> fails.
>>
>> For instance with passt:
>>
>> { 'execute': 'qmp_capabilities' }
>> { "return": { } }
>>
>> [killing passt here]
>>
>> { "timestamp": { "seconds": 1739517243, "microseconds": 115081 },
>>    "event": "NETDEV_VHOST_USER_DISCONNECTED",
>>    "data": { "netdev-id": "netdev0" } }
>>
>> [automatic reconnection with reconnect-ms]
>>
>> { "timestamp": { "seconds": 1739517290, "microseconds": 343777 },
>>    "event": "NETDEV_VHOST_USER_CONNECTED",
>>    "data": { "netdev-id": "netdev0",
>>              "info": { "frontend-open": true,
>>                        "filename": "unix:",
>>                        "label": "chr0" } } }
>>
>> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
> 
> Standard question for events: if a management application misses an
> event, say because it restarts and reconnects, is there a way to obtain
> the missed information with a query command?
> 

query-chardev could help but it doesn't provide the netdev id.

In HMP, "info network" has the information. For QMP, a query-netdev
command was tried in the past, but the series was reverted:

f9bb0c1f9862 ("Revert "qapi: net: Add query-netdev command"")
d32ad10a14d4 ("qapi: net: Add query-netdev command")

Thanks,
Laurent
Markus Armbruster Feb. 14, 2025, 1:54 p.m. UTC | #4
Laurent Vivier <lvivier@redhat.com> writes:

> On 14/02/2025 11:06, Markus Armbruster wrote:
>> Laurent Vivier <lvivier@redhat.com> writes:
>> 
>>> The netdev reports NETDEV_VHOST_USER_CONNECTED event when
>>> the chardev is connected, and NETDEV_VHOST_USER_DISCONNECTED
>>> when it is disconnected.
>>>
>>> The NETDEV_VHOST_USER_CONNECTED event includes the ChardevInfo
>>> (label, filename and frontend_open).
>>>
>>> This allows a system manager like libvirt to detect when the server
>>> fails.
>>>
>>> For instance with passt:
>>>
>>> { 'execute': 'qmp_capabilities' }
>>> { "return": { } }
>>>
>>> [killing passt here]
>>>
>>> { "timestamp": { "seconds": 1739517243, "microseconds": 115081 },
>>>    "event": "NETDEV_VHOST_USER_DISCONNECTED",
>>>    "data": { "netdev-id": "netdev0" } }
>>>
>>> [automatic reconnection with reconnect-ms]
>>>
>>> { "timestamp": { "seconds": 1739517290, "microseconds": 343777 },
>>>    "event": "NETDEV_VHOST_USER_CONNECTED",
>>>    "data": { "netdev-id": "netdev0",
>>>              "info": { "frontend-open": true,
>>>                        "filename": "unix:",
>>>                        "label": "chr0" } } }
>>>
>>> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
>> 
>> Standard question for events: if a management application misses an
>> event, say because it restarts and reconnects, is there a way to obtain
>> the missed information with a query command?
>
> query-chardev could help but it doesn't provide the netdev id.
>
> in HMP, "info network" has the information, but for QMP we had a try with a query-netdev in the past but the series has been reverted.
>
> f9bb0c1f9862 ("Revert "qapi: net: Add query-netdev command"")
> d32ad10a14d4 ("qapi: net: Add query-netdev command")

Hmm.  Can management applications use these events without a matching
query?
Daniel P. Berrangé Feb. 14, 2025, 1:59 p.m. UTC | #5
On Fri, Feb 14, 2025 at 11:18:55AM +0100, Laurent Vivier wrote:
> On 14/02/2025 11:06, Markus Armbruster wrote:
> > Laurent Vivier <lvivier@redhat.com> writes:
> > 
> > > The netdev reports NETDEV_VHOST_USER_CONNECTED event when
> > > the chardev is connected, and NETDEV_VHOST_USER_DISCONNECTED
> > > when it is disconnected.
> > > 
> > > The NETDEV_VHOST_USER_CONNECTED event includes the ChardevInfo
> > > (label, filename and frontend_open).
> > > 
> > > This allows a system manager like libvirt to detect when the server
> > > fails.
> > > 
> > > For instance with passt:
> > > 
> > > { 'execute': 'qmp_capabilities' }
> > > { "return": { } }
> > > 
> > > [killing passt here]
> > > 
> > > { "timestamp": { "seconds": 1739517243, "microseconds": 115081 },
> > >    "event": "NETDEV_VHOST_USER_DISCONNECTED",
> > >    "data": { "netdev-id": "netdev0" } }
> > > 
> > > [automatic reconnection with reconnect-ms]
> > > 
> > > { "timestamp": { "seconds": 1739517290, "microseconds": 343777 },
> > >    "event": "NETDEV_VHOST_USER_CONNECTED",
> > >    "data": { "netdev-id": "netdev0",
> > >              "info": { "frontend-open": true,
> > >                        "filename": "unix:",
> > >                        "label": "chr0" } } }
> > > 
> > > Signed-off-by: Laurent Vivier <lvivier@redhat.com>
> > 
> > Standard question for events: if a management application misses an
> > event, say because it restarts and reconnects, is there a way to obtain
> > the missed information with a query command?
> > 
> 
> query-chardev could help but it doesn't provide the netdev id.

It doesn't have to IMHO. The application that created the NIC should know
what ID it assigned to both the netdev and chardev, and thus should be
able to use query-chardev to identify the chardev it previously
associated with the netdev.


With regards,
Daniel
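
A rough sketch of the lookup described above, assuming the application
keeps its own record of which chardev it attached to which netdev.
struct nic_record, struct chardev_state and netdev_is_connected() are
hypothetical names. How "connected" is derived from a query-chardev
reply is deliberately left to the caller, since the thread does not
settle which field is the right signal.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* What the application recorded when it created the NIC (hypothetical). */
struct nic_record {
    const char *netdev_id;
    const char *chardev_label;
};

/* One query-chardev entry, reduced to what this sketch needs. The caller
 * decides how "connected" is derived from the reply (an assumption). */
struct chardev_state {
    const char *label;
    bool connected;
};

/* Returns 1 if connected, 0 if disconnected, -1 if the netdev or its
 * chardev is unknown. */
static int netdev_is_connected(const struct nic_record *nics, size_t n_nics,
                               const struct chardev_state *chardevs,
                               size_t n_chardevs, const char *netdev_id)
{
    for (size_t i = 0; i < n_nics; i++) {
        if (strcmp(nics[i].netdev_id, netdev_id) != 0) {
            continue;
        }
        /* Found the NIC: look up the chardev it was created with. */
        for (size_t j = 0; j < n_chardevs; j++) {
            if (strcmp(chardevs[j].label, nics[i].chardev_label) == 0) {
                return chardevs[j].connected ? 1 : 0;
            }
        }
        return -1;
    }
    return -1;
}
```

With this shape, a restarted application can rebuild the connection
state from a query instead of from the events it missed.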
Stefano Brivio Feb. 14, 2025, 2:13 p.m. UTC | #6
On Fri, 14 Feb 2025 14:54:33 +0100
Markus Armbruster <armbru@redhat.com> wrote:

> Laurent Vivier <lvivier@redhat.com> writes:
> 
> > On 14/02/2025 11:06, Markus Armbruster wrote:  
> >> Laurent Vivier <lvivier@redhat.com> writes:
> >>   
> >>> The netdev reports NETDEV_VHOST_USER_CONNECTED event when
> >>> the chardev is connected, and NETDEV_VHOST_USER_DISCONNECTED
> >>> when it is disconnected.
> >>>
> >>> The NETDEV_VHOST_USER_CONNECTED event includes the ChardevInfo
> >>> (label, filename and frontend_open).
> >>>
> >>> This allows a system manager like libvirt to detect when the server
> >>> fails.
> >>>
> >>> For instance with passt:
> >>>
> >>> { 'execute': 'qmp_capabilities' }
> >>> { "return": { } }
> >>>
> >>> [killing passt here]
> >>>
> >>> { "timestamp": { "seconds": 1739517243, "microseconds": 115081 },
> >>>    "event": "NETDEV_VHOST_USER_DISCONNECTED",
> >>>    "data": { "netdev-id": "netdev0" } }
> >>>
> >>> [automatic reconnection with reconnect-ms]
> >>>
> >>> { "timestamp": { "seconds": 1739517290, "microseconds": 343777 },
> >>>    "event": "NETDEV_VHOST_USER_CONNECTED",
> >>>    "data": { "netdev-id": "netdev0",
> >>>              "info": { "frontend-open": true,
> >>>                        "filename": "unix:",
> >>>                        "label": "chr0" } } }
> >>>
> >>> Signed-off-by: Laurent Vivier <lvivier@redhat.com>  
> >> 
> >> Standard question for events: if a management application misses an
> >> event, say because it restarts and reconnects, is there a way to obtain
> >> the missed information with a query command?  
> >
> > query-chardev could help but it doesn't provide the netdev id.
> >
> > in HMP, "info network" has the information, but for QMP we had a try with a query-netdev in the past but the series has been reverted.
> >
> > f9bb0c1f9862 ("Revert "qapi: net: Add query-netdev command"")
> > d32ad10a14d4 ("qapi: net: Add query-netdev command")  
> 
> Hmm.  Can management applications use these events without a matching
> query?

Yes, see https://lore.kernel.org/all/20250214095338.344063fa@elisabeth/
and the existing libvirt implementation for NETDEV_STREAM_DISCONNECTED.
Stefano Brivio Feb. 14, 2025, 2:18 p.m. UTC | #7
On Fri, 14 Feb 2025 13:59:20 +0000
Daniel P. Berrangé <berrange@redhat.com> wrote:

> On Fri, Feb 14, 2025 at 11:18:55AM +0100, Laurent Vivier wrote:
> > On 14/02/2025 11:06, Markus Armbruster wrote:  
> > > Laurent Vivier <lvivier@redhat.com> writes:
> > >   
> > > > The netdev reports NETDEV_VHOST_USER_CONNECTED event when
> > > > the chardev is connected, and NETDEV_VHOST_USER_DISCONNECTED
> > > > when it is disconnected.
> > > > 
> > > > The NETDEV_VHOST_USER_CONNECTED event includes the ChardevInfo
> > > > (label, filename and frontend_open).
> > > > 
> > > > This allows a system manager like libvirt to detect when the server
> > > > fails.
> > > > 
> > > > For instance with passt:
> > > > 
> > > > { 'execute': 'qmp_capabilities' }
> > > > { "return": { } }
> > > > 
> > > > [killing passt here]
> > > > 
> > > > { "timestamp": { "seconds": 1739517243, "microseconds": 115081 },
> > > >    "event": "NETDEV_VHOST_USER_DISCONNECTED",
> > > >    "data": { "netdev-id": "netdev0" } }
> > > > 
> > > > [automatic reconnection with reconnect-ms]
> > > > 
> > > > { "timestamp": { "seconds": 1739517290, "microseconds": 343777 },
> > > >    "event": "NETDEV_VHOST_USER_CONNECTED",
> > > >    "data": { "netdev-id": "netdev0",
> > > >              "info": { "frontend-open": true,
> > > >                        "filename": "unix:",
> > > >                        "label": "chr0" } } }
> > > > 
> > > > Signed-off-by: Laurent Vivier <lvivier@redhat.com>  
> > > 
> > > Standard question for events: if a management application misses an
> > > event, say because it restarts and reconnects, is there a way to obtain
> > > the missed information with a query command?
> > >   
> > 
> > query-chardev could help but it doesn't provide the netdev id.  
> 
> It doesn't have to IMHO. The application that created the NIC should know
> what ID it assigned to both the netdev and chardev, and thus should be
> able to use query-chardev to identify the chardev it previously
> associated with the netdev.

It does (of course?), see processNetdevStreamDisconnectedEvent() in
libvirt, src/qemu/qemu_driver.c:

static void
processNetdevStreamDisconnectedEvent(virDomainObj *vm,
                                     const char *netdevId)
{

[...]

    const char *devAlias = STRSKIP(netdevId, "host");

    /* The event sends us the "netdev-id", but we don't store the
     * netdev-id in the NetDef and thus can't use it to find the
     * correct NetDef. We *do* keep the device alias in the NetDef,
     * and by convention the netdev-id is always "host" + devAlias, so
     * we just need to remove "host" from the front of netdev-id to
     * get the alias, which we can then use to find the proper NetDef.
     */

[...]

    if (virDomainDefFindDevice(vm->def, devAlias, &dev, true) < 0) {
        VIR_WARN("NETDEV_STREAM_DISCONNECTED event received for non-existent device %s in domain %s",
                 devAlias, vm->def->name);
        goto endjob;
    }

[...]
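
The naming convention in the comment above can be illustrated in
isolation. This is a standalone sketch, not libvirt code: skip_prefix()
is a hypothetical helper that returns the remainder of the string when
the prefix matches and NULL otherwise, and the ids used with it are
examples only.

```c
#include <stddef.h>
#include <string.h>

/* Returns the part of s after prefix, or NULL if s does not start
 * with prefix. Hypothetical helper for illustration. */
static const char *skip_prefix(const char *s, const char *prefix)
{
    size_t n = strlen(prefix);

    return strncmp(s, prefix, n) == 0 ? s + n : NULL;
}

/* Maps a netdev-id of the form "host" + alias back to the alias.
 * Returns NULL when the id does not follow that convention. */
static const char *netdev_id_to_alias(const char *netdev_id)
{
    return skip_prefix(netdev_id, "host");
}
```

An id that does not follow the convention, such as the "netdev0" used
in the commit message example, yields NULL here, so a caller has to
handle that case before looking up the device.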
Daniel P. Berrangé Feb. 14, 2025, 2:19 p.m. UTC | #8
On Fri, Feb 14, 2025 at 01:59:20PM +0000, Daniel P. Berrangé wrote:
> On Fri, Feb 14, 2025 at 11:18:55AM +0100, Laurent Vivier wrote:
> > On 14/02/2025 11:06, Markus Armbruster wrote:
> > > Laurent Vivier <lvivier@redhat.com> writes:
> > > 
> > > > The netdev reports NETDEV_VHOST_USER_CONNECTED event when
> > > > the chardev is connected, and NETDEV_VHOST_USER_DISCONNECTED
> > > > when it is disconnected.
> > > > 
> > > > The NETDEV_VHOST_USER_CONNECTED event includes the ChardevInfo
> > > > (label, filename and frontend_open).
> > > > 
> > > > This allows a system manager like libvirt to detect when the server
> > > > fails.
> > > > 
> > > > For instance with passt:
> > > > 
> > > > { 'execute': 'qmp_capabilities' }
> > > > { "return": { } }
> > > > 
> > > > [killing passt here]
> > > > 
> > > > { "timestamp": { "seconds": 1739517243, "microseconds": 115081 },
> > > >    "event": "NETDEV_VHOST_USER_DISCONNECTED",
> > > >    "data": { "netdev-id": "netdev0" } }
> > > > 
> > > > [automatic reconnection with reconnect-ms]
> > > > 
> > > > { "timestamp": { "seconds": 1739517290, "microseconds": 343777 },
> > > >    "event": "NETDEV_VHOST_USER_CONNECTED",
> > > >    "data": { "netdev-id": "netdev0",
> > > >              "info": { "frontend-open": true,
> > > >                        "filename": "unix:",
> > > >                        "label": "chr0" } } }
> > > > 
> > > > Signed-off-by: Laurent Vivier <lvivier@redhat.com>
> > > 
> > > Standard question for events: if a management application misses an
> > > event, say because it restarts and reconnects, is there a way to obtain
> > > the missed information with a query command?
> > > 
> > 
> > query-chardev could help but it doesn't provide the netdev id.
> 
> It doesn't have to IMHO. The application that created the NIC should know
> what ID it assigned to both the netdev and chardev, and thus should be
> able to use query-chardev to identify the chardev it previously
> associated with the netdev.

That said I kind of wonder whether we should be adding events against
the chardev directly, instead of against the netdev use of chardev.

ie CHARDEV_OPENED / CHARDEV_CLOSED events. For OPENED it is trivially
equivalent. For CLOSED it is a little trickier, since this patch does
delayed event dispatch from a bottom-half. If the chardev code was
able to emit this event from a bottom half itself, that could be OK
though.

With regards,
Daniel
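
To make the "delayed event dispatch" point concrete, here is a toy
model of deferring work out of an event callback. It is not QEMU's
bottom-half API: defer(), run_deferred() and the counters are
hypothetical, and the queue is a fixed-size array to keep the sketch
short. The close callback only schedules the cleanup, and the event is
emitted later, after the cleanup has run.

```c
#include <stddef.h>
#include <string.h>

typedef void (*deferred_fn)(void *opaque);

struct deferred {
    deferred_fn fn;
    void *opaque;
};

#define MAX_DEFERRED 8
static struct deferred pending[MAX_DEFERRED];
static size_t n_pending;

/* Queue fn to run later; returns 0 on success, -1 if the queue is full. */
static int defer(deferred_fn fn, void *opaque)
{
    if (n_pending == MAX_DEFERRED) {
        return -1;
    }
    pending[n_pending].fn = fn;
    pending[n_pending].opaque = opaque;
    n_pending++;
    return 0;
}

/* Called from the (toy) main loop: run everything queued so far. */
static void run_deferred(void)
{
    struct deferred batch[MAX_DEFERRED];
    size_t n = n_pending;

    memcpy(batch, pending, n * sizeof(batch[0]));
    n_pending = 0;
    for (size_t i = 0; i < n; i++) {
        batch[i].fn(batch[i].opaque);
    }
}

static int cleanup_done;
static int events_sent;

/* Deferred part: finish the cleanup first, then report the event. */
static void closed_bh(void *opaque)
{
    (void)opaque;
    cleanup_done = 1;
    events_sent++;
}

/* Close callback: may run in the middle of a read/write, so it only
 * schedules the real work. */
static void on_chardev_closed(void)
{
    defer(closed_bh, NULL);
}
```

In this model a consumer never sees the event before the cleanup has
completed, which is the ordering a chardev-level CLOSED event would
also need to preserve.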
Stefano Brivio Feb. 14, 2025, 2:28 p.m. UTC | #9
On Fri, 14 Feb 2025 14:19:13 +0000
Daniel P. Berrangé <berrange@redhat.com> wrote:

> On Fri, Feb 14, 2025 at 01:59:20PM +0000, Daniel P. Berrangé wrote:
> > On Fri, Feb 14, 2025 at 11:18:55AM +0100, Laurent Vivier wrote:  
> > > On 14/02/2025 11:06, Markus Armbruster wrote:  
> > > > Laurent Vivier <lvivier@redhat.com> writes:
> > > >   
> > > > > The netdev reports NETDEV_VHOST_USER_CONNECTED event when
> > > > > the chardev is connected, and NETDEV_VHOST_USER_DISCONNECTED
> > > > > when it is disconnected.
> > > > > 
> > > > > The NETDEV_VHOST_USER_CONNECTED event includes the ChardevInfo
> > > > > (label, filename and frontend_open).
> > > > > 
> > > > > This allows a system manager like libvirt to detect when the server
> > > > > fails.
> > > > > 
> > > > > For instance with passt:
> > > > > 
> > > > > { 'execute': 'qmp_capabilities' }
> > > > > { "return": { } }
> > > > > 
> > > > > [killing passt here]
> > > > > 
> > > > > { "timestamp": { "seconds": 1739517243, "microseconds": 115081 },
> > > > >    "event": "NETDEV_VHOST_USER_DISCONNECTED",
> > > > >    "data": { "netdev-id": "netdev0" } }
> > > > > 
> > > > > [automatic reconnection with reconnect-ms]
> > > > > 
> > > > > { "timestamp": { "seconds": 1739517290, "microseconds": 343777 },
> > > > >    "event": "NETDEV_VHOST_USER_CONNECTED",
> > > > >    "data": { "netdev-id": "netdev0",
> > > > >              "info": { "frontend-open": true,
> > > > >                        "filename": "unix:",
> > > > >                        "label": "chr0" } } }
> > > > > 
> > > > > Signed-off-by: Laurent Vivier <lvivier@redhat.com>  
> > > > 
> > > > Standard question for events: if a management application misses an
> > > > event, say because it restarts and reconnects, is there a way to obtain
> > > > the missed information with a query command?
> > > >   
> > > 
> > > query-chardev could help but it doesn't provide the netdev id.  
> > 
> > It doesn't have to IMHO. The application that created the NIC should know
> > what ID it assigned to both the netdev and chardev, and thus should be
> > able to use query-chardev to identify the chardev it previously
> > associated with the netdev.  
> 
> That said I kind of wonder whether we should be adding events against
> the chardev directly, instead of against the netdev use of chardev.

What is the advantage? This already works and is consistent with the
existing non-vhost-user mechanism.
Daniel P. Berrangé Feb. 14, 2025, 2:31 p.m. UTC | #10
On Fri, Feb 14, 2025 at 03:28:34PM +0100, Stefano Brivio wrote:
> On Fri, 14 Feb 2025 14:19:13 +0000
> Daniel P. Berrangé <berrange@redhat.com> wrote:
> 
> > On Fri, Feb 14, 2025 at 01:59:20PM +0000, Daniel P. Berrangé wrote:
> > > On Fri, Feb 14, 2025 at 11:18:55AM +0100, Laurent Vivier wrote:  
> > > > On 14/02/2025 11:06, Markus Armbruster wrote:  
> > > > > Laurent Vivier <lvivier@redhat.com> writes:
> > > > >   
> > > > > > The netdev reports NETDEV_VHOST_USER_CONNECTED event when
> > > > > > the chardev is connected, and NETDEV_VHOST_USER_DISCONNECTED
> > > > > > when it is disconnected.
> > > > > > 
> > > > > > The NETDEV_VHOST_USER_CONNECTED event includes the ChardevInfo
> > > > > > (label, filename and frontend_open).
> > > > > > 
> > > > > > This allows a system manager like libvirt to detect when the server
> > > > > > fails.
> > > > > > 
> > > > > > For instance with passt:
> > > > > > 
> > > > > > { 'execute': 'qmp_capabilities' }
> > > > > > { "return": { } }
> > > > > > 
> > > > > > [killing passt here]
> > > > > > 
> > > > > > { "timestamp": { "seconds": 1739517243, "microseconds": 115081 },
> > > > > >    "event": "NETDEV_VHOST_USER_DISCONNECTED",
> > > > > >    "data": { "netdev-id": "netdev0" } }
> > > > > > 
> > > > > > [automatic reconnection with reconnect-ms]
> > > > > > 
> > > > > > { "timestamp": { "seconds": 1739517290, "microseconds": 343777 },
> > > > > >    "event": "NETDEV_VHOST_USER_CONNECTED",
> > > > > >    "data": { "netdev-id": "netdev0",
> > > > > >              "info": { "frontend-open": true,
> > > > > >                        "filename": "unix:",
> > > > > >                        "label": "chr0" } } }
> > > > > > 
> > > > > > Signed-off-by: Laurent Vivier <lvivier@redhat.com>  
> > > > > 
> > > > > Standard question for events: if a management application misses an
> > > > > event, say because it restarts and reconnects, is there a way to obtain
> > > > > the missed information with a query command?
> > > > >   
> > > > 
> > > > query-chardev could help but it doesn't provide the netdev id.  
> > > 
> > > It doesn't have to IMHO. The application that created the NIC should know
> > > what ID it assigned to both the netdev and chardev, and thus should be
> > > able to use query-chardev to identify the chardev it previously
> > > associated with the netdev.  
> > 
> > That said I kind of wonder whether we should be adding events against
> > the chardev directly, instead of against the netdev use of chardev.
> 
> What is the advantage? This already works and is consistent with the
> existing non-vhost-user mechanism.

The advantage is that chardevs are used in many places across QEMU, so
emitting the events against the chardev is more broadly useful to
consumers of QEMU than a special case that only benefits vhost-user
NICs.

With regards,
Daniel
Stefano Brivio Feb. 14, 2025, 2:38 p.m. UTC | #11
On Fri, 14 Feb 2025 14:31:33 +0000
Daniel P. Berrangé <berrange@redhat.com> wrote:

> On Fri, Feb 14, 2025 at 03:28:34PM +0100, Stefano Brivio wrote:
> > On Fri, 14 Feb 2025 14:19:13 +0000
> > Daniel P. Berrangé <berrange@redhat.com> wrote:
> >   
> > > On Fri, Feb 14, 2025 at 01:59:20PM +0000, Daniel P. Berrangé wrote:  
> > > > On Fri, Feb 14, 2025 at 11:18:55AM +0100, Laurent Vivier wrote:    
> > > > > On 14/02/2025 11:06, Markus Armbruster wrote:    
> > > > > > Laurent Vivier <lvivier@redhat.com> writes:
> > > > > >     
> > > > > > > The netdev reports NETDEV_VHOST_USER_CONNECTED event when
> > > > > > > the chardev is connected, and NETDEV_VHOST_USER_DISCONNECTED
> > > > > > > when it is disconnected.
> > > > > > > 
> > > > > > > The NETDEV_VHOST_USER_CONNECTED event includes the ChardevInfo
> > > > > > > (label, filename and frontend_open).
> > > > > > > 
> > > > > > > This allows a system manager like libvirt to detect when the server
> > > > > > > fails.
> > > > > > > 
> > > > > > > For instance with passt:
> > > > > > > 
> > > > > > > { 'execute': 'qmp_capabilities' }
> > > > > > > { "return": { } }
> > > > > > > 
> > > > > > > [killing passt here]
> > > > > > > 
> > > > > > > { "timestamp": { "seconds": 1739517243, "microseconds": 115081 },
> > > > > > >    "event": "NETDEV_VHOST_USER_DISCONNECTED",
> > > > > > >    "data": { "netdev-id": "netdev0" } }
> > > > > > > 
> > > > > > > [automatic reconnection with reconnect-ms]
> > > > > > > 
> > > > > > > { "timestamp": { "seconds": 1739517290, "microseconds": 343777 },
> > > > > > >    "event": "NETDEV_VHOST_USER_CONNECTED",
> > > > > > >    "data": { "netdev-id": "netdev0",
> > > > > > >              "info": { "frontend-open": true,
> > > > > > >                        "filename": "unix:",
> > > > > > >                        "label": "chr0" } } }
> > > > > > > 
> > > > > > > Signed-off-by: Laurent Vivier <lvivier@redhat.com>    
> > > > > > 
> > > > > > Standard question for events: if a management application misses an
> > > > > > event, say because it restarts and reconnects, is there a way to obtain
> > > > > > the missed information with a query command?
> > > > > >     
> > > > > 
> > > > > query-chardev could help but it doesn't provide the netdev id.    
> > > > 
> > > > It doesn't have to IMHO. The application that created the NIC should know
> > > > what ID it assigned to both the netdev and chardev, and thus should be
> > > > able to use query-chardev to identify the chardev it previously
> > > > associated with the netdev.    
> > > 
> > > That said I kind of wonder whether we should be adding events against
> > > the chardev directly, instead of against the netdev use of chardev.  
> > 
> > What is the advantage? This already works and is consistent with the
> > existing non-vhost-user mechanism.  
> 
> The advantage is that chardevs are used in many places across QEMU, so
> emitting the events against the chardev is more broadly useful to
> consumers of QEMU than a special case that only benefits vhost-user
> NICs.

It's not so much of a special case because it would be perfectly in
line with the NETDEV_STREAM_DISCONNECTED event and its usage in
applications (the only current user being libvirt).

That's a proven approach and the matching implementation for passt is
trivial because of that.
Laurent Vivier Feb. 17, 2025, 10:22 a.m. UTC | #12
On 14/02/2025 15:19, Daniel P. Berrangé wrote:
> On Fri, Feb 14, 2025 at 01:59:20PM +0000, Daniel P. Berrangé wrote:
>> On Fri, Feb 14, 2025 at 11:18:55AM +0100, Laurent Vivier wrote:
>>> On 14/02/2025 11:06, Markus Armbruster wrote:
>>>> Laurent Vivier <lvivier@redhat.com> writes:
>>>>
>>>>> The netdev reports NETDEV_VHOST_USER_CONNECTED event when
>>>>> the chardev is connected, and NETDEV_VHOST_USER_DISCONNECTED
>>>>> when it is disconnected.
>>>>>
>>>>> The NETDEV_VHOST_USER_CONNECTED event includes the ChardevInfo
>>>>> (label, filename and frontend_open).
>>>>>
>>>>> This allows a system manager like libvirt to detect when the server
>>>>> fails.
>>>>>
>>>>> For instance with passt:
>>>>>
>>>>> { 'execute': 'qmp_capabilities' }
>>>>> { "return": { } }
>>>>>
>>>>> [killing passt here]
>>>>>
>>>>> { "timestamp": { "seconds": 1739517243, "microseconds": 115081 },
>>>>>     "event": "NETDEV_VHOST_USER_DISCONNECTED",
>>>>>     "data": { "netdev-id": "netdev0" } }
>>>>>
>>>>> [automatic reconnection with reconnect-ms]
>>>>>
>>>>> { "timestamp": { "seconds": 1739517290, "microseconds": 343777 },
>>>>>     "event": "NETDEV_VHOST_USER_CONNECTED",
>>>>>     "data": { "netdev-id": "netdev0",
>>>>>               "info": { "frontend-open": true,
>>>>>                         "filename": "unix:",
>>>>>                         "label": "chr0" } } }
>>>>>
>>>>> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
>>>>
>>>> Standard question for events: if a management application misses an
>>>> event, say because it restarts and reconnects, is there a way to obtain
>>>> the missed information with a query command?
>>>>
>>>
>>> query-chardev could help but it doesn't provide the netdev id.
>>
>> It doesn't have to IMHO. The application that created the NIC should know
>> what ID it assigned to both the netdev and chardev, and thus should be
>> able to use query-chardev to identify the chardev it previously
>> associated with the netdev.
> 
> That said I kind of wonder whether we should be adding events against
> the chardev directly, instead of against the netdev use of chardev.

I don't know what is best from a libvirt point of view.

> 
> ie CHARDEV_OPENED / CHARDEV_CLOSED events. For OPENED it is trivially
> equivalent. For CLOSED it is a little trickier, since this patch does
> delayed event dispatch from a bottom-half. If the chardev code was
> able to emit this event from a bottom half itself, that could be OK
> though.

Do we really need to delay the event?
libvirt could check the chardev user is really down before restarting it?

Without bh it's very simple:

@@ -68,9 +69,11 @@ void qemu_chr_be_event(Chardev *s, QEMUChrEvent event)
      switch (event) {
          case CHR_EVENT_OPENED:
              s->be_open = 1;
+            qapi_event_send_chardev_connected(s->label);
              break;
          case CHR_EVENT_CLOSED:
              s->be_open = 0;
+            qapi_event_send_chardev_disconnected(s->label);
              break;
      case CHR_EVENT_BREAK:
      case CHR_EVENT_MUX_IN:

Thanks,
Laurent

> With regards,
> Daniel

Patch

diff --git a/qapi/char.json b/qapi/char.json
index f02b66c06b3e..a8094c86fb49 100644
--- a/qapi/char.json
+++ b/qapi/char.json
@@ -902,3 +902,46 @@ 
 { 'event': 'VSERPORT_CHANGE',
   'data': { 'id': 'str',
             'open': 'bool' } }
+
+##
+# @NETDEV_VHOST_USER_CONNECTED:
+#
+# Emitted when the vhost-user chardev is connected
+#
+# @netdev-id: QEMU netdev id that is connected
+#
+# @info: The chardev information
+#
+# Since: 10.0
+#
+# .. qmp-example::
+#
+#     <- { "timestamp": { "seconds": 1739469793, "microseconds": 683713 },
+#          "event": "NETDEV_VHOST_USER_CONNECTED",
+#          "data": { "netdev-id": "netdev0",
+#                   "info": { "frontend-open": true,
+#                             "filename": "unix:",
+#                             "label": "chr0" } } }
+#
+##
+{ 'event': 'NETDEV_VHOST_USER_CONNECTED',
+  'data': { 'netdev-id': 'str', 'info': 'ChardevInfo' } }
+
+##
+# @NETDEV_VHOST_USER_DISCONNECTED:
+#
+# Emitted when the vhost-user chardev is disconnected
+#
+# @netdev-id: QEMU netdev id that is disconnected
+#
+# Since: 10.0
+#
+# .. qmp-example::
+#
+#     <- { "timestamp": { "seconds": 1739469786, "microseconds": 822220 },
+#          "event": "NETDEV_VHOST_USER_DISCONNECTED",
+#          "data": { "netdev-id": "netdev0" } }
+#
+##
+{ 'event': 'NETDEV_VHOST_USER_DISCONNECTED',
+  'data': { 'netdev-id': 'str' } }
diff --git a/net/vhost-user.c b/net/vhost-user.c
index 12555518e838..841b94e785a4 100644
--- a/net/vhost-user.c
+++ b/net/vhost-user.c
@@ -16,6 +16,7 @@ 
 #include "chardev/char-fe.h"
 #include "qapi/error.h"
 #include "qapi/qapi-commands-net.h"
+#include "qapi/qapi-events-char.h"
 #include "qemu/config-file.h"
 #include "qemu/error-report.h"
 #include "qemu/option.h"
@@ -271,6 +272,7 @@  static void chr_closed_bh(void *opaque)
     if (err) {
         error_report_err(err);
     }
+    qapi_event_send_netdev_vhost_user_disconnected(name);
 }
 
 static void net_vhost_user_event(void *opaque, QEMUChrEvent event)
@@ -278,6 +280,7 @@  static void net_vhost_user_event(void *opaque, QEMUChrEvent event)
     const char *name = opaque;
     NetClientState *ncs[MAX_QUEUE_NUM];
     NetVhostUserState *s;
+    ChardevInfo info;
     Chardev *chr;
     Error *err = NULL;
     int queues;
@@ -300,6 +303,10 @@  static void net_vhost_user_event(void *opaque, QEMUChrEvent event)
                                          net_vhost_user_watch, s);
         qmp_set_link(name, true, &err);
         s->started = true;
+        info.label = chr->label;
+        info.filename = chr->filename;
+        info.frontend_open = chr->be && chr->be->fe_is_open;
+        qapi_event_send_netdev_vhost_user_connected(name, &info);
         break;
     case CHR_EVENT_CLOSED:
         /* a close event may happen during a read/write, but vhost