Message ID | 20190903134442.15653-1-pl@kamp.de (mailing list archive)
---|---
State | New, archived
Series | block/nfs: add support for nfs_umount
Am 03.09.2019 um 15:44 hat Peter Lieven geschrieben:
> libnfs recently added support for unmounting. Add support
> in Qemu too.
>
> Signed-off-by: Peter Lieven <pl@kamp.de>

Looks trivial enough to review even for me. :-)

Thanks, applied to the block branch.

Kevin
> Am 03.09.2019 um 16:56 schrieb Kevin Wolf <kwolf@redhat.com>:
>
> Am 03.09.2019 um 15:44 hat Peter Lieven geschrieben:
>> libnfs recently added support for unmounting. Add support
>> in Qemu too.
>>
>> Signed-off-by: Peter Lieven <pl@kamp.de>
>
> Looks trivial enough to review even for me. :-)
>
> Thanks, applied to the block branch.
>
> Kevin

I am not sure what the reason is, but with this patch I sometimes run
into nfs_process_read being called for a cdrom mounted from nfs after
I ejected it (and the whole nfs client context is already destroyed).

It seems that the following fixes the issue. It might be that the
nfs_umount call just reveals this bug. nfs_close is also doing sync
I/O with the libnfs library. If we mount an NFS share, we do everything
with sync calls and only then set up the aio handling with
nfs_set_events. I think we need to stop the aio handling before
executing the sync calls to nfs_close and nfs_umount. I am also not
sure whether holding the mutex is necessary here.

diff --git a/block/nfs.c b/block/nfs.c
index ffa6484c1a..cb2e0d375a 100644
--- a/block/nfs.c
+++ b/block/nfs.c
@@ -389,6 +389,10 @@ static void nfs_attach_aio_context(BlockDriverState *bs,
 static void nfs_client_close(NFSClient *client)
 {
     if (client->context) {
+        qemu_mutex_lock(&client->mutex);
+        aio_set_fd_handler(client->aio_context, nfs_get_fd(client->context),
+                           false, NULL, NULL, NULL, NULL);
+        qemu_mutex_unlock(&client->mutex);
         if (client->fh) {
             nfs_close(client->context, client->fh);
             client->fh = NULL;
@@ -396,8 +400,6 @@ static void nfs_client_close(NFSClient *client)
 #ifdef LIBNFS_FEATURE_UMOUNT
         nfs_umount(client->context);
 #endif
-        aio_set_fd_handler(client->aio_context, nfs_get_fd(client->context),
-                           false, NULL, NULL, NULL, NULL);
         nfs_destroy_context(client->context);
         client->context = NULL;
     }

Peter
Am 03.09.2019 um 21:52 hat Peter Lieven geschrieben:
>
> > Am 03.09.2019 um 16:56 schrieb Kevin Wolf <kwolf@redhat.com>:
> >
> > Am 03.09.2019 um 15:44 hat Peter Lieven geschrieben:
> >> libnfs recently added support for unmounting. Add support
> >> in Qemu too.
> >>
> >> Signed-off-by: Peter Lieven <pl@kamp.de>
> >
> > Looks trivial enough to review even for me. :-)
> >
> > Thanks, applied to the block branch.
> >
> > Kevin
>
> I am not sure what the reason is, but with this patch I sometimes run
> into nfs_process_read being called for a cdrom mounted from nfs after
> I ejected it (and the whole nfs client context is already destroyed).

Does this mean that nfs_umount() gets some response, but we don't
properly wait for it? Or is some older request still in flight?

> It seems that the following fixes the issue. It might be that the
> nfs_umount call just reveals this bug. nfs_close is also doing sync
> I/O with the libnfs library. If we mount a nfs share we are doing
> everything with sync calls and then set up the aio stuff with
> nfs_set_events. I think that we need to stop the aio before we are
> executing the sync calls to nfs_close and nfs_umount.

Disabling the fd handlers is probably just papering over the problem.
If there is something in flight, we should wait for its completion.

In theory, block devices are drained before calling .bdrv_close, so
either we have a bug with draining nfs nodes correctly, or .bdrv_close
initiates something new itself that we should wait for.

> Also not sure if holding the mutex is necessary here.

Not for aio_set_fd_handler(), but I don't know the requirements of
nfs_get_fd().

Kevin

> diff --git a/block/nfs.c b/block/nfs.c
> index ffa6484c1a..cb2e0d375a 100644
> --- a/block/nfs.c
> +++ b/block/nfs.c
> @@ -389,6 +389,10 @@ static void nfs_attach_aio_context(BlockDriverState *bs,
>  static void nfs_client_close(NFSClient *client)
>  {
>      if (client->context) {
> +        qemu_mutex_lock(&client->mutex);
> +        aio_set_fd_handler(client->aio_context, nfs_get_fd(client->context),
> +                           false, NULL, NULL, NULL, NULL);
> +        qemu_mutex_unlock(&client->mutex);
>          if (client->fh) {
>              nfs_close(client->context, client->fh);
>              client->fh = NULL;
> @@ -396,8 +400,6 @@ static void nfs_client_close(NFSClient *client)
>  #ifdef LIBNFS_FEATURE_UMOUNT
>          nfs_umount(client->context);
>  #endif
> -        aio_set_fd_handler(client->aio_context, nfs_get_fd(client->context),
> -                           false, NULL, NULL, NULL, NULL);
>          nfs_destroy_context(client->context);
>          client->context = NULL;
>      }
Am 04.09.19 um 11:34 schrieb Kevin Wolf:
> Am 03.09.2019 um 21:52 hat Peter Lieven geschrieben:
>>
>>> Am 03.09.2019 um 16:56 schrieb Kevin Wolf <kwolf@redhat.com>:
>>>
>>> Am 03.09.2019 um 15:44 hat Peter Lieven geschrieben:
>>>> libnfs recently added support for unmounting. Add support
>>>> in Qemu too.
>>>>
>>>> Signed-off-by: Peter Lieven <pl@kamp.de>
>>> Looks trivial enough to review even for me. :-)
>>>
>>> Thanks, applied to the block branch.
>>>
>>> Kevin
>> I am not sure what the reason is, but with this patch I sometimes run
>> into nfs_process_read being called for a cdrom mounted from nfs after
>> I ejected it (and the whole nfs client context is already destroyed).
> Does this mean that nfs_umount() gets some response, but we don't
> properly wait for it? Or is some older request still in flight?

nfs_umount itself is a sync call and should only terminate when the
call is done. But there is an independent I/O handler in that function
polling on the fd (wait_for_nfs_reply in libnfs-sync.c). This is why I
thought the right solution is to stop the Qemu I/O handler before
calling nfs_close and nfs_umount. nfs_close also uses this sync I/O
handler, but for some reason it seems not to cause trouble.

The other solution would be to use the async versions of close and
umount, but that would make the code in Qemu more complex.

Peter
On Thu, Sep 5, 2019 at 7:43 PM Peter Lieven <pl@kamp.de> wrote:
>
> Am 04.09.19 um 11:34 schrieb Kevin Wolf:
> > Am 03.09.2019 um 21:52 hat Peter Lieven geschrieben:
> >>
> >>> Am 03.09.2019 um 16:56 schrieb Kevin Wolf <kwolf@redhat.com>:
> >>>
> >>> Am 03.09.2019 um 15:44 hat Peter Lieven geschrieben:
> >>>> libnfs recently added support for unmounting. Add support
> >>>> in Qemu too.
> >>>>
> >>>> Signed-off-by: Peter Lieven <pl@kamp.de>
> >>> Looks trivial enough to review even for me. :-)
> >>>
> >>> Thanks, applied to the block branch.
> >>>
> >>> Kevin
> >> I am not sure what the reason is, but with this patch I sometimes run
> >> into nfs_process_read being called for a cdrom mounted from nfs after
> >> I ejected it (and the whole nfs client context is already destroyed).
> > Does this mean that nfs_umount() gets some response, but we don't
> > properly wait for it? Or is some older request still in flight?
>
> nfs_umount itself is a sync call and should only terminate when the
> call is done. But there is an independent I/O handler in that function
> polling on the fd (wait_for_nfs_reply in libnfs-sync.c). This is why I
> thought the right solution is to stop the Qemu I/O handler before
> calling nfs_close and nfs_umount. nfs_close also uses this sync I/O
> handler, but for some reason it seems not to make trouble.
>
> The other solution would be to use the async versions of close and
> umount, but that would make the code in Qemu more complex.

NFS umount is pretty messy, so I think you should continue using the
sync version. In NFSv3 (there is no mount protocol in v4) the Mount
call (fetch the root filehandle) and the Umount call (tell the server
we should no longer show up in "showmount -a" output) are not part of
the NFS protocol but a different service running on a separate port.
This does not map well to libnfs since it is centered around a
"struct nfs_context".

To use nfs_umount() from QEMU I would suggest:

1. Make sure all commands in flight have finished, because you will
   soon disconnect from the NFS server and will never receive any
   in-flight responses.

2. Unregister the nfs->fh filedescriptor from your event system,
   because the fd is about to be closed and there is a great chance it
   will be recycled for a completely different purpose if you open any
   other files from qemu.

3. Call nfs_umount(). Internally this will close the socket to the NFS
   server, then go through the process of opening a new socket to the
   portmapper to discover the mount server, then close that socket and
   connect a new socket again to the mount server and perform the UMNT
   call.

ronnie sahlberg
Am 05.09.19 um 12:05 schrieb ronnie sahlberg:
> On Thu, Sep 5, 2019 at 7:43 PM Peter Lieven <pl@kamp.de> wrote:
>> Am 04.09.19 um 11:34 schrieb Kevin Wolf:
>>> Am 03.09.2019 um 21:52 hat Peter Lieven geschrieben:
>>>>> Am 03.09.2019 um 16:56 schrieb Kevin Wolf <kwolf@redhat.com>:
>>>>>
>>>>> Am 03.09.2019 um 15:44 hat Peter Lieven geschrieben:
>>>>>> libnfs recently added support for unmounting. Add support
>>>>>> in Qemu too.
>>>>>>
>>>>>> Signed-off-by: Peter Lieven <pl@kamp.de>
>>>>> Looks trivial enough to review even for me. :-)
>>>>>
>>>>> Thanks, applied to the block branch.
>>>>>
>>>>> Kevin
>>>> I am not sure what the reason is, but with this patch I sometimes run
>>>> into nfs_process_read being called for a cdrom mounted from nfs after
>>>> I ejected it (and the whole nfs client context is already destroyed).
>>> Does this mean that nfs_umount() gets some response, but we don't
>>> properly wait for it? Or is some older request still in flight?
>>
>> nfs_umount itself is a sync call and should only terminate when the
>> call is done. But there is an independent I/O handler in that function
>> polling on the fd (wait_for_nfs_reply in libnfs-sync.c). This is why I
>> thought the right solution is to stop the Qemu I/O handler before
>> calling nfs_close and nfs_umount. nfs_close also uses this sync I/O
>> handler, but for some reason it seems not to make trouble.
>>
>> The other solution would be to use the async versions of close and
>> umount, but that would make the code in Qemu more complex.
>
> NFS umount is pretty messy so I think you should continue using the
> sync version. In NFSv3 (there is no mount protocol in v4) the Mount
> call (fetch the root filehandle) and the Umount call (tell the server
> we should no longer show up in "showmount -a" output) are not part of
> the NFS protocol but a different service running on a separate port.
>
> This does not map well to libnfs since it is centered around a
> "struct nfs_context".
>
> To use nfs_umount() from QEMU I would suggest:
> 1. Make sure all commands in flight have finished, because you will
>    soon disconnect from the NFS server and will never receive any
>    in-flight responses.
> 2. Unregister the nfs->fh filedescriptor from your event system,
>    because the fd is about to be closed and there is a great chance it
>    will be recycled for a completely different purpose if you open any
>    other files from qemu.
> 3. Call nfs_umount(). Internally this will close the socket to the NFS
>    server, then go through the process of opening a new socket to the
>    portmapper to discover the mount server, then close that socket and
>    connect a new socket again to the mount server and perform the UMNT
>    call.

What we currently do in Qemu is:

1) bdrv_drain
2) bdrv_close, which in the end calls nfs_client_close from block/nfs.c.

There we call:

2a) nfs_close(client->fh)
2b) aio_set_fd_handler(NULL)
2c) nfs_destroy_context(client->context)

My first patch added a nfs_umount between 2a) and 2b) so that we have:

2a) nfs_close(client->fh)
2b) nfs_umount(client->context)
2c) aio_set_fd_handler(NULL)
2d) nfs_destroy_context(client->context)

This leads to triggering the assertion for an uninitialized
client->mutex, which is raised from an invocation of nfs_process_read
after nfs_destroy_context was called.

If I change the order as follows, I see no more assertions:

2a) aio_set_fd_handler(NULL)
2b) nfs_close(client->fh)
2c) nfs_umount(client->context)
2d) nfs_destroy_context(client->context)

I think we should have done this in the first place, because nfs_close
(and nfs_umount) poll on the nfs_fd in parallel if we use the sync
calls.

Peter
On Thu, Sep 5, 2019 at 8:16 PM Peter Lieven <pl@kamp.de> wrote:
>
> Am 05.09.19 um 12:05 schrieb ronnie sahlberg:
> > On Thu, Sep 5, 2019 at 7:43 PM Peter Lieven <pl@kamp.de> wrote:
> [...]
> > To use nfs_umount() from QEMU I would suggest:
> > 1. Make sure all commands in flight have finished, because you will
> >    soon disconnect from the NFS server and will never receive any
> >    in-flight responses.
> > 2. Unregister the nfs->fh filedescriptor from your event system,
> >    because the fd is about to be closed and there is a great chance
> >    it will be recycled for a completely different purpose if you open
> >    any other files from qemu.
> > 3. Call nfs_umount(). Internally this will close the socket to the
> >    NFS server, then go through the process of opening a new socket to
> >    the portmapper to discover the mount server, then close that
> >    socket and connect a new socket again to the mount server and
> >    perform the UMNT call.
>
> What we currently do in Qemu is:
>
> 1) bdrv_drain
> 2) bdrv_close, which in the end calls nfs_client_close from block/nfs.c.
>
> There we call:
>
> 2a) nfs_close(client->fh)
> 2b) aio_set_fd_handler(NULL)
> 2c) nfs_destroy_context(client->context)
>
> My first patch added a nfs_umount between 2a) and 2b) so that we have:
>
> 2a) nfs_close(client->fh)
> 2b) nfs_umount(client->context)
> 2c) aio_set_fd_handler(NULL)
> 2d) nfs_destroy_context(client->context)
>
> This leads to triggering the assertion for an uninitialized
> client->mutex, which is raised from an invocation of nfs_process_read
> after nfs_destroy_context was called.
>
> If I change the order as follows, I see no more assertions:
>
> 2a) aio_set_fd_handler(NULL)
> 2b) nfs_close(client->fh)
> 2c) nfs_umount(client->context)
> 2d) nfs_destroy_context(client->context)

That makes sense and looks correct to me.

> I think we should have done this in the first place, because nfs_close
> (and nfs_umount) poll on the nfs_fd in parallel if we use the sync
> calls.
>
> Peter
Am 05.09.19 um 12:28 schrieb ronnie sahlberg:
> On Thu, Sep 5, 2019 at 8:16 PM Peter Lieven <pl@kamp.de> wrote:
>> Am 05.09.19 um 12:05 schrieb ronnie sahlberg:
> [...]
>>
>> If I change the order as following I see no more assertions:
>>
>> 2a) aio_set_fd_handler(NULL)
>> 2b) nfs_close(client->fh)
>> 2c) nfs_umount(client->context)
>> 2d) nfs_destroy_context(client->context)
>
> That makes sense and looks correct to me.

I am unsure whether we should acquire the client->mutex before we
fiddle with the aio handler, because in theory something could still
happen on the nfs_fd in the background.

Best,
Peter
diff --git a/block/nfs.c b/block/nfs.c
index 0ec50953e4..9d30963fd8 100644
--- a/block/nfs.c
+++ b/block/nfs.c
@@ -1,7 +1,7 @@
 /*
  * QEMU Block driver for native access to files on NFS shares
  *
- * Copyright (c) 2014-2017 Peter Lieven <pl@kamp.de>
+ * Copyright (c) 2014-2019 Peter Lieven <pl@kamp.de>
  *
  * Permission is hereby granted, free of charge, to any person obtaining a copy
  * of this software and associated documentation files (the "Software"), to deal
@@ -394,6 +394,9 @@ static void nfs_client_close(NFSClient *client)
             nfs_close(client->context, client->fh);
             client->fh = NULL;
         }
+#ifdef LIBNFS_FEATURE_UMOUNT
+        nfs_umount(client->context);
+#endif
         aio_set_fd_handler(client->aio_context, nfs_get_fd(client->context),
                            false, NULL, NULL, NULL, NULL);
         nfs_destroy_context(client->context);
libnfs recently added support for unmounting. Add support
in Qemu too.

Signed-off-by: Peter Lieven <pl@kamp.de>
---
 block/nfs.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)