Message ID | 1394284964-12997-1-git-send-email-steved@redhat.com (mailing list archive) |
---|---|
State | New, archived |
On Sat, 8 Mar 2014 08:22:44 -0500 Steve Dickson <steved@redhat.com> wrote:

> Modern-day kernels no longer return all timeout
> errors; instead, the process spins endlessly in the kernel.
> This behavior causes the foreground mount to hang, never
> allowing the mount to go into the background.
>
> So this patch eliminates the foreground mount attempt, causing
> background mounts to go directly into the background.
>
> Signed-off-by: Steve Dickson <steved@redhat.com>

Did NFS mounts *ever* time out (when 'soft' isn't given)?

If so, there is probably a regression which maybe should be fixed.

If not, then this has always been a bug since string-options were introduced
and the kernel started doing the mountd filehandle lookup...

So I'm probably OK with the patch but I wonder if there is more of a story
behind this that we should be sure we understand...

Thanks,
NeilBrown

> ---
>  utils/mount/stropts.c |   31 ++++++++-----------------------
>  1 files changed, 8 insertions(+), 23 deletions(-)
>
> diff --git a/utils/mount/stropts.c b/utils/mount/stropts.c
> index a642394..92a7245 100644
> --- a/utils/mount/stropts.c
> +++ b/utils/mount/stropts.c
> @@ -913,28 +913,6 @@ static int nfsmount_fg(struct nfsmount_info *mi)
>  }
>
>  /*
> - * Handle "background" NFS mount [first try]
> - *
> - * Returns a valid mount command exit code.
> - *
> - * EX_BG should cause the caller to fork and invoke nfsmount_child.
> - */
> -static int nfsmount_parent(struct nfsmount_info *mi)
> -{
> -	if (nfs_try_mount(mi))
> -		return EX_SUCCESS;
> -
> -	/* retry background mounts when the server is not up */
> -	if (nfs_is_permanent_error(errno) && errno != EOPNOTSUPP) {
> -		mount_error(mi->spec, mi->node, errno);
> -		return EX_FAIL;
> -	}
> -
> -	sys_mount_errors(mi->hostname, errno, 1, 1);
> -	return EX_BG;
> -}
> -
> -/*
>   * Handle "background" NFS mount [retry daemon]
>   *
>   * Returns a valid mount command exit code: EX_SUCCESS if successful,
> @@ -982,7 +960,14 @@ static int nfsmount_child(struct nfsmount_info *mi)
>  static int nfsmount_bg(struct nfsmount_info *mi)
>  {
>  	if (!mi->child)
> -		return nfsmount_parent(mi);
> +		/*
> +		 * Modern day kernels no longer return all
> +		 * timeouts errors in all cases, instead
> +		 * the process spins in the kernel, which
> +		 * will hang a foreground mount. So background
> +		 * mounts have to go directly into background
> +		 */
> +		return EX_BG;
>  	else
>  		return nfsmount_child(mi);
>  }
On Mon, 10 Mar 2014 10:18:16 +1100 NeilBrown <neilb@suse.de> wrote:

> On Sat, 8 Mar 2014 08:22:44 -0500 Steve Dickson <steved@redhat.com> wrote:
>
> > Modern-day kernels no longer return all timeout
> > errors; instead, the process spins endlessly in the kernel.
> > This behavior causes the foreground mount to hang, never
> > allowing the mount to go into the background.
> >
> > So this patch eliminates the foreground mount attempt, causing
> > background mounts to go directly into the background.
> >
> > Signed-off-by: Steve Dickson <steved@redhat.com>
>
> Did NFS mounts *ever* time out (when 'soft' isn't given)?
>
> If so, there is probably a regression which maybe should be fixed.
>
> If not, then this has always been a bug since string-options were introduced
> and the kernel started doing the mountd filehandle lookup...
>
> So I'm probably OK with the patch but I wonder if there is more of a story
> behind this that we should be sure we understand...
>
> Thanks,
> NeilBrown

Sorry, I really should read email in chronological order, shouldn't I :-)

The foregoing discussion seemed to focus on NFSv4.  Does NFSv3 behave the
same way?  A quick test suggests that it doesn't.
    mount -o bg,vers=3 server:/path /mnt
backgrounds as it should after a few seconds.

So I suspect this patch should be version dependent?

However falling back from v4 (which we leave entirely to the kernel) to v3
could be a bit awkward.

I think that an NFSv4 mount really does need to time out - whether that
happens in the kernel or through the use of "alarm()" doesn't really bother
me.  But if it doesn't time out, then it is violating the documentation and
ignoring the retry= setting.

NeilBrown
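For illustration only, here is a minimal sketch of the alarm()-based timeout mentioned above. It is not taken from nfs-utils: `call_with_timeout`, `on_alarm` and `blocking_op` are hypothetical names, and `pause()` stands in for a mount attempt that never returns. It assumes the blocked call is interruptible by a signal; whether the NFS mount path in the kernel actually returns on SIGALRM is exactly the open question in this thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Set by the handler so a timeout can be told apart from other signals. */
static volatile sig_atomic_t alarm_fired;

/* The handler only records the event; its delivery makes the syscall fail with EINTR. */
static void on_alarm(int signo)
{
	(void)signo;
	alarm_fired = 1;
}

/*
 * Hypothetical helper: run a blocking operation with a deadline.
 * Returns the operation's result, or -1 with errno set to ETIMEDOUT
 * when SIGALRM interrupted it first.
 */
static int call_with_timeout(int (*op)(void *), void *arg,
			     unsigned int timeout_sec)
{
	struct sigaction sa, old_sa;
	int ret, saved_errno;

	assert(op != NULL && timeout_sec > 0);

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_alarm;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = 0;	/* no SA_RESTART: let the call fail with EINTR */
	if (sigaction(SIGALRM, &sa, &old_sa) < 0)
		return -1;

	alarm_fired = 0;
	alarm(timeout_sec);
	ret = op(arg);
	saved_errno = errno;
	alarm(0);				/* cancel a still-pending alarm */
	sigaction(SIGALRM, &old_sa, NULL);	/* restore the old handler */

	if (ret < 0 && saved_errno == EINTR && alarm_fired)
		saved_errno = ETIMEDOUT;
	errno = saved_errno;
	return ret;
}

/* Stand-in for a mount attempt that hangs: block until a signal arrives. */
static int blocking_op(void *arg)
{
	(void)arg;
	return pause();
}
```

A caller would invoke `call_with_timeout(blocking_op, NULL, 1)` and treat a return of -1 with errno equal to ETIMEDOUT as the point at which to stop waiting in the foreground.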
Hey Neil,

On 03/09/2014 07:43 PM, NeilBrown wrote:
> On Mon, 10 Mar 2014 10:18:16 +1100 NeilBrown <neilb@suse.de> wrote:
>
>> On Sat, 8 Mar 2014 08:22:44 -0500 Steve Dickson <steved@redhat.com> wrote:
>>
>>> Modern-day kernels no longer return all timeout
>>> errors; instead, the process spins endlessly in the kernel.
>>> This behavior causes the foreground mount to hang, never
>>> allowing the mount to go into the background.
>>>
>>> So this patch eliminates the foreground mount attempt, causing
>>> background mounts to go directly into the background.
>>>
>>> Signed-off-by: Steve Dickson <steved@redhat.com>
>>
>> Did NFS mounts *ever* time out (when 'soft' isn't given)?
>>
>> If so, there is probably a regression which maybe should be fixed.
>>
>> If not, then this has always been a bug since string-options were introduced
>> and the kernel started doing the mountd filehandle lookup...
>>
>> So I'm probably OK with the patch but I wonder if there is more of a story
>> behind this that we should be sure we understand...
>>
>> Thanks,
>> NeilBrown
>
> Sorry, I really should read email in chronological order, shouldn't I :-)
>
> The foregoing discussion seemed to focus on NFSv4.  Does NFSv3 behave the
> same way?  A quick test suggests that it doesn't.
>     mount -o bg,vers=3 server:/path /mnt
> backgrounds as it should after a few seconds.
Right... After the first RPC timeout the kernel returns ETIMEDOUT
and the mount goes into background... This is what I was
trying to simulate with my first set of patches....

>
> So I suspect this patch should be version dependent?
It could be, but I'm not sure it's necessary.

>
> However falling back from v4 (which we leave entirely to the kernel) to v3
> could be a bit awkward.
Why? Being in foreground or background should not affect
any kind of mount behavior.

>
> I think that an NFSv4 mount really does need to time out - whether that
> happens in the kernel or through the use of "alarm()" doesn't really bother
> me.  But if it doesn't time out, then it is violating the documentation and
> ignoring the retry= setting.
Yes... retry= does talk about different timeouts on fg and bg mounts....
I guess if the mount goes directly into bg, the first timeout
is not tried...

At the end of the day, I too would prefer the kernel return timeouts
like it once did... but unfortunately that is not going to happen.
The new way is to spin in the kernel and hang the process... forever...
which is unfortunate... IMHO...

The only thing I have against using alarm() is that in the past NFS and
signals have not always played nice together... Interrupting the kernel in
critical parts of the code is always dicey at best... but maybe this is no
longer the case.

One question I do have: does anybody know why the first foreground mount
was even tried before going into background? Since bg mounts are usually
used in fstabs, going directly into background would definitely speed up
mounts during booting...

steved.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
On Mar 8, 2014, at 8:22, Steve Dickson <SteveD@redhat.com> wrote:

> Modern-day kernels no longer return all timeout
> errors; instead, the process spins endlessly in the kernel.
> This behavior causes the foreground mount to hang, never
> allowing the mount to go into the background.
>
> So this patch eliminates the foreground mount attempt, causing
> background mounts to go directly into the background.
>
> Signed-off-by: Steve Dickson <steved@redhat.com>
> ---
>  utils/mount/stropts.c |   31 ++++++++-----------------------
>  1 files changed, 8 insertions(+), 23 deletions(-)
>
> diff --git a/utils/mount/stropts.c b/utils/mount/stropts.c
> index a642394..92a7245 100644
> --- a/utils/mount/stropts.c
> +++ b/utils/mount/stropts.c
> @@ -913,28 +913,6 @@ static int nfsmount_fg(struct nfsmount_info *mi)
>  }
>
>  /*
> - * Handle "background" NFS mount [first try]
> - *
> - * Returns a valid mount command exit code.
> - *
> - * EX_BG should cause the caller to fork and invoke nfsmount_child.
> - */
> -static int nfsmount_parent(struct nfsmount_info *mi)
> -{
> -	if (nfs_try_mount(mi))
> -		return EX_SUCCESS;
> -
> -	/* retry background mounts when the server is not up */
> -	if (nfs_is_permanent_error(errno) && errno != EOPNOTSUPP) {
> -		mount_error(mi->spec, mi->node, errno);
> -		return EX_FAIL;
> -	}
> -
> -	sys_mount_errors(mi->hostname, errno, 1, 1);
> -	return EX_BG;
> -}
> -
> -/*
>   * Handle "background" NFS mount [retry daemon]
>   *
>   * Returns a valid mount command exit code: EX_SUCCESS if successful,
> @@ -982,7 +960,14 @@ static int nfsmount_child(struct nfsmount_info *mi)
>  static int nfsmount_bg(struct nfsmount_info *mi)
>  {
>  	if (!mi->child)
> -		return nfsmount_parent(mi);
> +		/*
> +		 * Modern day kernels no longer return all
> +		 * timeouts errors in all cases, instead
> +		 * the process spins in the kernel, which
> +		 * will hang a foreground mount. So background
> +		 * mounts have to go directly into background
> +		 */
> +		return EX_BG;
>  	else
>  		return nfsmount_child(mi);
>  }

Hi Steve,

Doesn’t this mean that ‘mount.nfs’ will no longer attempt to wait for the
mount to complete? That’s why I suggested having the parent set a timer, and
then waiting for whichever comes first out of SIGCHLD or SIGALRM (indicating
either that the child mount process is done mounting or that the timeout
occurred).

Cheers
  Trond
On 03/11/2014 05:13 PM, Trond Myklebust wrote:
>
> On Mar 8, 2014, at 8:22, Steve Dickson <SteveD@redhat.com> wrote:
>
>> Modern-day kernels no longer return all timeout
>> errors; instead, the process spins endlessly in the kernel.
>> This behavior causes the foreground mount to hang, never
>> allowing the mount to go into the background.
>>
>> So this patch eliminates the foreground mount attempt, causing
>> background mounts to go directly into the background.
>>
>> Signed-off-by: Steve Dickson <steved@redhat.com>
>> ---
>>  utils/mount/stropts.c |   31 ++++++++-----------------------
>>  1 files changed, 8 insertions(+), 23 deletions(-)
>>
>> diff --git a/utils/mount/stropts.c b/utils/mount/stropts.c
>> index a642394..92a7245 100644
>> --- a/utils/mount/stropts.c
>> +++ b/utils/mount/stropts.c
>> @@ -913,28 +913,6 @@ static int nfsmount_fg(struct nfsmount_info *mi)
>>  }
>>
>>  /*
>> - * Handle "background" NFS mount [first try]
>> - *
>> - * Returns a valid mount command exit code.
>> - *
>> - * EX_BG should cause the caller to fork and invoke nfsmount_child.
>> - */
>> -static int nfsmount_parent(struct nfsmount_info *mi)
>> -{
>> -	if (nfs_try_mount(mi))
>> -		return EX_SUCCESS;
>> -
>> -	/* retry background mounts when the server is not up */
>> -	if (nfs_is_permanent_error(errno) && errno != EOPNOTSUPP) {
>> -		mount_error(mi->spec, mi->node, errno);
>> -		return EX_FAIL;
>> -	}
>> -
>> -	sys_mount_errors(mi->hostname, errno, 1, 1);
>> -	return EX_BG;
>> -}
>> -
>> -/*
>>   * Handle "background" NFS mount [retry daemon]
>>   *
>>   * Returns a valid mount command exit code: EX_SUCCESS if successful,
>> @@ -982,7 +960,14 @@ static int nfsmount_child(struct nfsmount_info *mi)
>>  static int nfsmount_bg(struct nfsmount_info *mi)
>>  {
>>  	if (!mi->child)
>> -		return nfsmount_parent(mi);
>> +		/*
>> +		 * Modern day kernels no longer return all
>> +		 * timeouts errors in all cases, instead
>> +		 * the process spins in the kernel, which
>> +		 * will hang a foreground mount. So background
>> +		 * mounts have to go directly into background
>> +		 */
>> +		return EX_BG;
>>  	else
>>  		return nfsmount_child(mi);
>>  }
>
> Hi Steve,
>
> Doesn’t this mean that ‘mount.nfs’ will no longer attempt to wait
> for the mount to complete?
Yes. The foreground will no longer be tried....

> That’s why I suggested having the parent set a timer, and then
> waiting for whichever comes first out of SIGCHLD or SIGALRM (indicating
> either that the child mount process is done mounting
> or that the timeout occurred).
Why wait? Kernels today no longer return on timeouts or ECONNREFUSED,
so instead of having the foreground mounts hang forever why not
just let the background mounts hang, forever?

steved.
On Mar 11, 2014, at 18:56, Steve Dickson <SteveD@redhat.com> wrote:

>
>
> On 03/11/2014 05:13 PM, Trond Myklebust wrote:
>>
>> On Mar 8, 2014, at 8:22, Steve Dickson <SteveD@redhat.com> wrote:
>>
>>> Modern-day kernels no longer return all timeout
>>> errors; instead, the process spins endlessly in the kernel.
>>> This behavior causes the foreground mount to hang, never
>>> allowing the mount to go into the background.
>>>
>>> So this patch eliminates the foreground mount attempt, causing
>>> background mounts to go directly into the background.
>>>
>>> Signed-off-by: Steve Dickson <steved@redhat.com>
>>> ---
>>>  utils/mount/stropts.c |   31 ++++++++-----------------------
>>>  1 files changed, 8 insertions(+), 23 deletions(-)
>>>
>>> diff --git a/utils/mount/stropts.c b/utils/mount/stropts.c
>>> index a642394..92a7245 100644
>>> --- a/utils/mount/stropts.c
>>> +++ b/utils/mount/stropts.c
>>> @@ -913,28 +913,6 @@ static int nfsmount_fg(struct nfsmount_info *mi)
>>>  }
>>>
>>>  /*
>>> - * Handle "background" NFS mount [first try]
>>> - *
>>> - * Returns a valid mount command exit code.
>>> - *
>>> - * EX_BG should cause the caller to fork and invoke nfsmount_child.
>>> - */
>>> -static int nfsmount_parent(struct nfsmount_info *mi)
>>> -{
>>> -	if (nfs_try_mount(mi))
>>> -		return EX_SUCCESS;
>>> -
>>> -	/* retry background mounts when the server is not up */
>>> -	if (nfs_is_permanent_error(errno) && errno != EOPNOTSUPP) {
>>> -		mount_error(mi->spec, mi->node, errno);
>>> -		return EX_FAIL;
>>> -	}
>>> -
>>> -	sys_mount_errors(mi->hostname, errno, 1, 1);
>>> -	return EX_BG;
>>> -}
>>> -
>>> -/*
>>>   * Handle "background" NFS mount [retry daemon]
>>>   *
>>>   * Returns a valid mount command exit code: EX_SUCCESS if successful,
>>> @@ -982,7 +960,14 @@ static int nfsmount_child(struct nfsmount_info *mi)
>>>  static int nfsmount_bg(struct nfsmount_info *mi)
>>>  {
>>>  	if (!mi->child)
>>> -		return nfsmount_parent(mi);
>>> +		/*
>>> +		 * Modern day kernels no longer return all
>>> +		 * timeouts errors in all cases, instead
>>> +		 * the process spins in the kernel, which
>>> +		 * will hang a foreground mount. So background
>>> +		 * mounts have to go directly into background
>>> +		 */
>>> +		return EX_BG;
>>>  	else
>>>  		return nfsmount_child(mi);
>>>  }
>>
>> Hi Steve,
>>
>> Doesn’t this mean that ‘mount.nfs’ will no longer attempt to wait
>> for the mount to complete?
> Yes. The foreground will no longer be tried....
>
>> That’s why I suggested having the parent set a timer, and then
>> waiting for whichever comes first out of SIGCHLD or SIGALRM (indicating
>> either that the child mount process is done mounting
>> or that the timeout occurred).
> Why wait? Kernels today no longer return on timeouts or ECONNREFUSED,
> so instead of having the foreground mounts hang forever why not
> just let the background mounts hang, forever?

That’s the point; we would let the backgrounded mount.nfs hang for as long as
it takes. However, the foreground ‘mount.nfs’ process would exit after the
user-specified timeout.
The reason for wanting to wait is so that the ‘bg’ entries in /etc/fstab are either known to be fully mounted, or timed out, before we allow the ‘mount remote filesystems’ init process to complete. If not, the bootup will be unpredictable, with nobody being able to rely on the mounts being present even in situations where an ‘fg’ mount would succeed instantly.
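As a rough illustration of the scheme described above (fork the mount attempt, then have the parent wait for either the child's exit or a deadline), here is a minimal sketch. It is an assumption-laden example, not nfs-utils code: `run_with_deadline`, `enum wait_result`, `quick_child` and `slow_child` are hypothetical names, the child functions stand in for the real mount attempt, and `sigtimedwait()` is swapped in for the SIGCHLD/SIGALRM pair so that no signal handler is needed. It relies on Linux keeping a blocked SIGCHLD pending even with the default disposition.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical result codes; these are not the nfs-utils EX_* values. */
enum wait_result { CHILD_OK, CHILD_FAILED, CHILD_STILL_RUNNING, WAIT_ERROR };

/*
 * Fork child_fn() and wait up to timeout_sec for it to finish.
 * On timeout the child is left running in the background and the
 * caller (the "foreground" process) is free to exit.
 */
static enum wait_result run_with_deadline(int (*child_fn)(void),
					  unsigned int timeout_sec)
{
	struct timespec ts = { .tv_sec = timeout_sec, .tv_nsec = 0 };
	enum wait_result res = WAIT_ERROR;
	sigset_t chld, old;
	int status;
	pid_t pid;

	assert(child_fn != NULL);

	/* Block SIGCHLD before forking so it stays pending for sigtimedwait(). */
	sigemptyset(&chld);
	sigaddset(&chld, SIGCHLD);
	if (sigprocmask(SIG_BLOCK, &chld, &old) < 0)
		return WAIT_ERROR;

	pid = fork();
	if (pid < 0) {
		sigprocmask(SIG_SETMASK, &old, NULL);
		return WAIT_ERROR;
	}
	if (pid == 0)
		_exit(child_fn());	/* child: stand-in for the mount attempt */

	/* Parent: whichever comes first, the child's exit or the timeout. */
	for (;;) {
		if (sigtimedwait(&chld, NULL, &ts) == SIGCHLD) {
			pid_t done = waitpid(pid, &status, WNOHANG);

			if (done == 0)
				continue;	/* a different child exited */
			if (done == pid && WIFEXITED(status) &&
			    WEXITSTATUS(status) == 0)
				res = CHILD_OK;
			else
				res = CHILD_FAILED;
			break;
		}
		if (errno == EAGAIN) {		/* deadline passed */
			res = CHILD_STILL_RUNNING;
			break;
		}
		if (errno != EINTR)		/* unexpected failure */
			break;
	}

	sigprocmask(SIG_SETMASK, &old, NULL);
	return res;
}

/* Stand-ins: a mount that succeeds at once, and one that is slow. */
static int quick_child(void) { return 0; }
static int slow_child(void) { sleep(3); return 0; }
```

With these stand-ins, `run_with_deadline(quick_child, 5)` reports that the child finished, while `run_with_deadline(slow_child, 1)` reports that it is still running after the deadline. For simplicity the loop restarts the full timeout after an unrelated SIGCHLD or an EINTR; a real implementation would track the remaining time.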
diff --git a/utils/mount/stropts.c b/utils/mount/stropts.c
index a642394..92a7245 100644
--- a/utils/mount/stropts.c
+++ b/utils/mount/stropts.c
@@ -913,28 +913,6 @@ static int nfsmount_fg(struct nfsmount_info *mi)
 }
 
 /*
- * Handle "background" NFS mount [first try]
- *
- * Returns a valid mount command exit code.
- *
- * EX_BG should cause the caller to fork and invoke nfsmount_child.
- */
-static int nfsmount_parent(struct nfsmount_info *mi)
-{
-	if (nfs_try_mount(mi))
-		return EX_SUCCESS;
-
-	/* retry background mounts when the server is not up */
-	if (nfs_is_permanent_error(errno) && errno != EOPNOTSUPP) {
-		mount_error(mi->spec, mi->node, errno);
-		return EX_FAIL;
-	}
-
-	sys_mount_errors(mi->hostname, errno, 1, 1);
-	return EX_BG;
-}
-
-/*
  * Handle "background" NFS mount [retry daemon]
  *
  * Returns a valid mount command exit code: EX_SUCCESS if successful,
@@ -982,7 +960,14 @@ static int nfsmount_child(struct nfsmount_info *mi)
 static int nfsmount_bg(struct nfsmount_info *mi)
 {
 	if (!mi->child)
-		return nfsmount_parent(mi);
+		/*
+		 * Modern day kernels no longer return all
+		 * timeouts errors in all cases, instead
+		 * the process spins in the kernel, which
+		 * will hang a foreground mount. So background
+		 * mounts have to go directly into background
+		 */
+		return EX_BG;
 	else
 		return nfsmount_child(mi);
 }
Modern-day kernels no longer return all timeout
errors; instead, the process spins endlessly in the kernel.
This behavior causes the foreground mount to hang, never
allowing the mount to go into the background.

So this patch eliminates the foreground mount attempt, causing
background mounts to go directly into the background.

Signed-off-by: Steve Dickson <steved@redhat.com>
---
 utils/mount/stropts.c |   31 ++++++++-----------------------
 1 files changed, 8 insertions(+), 23 deletions(-)