Message ID | 20210916084045.31684-2-hreitz@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | virtiofsd: Allow using file handles instead of O_PATH FDs | expand |
On Thu, Sep 16, 2021 at 10:40:34AM +0200, Hanna Reitz wrote: > File handles are specific to mounts, and so name_to_handle_at() returns > the respective mount ID. However, open_by_handle_at() is not content > with an ID, it wants a file descriptor for some inode on the mount, > which we have to open. > > We want to use /proc/self/mountinfo to find the mounts' root directories > so we can open them and pass the respective FDs to open_by_handle_at(). > (We need to use the root directory, because we want the inode belonging > to every mount FD be deletable. Before the root directory can be > deleted, all entries within must have been closed, and so when it is > deleted, there should not be any file handles left that need its FD as > their mount FD. Thus, we can then close that FD and the inode can be > deleted.[1]) > > That is why we need to open /proc/self/mountinfo so that we can use it > to translate mount IDs into root directory paths. We have to open it > after setup_mounts() was called, because if we try to open it before, it > will appear as an empty file after setup_mounts(). > > [1] Note that in practice, you still cannot delete the mount root > directory. It is a mount point on the host, after all, and mount points > cannot be deleted. But by using the mount point as the mount FD, we > will at least not hog any actually deletable inodes. > > Signed-off-by: Hanna Reitz <hreitz@redhat.com> > --- > tools/virtiofsd/passthrough_ll.c | 40 ++++++++++++++++++++++++++++++++ > 1 file changed, 40 insertions(+) > > diff --git a/tools/virtiofsd/passthrough_ll.c b/tools/virtiofsd/passthrough_ll.c > index 38b2af8599..6511a6acb4 100644 > --- a/tools/virtiofsd/passthrough_ll.c > +++ b/tools/virtiofsd/passthrough_ll.c > @@ -172,6 +172,8 @@ struct lo_data { > > /* An O_PATH file descriptor to /proc/self/fd/ */ > int proc_self_fd; > + /* A read-only FILE pointer for /proc/self/mountinfo */ > + FILE *mountinfo_fp; > int user_killpriv_v2, killpriv_v2; > /* If set, virtiofsd is responsible for setting umask during creation */ > bool change_umask; > @@ -3718,6 +3720,19 @@ static void setup_chroot(struct lo_data *lo) > static void setup_sandbox(struct lo_data *lo, struct fuse_session *se, > bool enable_syslog) > { > + int proc_self, mountinfo_fd; > + int saverr; > + > + /* > + * Open /proc/self before we pivot to the new root so we can still > + * open /proc/self/mountinfo afterwards > + */ > + proc_self = open("/proc/self", O_PATH); > + if (proc_self < 0) { > + fuse_log(FUSE_LOG_WARNING, "Failed to open /proc/self: %m; " > + "will not be able to use file handles\n"); > + } > + Hi Hanna, Should we open /proc/self and /proc/self/mountinfo only if user wants to file handle. We have already parsed options by now so we know. Also, if user asked for file handles, and we can't open /proc/self or /proc/self/mountinfo successfully, I would think we should error out and not continue (instead of just log it and continue). That seems to be general theme. If user asked for a feature and if we can't enable it, we error out and let user retry without that particular feature. > if (lo->sandbox == SANDBOX_NAMESPACE) { > setup_namespaces(lo, se); > setup_mounts(lo->source); > @@ -3725,6 +3740,31 @@ static void setup_sandbox(struct lo_data *lo, struct fuse_session *se, > setup_chroot(lo); > } > > + /* > + * Opening /proc/self/mountinfo before the umount2() call in > + * setup_mounts() leads to the file appearing empty. That is why > + * we defer opening it until here. > + */ > + lo->mountinfo_fp = NULL; > + if (proc_self >= 0) { > + mountinfo_fd = openat(proc_self, "mountinfo", O_RDONLY); > + if (mountinfo_fd < 0) { > + saverr = errno; > + } else if (mountinfo_fd >= 0) { > + lo->mountinfo_fp = fdopen(mountinfo_fd, "r"); > + if (!lo->mountinfo_fp) { > + saverr = errno; > + close(mountinfo_fd); > + } > + } > + if (!lo->mountinfo_fp) { > + fuse_log(FUSE_LOG_WARNING, "Failed to open /proc/self/mountinfo: " > + "%s; will not be able to use file handles\n", > + strerror(saverr)); > + } > + close(proc_self); > + } > + Above code couple probably be moved in a helper function. Makes it easier to read setup_sandbox(). Same here, open mountinfo only if user wants file handle support and error out if file handle support can't be enabled. Thanks Vivek > setup_seccomp(enable_syslog); > setup_capabilities(g_strdup(lo->modcaps)); > } > -- > 2.31.1 >
On 18.10.21 19:07, Vivek Goyal wrote: > On Thu, Sep 16, 2021 at 10:40:34AM +0200, Hanna Reitz wrote: >> File handles are specific to mounts, and so name_to_handle_at() returns >> the respective mount ID. However, open_by_handle_at() is not content >> with an ID, it wants a file descriptor for some inode on the mount, >> which we have to open. >> >> We want to use /proc/self/mountinfo to find the mounts' root directories >> so we can open them and pass the respective FDs to open_by_handle_at(). >> (We need to use the root directory, because we want the inode belonging >> to every mount FD be deletable. Before the root directory can be >> deleted, all entries within must have been closed, and so when it is >> deleted, there should not be any file handles left that need its FD as >> their mount FD. Thus, we can then close that FD and the inode can be >> deleted.[1]) >> >> That is why we need to open /proc/self/mountinfo so that we can use it >> to translate mount IDs into root directory paths. We have to open it >> after setup_mounts() was called, because if we try to open it before, it >> will appear as an empty file after setup_mounts(). >> >> [1] Note that in practice, you still cannot delete the mount root >> directory. It is a mount point on the host, after all, and mount points >> cannot be deleted. But by using the mount point as the mount FD, we >> will at least not hog any actually deletable inodes. >> >> Signed-off-by: Hanna Reitz <hreitz@redhat.com> >> --- >> tools/virtiofsd/passthrough_ll.c | 40 ++++++++++++++++++++++++++++++++ >> 1 file changed, 40 insertions(+) >> >> diff --git a/tools/virtiofsd/passthrough_ll.c b/tools/virtiofsd/passthrough_ll.c >> index 38b2af8599..6511a6acb4 100644 >> --- a/tools/virtiofsd/passthrough_ll.c >> +++ b/tools/virtiofsd/passthrough_ll.c >> @@ -172,6 +172,8 @@ struct lo_data { >> >> /* An O_PATH file descriptor to /proc/self/fd/ */ >> int proc_self_fd; >> + /* A read-only FILE pointer for /proc/self/mountinfo */ >> + FILE *mountinfo_fp; >> int user_killpriv_v2, killpriv_v2; >> /* If set, virtiofsd is responsible for setting umask during creation */ >> bool change_umask; >> @@ -3718,6 +3720,19 @@ static void setup_chroot(struct lo_data *lo) >> static void setup_sandbox(struct lo_data *lo, struct fuse_session *se, >> bool enable_syslog) >> { >> + int proc_self, mountinfo_fd; >> + int saverr; >> + >> + /* >> + * Open /proc/self before we pivot to the new root so we can still >> + * open /proc/self/mountinfo afterwards >> + */ >> + proc_self = open("/proc/self", O_PATH); >> + if (proc_self < 0) { >> + fuse_log(FUSE_LOG_WARNING, "Failed to open /proc/self: %m; " >> + "will not be able to use file handles\n"); >> + } >> + > Hi Hanna, > > Should we open /proc/self and /proc/self/mountinfo only if user wants > to file handle. We have already parsed options by now so we know. I didn’t think it would matter given that it wouldn’t have an adverse effect. If we can’t open them (and I can’t imagine a case where we’d be unable to open them), the only result is a warning. > Also, if user asked for file handles, and we can't open /proc/self or > /proc/self/mountinfo successfully, I would think we should error out > and not continue (instead of just log it and continue). Well, that would break the assumption I had above. Not that that’s really relevant, I just want to mention it. File handles are a best effort in any case. If they don’t work, we always fall back. So I don’t know whether we must error out. OTOH if we know they can never work, then perhaps it would be more sensible to error out. FWIW I’ve ported the relevant v1..v4 changes to virtiofsd-rs, and there it errors out. The error is unconditional, though, so even if you don’t request file handles, it’ll try to open mountinfo and exit on error. I found that reasonable because I can’t imagine a case where opening /proc/self/fd would work, but /proc/self/mountinfo wouldn’t – and working around that would be a bit cumbersome (it would mean wrapping PassthroughFs.mount_fds in an Option<> and .unwrap()-ing it on every use, with a comment why that’s fine). Honestly, I’d prefer to wait until we get a bug report about a failure to open /proc/self/mountinfo. > That seems to be general theme. If user asked for a feature and if > we can't enable it, we error out and let user retry without that > particular feature. > >> if (lo->sandbox == SANDBOX_NAMESPACE) { >> setup_namespaces(lo, se); >> setup_mounts(lo->source); >> @@ -3725,6 +3740,31 @@ static void setup_sandbox(struct lo_data *lo, struct fuse_session *se, >> setup_chroot(lo); >> } >> >> + /* >> + * Opening /proc/self/mountinfo before the umount2() call in >> + * setup_mounts() leads to the file appearing empty. That is why >> + * we defer opening it until here. >> + */ >> + lo->mountinfo_fp = NULL; >> + if (proc_self >= 0) { >> + mountinfo_fd = openat(proc_self, "mountinfo", O_RDONLY); >> + if (mountinfo_fd < 0) { >> + saverr = errno; >> + } else if (mountinfo_fd >= 0) { >> + lo->mountinfo_fp = fdopen(mountinfo_fd, "r"); >> + if (!lo->mountinfo_fp) { >> + saverr = errno; >> + close(mountinfo_fd); >> + } >> + } >> + if (!lo->mountinfo_fp) { >> + fuse_log(FUSE_LOG_WARNING, "Failed to open /proc/self/mountinfo: " >> + "%s; will not be able to use file handles\n", >> + strerror(saverr)); >> + } >> + close(proc_self); >> + } >> + > Above code couple probably be moved in a helper function. Makes it > easier to read setup_sandbox(). Same here, open mountinfo only if > user wants file handle support and error out if file handle support > can't be enabled. Perhaps, but frankly I don’t see a need to keep setup_sandbox() readable. AFAIU, we are planning to deprecate C virtiofsd anyway, so while it pains me to say something like this, we don’t need to keep it maintainable. Now that I’ve opened an MR to bring the v1..v4 changes to virtiofsd-rs (https://gitlab.com/virtio-fs/virtiofsd-rs/-/merge_requests/41), I also don’t really see a justification for putting further development effort into bringing file handles to C virtiofsd. Of course I’m still grateful for your review, and I’ll try to adapt it to virtiofsd-rs, but right now I don’t plan on sending a v5. Hanna
On Wed, Oct 20, 2021 at 11:04:31AM +0200, Hanna Reitz wrote: > On 18.10.21 19:07, Vivek Goyal wrote: > > On Thu, Sep 16, 2021 at 10:40:34AM +0200, Hanna Reitz wrote: > > > File handles are specific to mounts, and so name_to_handle_at() returns > > > the respective mount ID. However, open_by_handle_at() is not content > > > with an ID, it wants a file descriptor for some inode on the mount, > > > which we have to open. > > > > > > We want to use /proc/self/mountinfo to find the mounts' root directories > > > so we can open them and pass the respective FDs to open_by_handle_at(). > > > (We need to use the root directory, because we want the inode belonging > > > to every mount FD be deletable. Before the root directory can be > > > deleted, all entries within must have been closed, and so when it is > > > deleted, there should not be any file handles left that need its FD as > > > their mount FD. Thus, we can then close that FD and the inode can be > > > deleted.[1]) > > > > > > That is why we need to open /proc/self/mountinfo so that we can use it > > > to translate mount IDs into root directory paths. We have to open it > > > after setup_mounts() was called, because if we try to open it before, it > > > will appear as an empty file after setup_mounts(). > > > > > > [1] Note that in practice, you still cannot delete the mount root > > > directory. It is a mount point on the host, after all, and mount points > > > cannot be deleted. But by using the mount point as the mount FD, we > > > will at least not hog any actually deletable inodes. > > > > > > Signed-off-by: Hanna Reitz <hreitz@redhat.com> > > > --- > > > tools/virtiofsd/passthrough_ll.c | 40 ++++++++++++++++++++++++++++++++ > > > 1 file changed, 40 insertions(+) > > > > > > diff --git a/tools/virtiofsd/passthrough_ll.c b/tools/virtiofsd/passthrough_ll.c > > > index 38b2af8599..6511a6acb4 100644 > > > --- a/tools/virtiofsd/passthrough_ll.c > > > +++ b/tools/virtiofsd/passthrough_ll.c > > > @@ -172,6 +172,8 @@ struct lo_data { > > > /* An O_PATH file descriptor to /proc/self/fd/ */ > > > int proc_self_fd; > > > + /* A read-only FILE pointer for /proc/self/mountinfo */ > > > + FILE *mountinfo_fp; > > > int user_killpriv_v2, killpriv_v2; > > > /* If set, virtiofsd is responsible for setting umask during creation */ > > > bool change_umask; > > > @@ -3718,6 +3720,19 @@ static void setup_chroot(struct lo_data *lo) > > > static void setup_sandbox(struct lo_data *lo, struct fuse_session *se, > > > bool enable_syslog) > > > { > > > + int proc_self, mountinfo_fd; > > > + int saverr; > > > + > > > + /* > > > + * Open /proc/self before we pivot to the new root so we can still > > > + * open /proc/self/mountinfo afterwards > > > + */ > > > + proc_self = open("/proc/self", O_PATH); > > > + if (proc_self < 0) { > > > + fuse_log(FUSE_LOG_WARNING, "Failed to open /proc/self: %m; " > > > + "will not be able to use file handles\n"); > > > + } > > > + > > Hi Hanna, > > > > Should we open /proc/self and /proc/self/mountinfo only if user wants > > to file handle. We have already parsed options by now so we know. > > I didn’t think it would matter given that it wouldn’t have an adverse > effect. If we can’t open them (and I can’t imagine a case where we’d be > unable to open them), the only result is a warning. > > > Also, if user asked for file handles, and we can't open /proc/self or > > /proc/self/mountinfo successfully, I would think we should error out > > and not continue (instead of just log it and continue). > > Well, that would break the assumption I had above. Not that that’s really > relevant, I just want to mention it. > > File handles are a best effort in any case. If they don’t work, we always > fall back. So I don’t know whether we must error out. > > OTOH if we know they can never work, then perhaps it would be more sensible > to error out. Yes. If they can't work because filesystem does not have capability or we don't have CAP_DAC_READ_SEARCH or any other necessary component is not there then we should fail. > > FWIW I’ve ported the relevant v1..v4 changes to virtiofsd-rs, and there it > errors out. The error is unconditional, though, so even if you don’t > request file handles, it’ll try to open mountinfo and exit on error. I > found that reasonable because I can’t imagine a case where opening > /proc/self/fd would work, but /proc/self/mountinfo wouldn’t – and working > around that would be a bit cumbersome (it would mean wrapping > PassthroughFs.mount_fds in an Option<> and .unwrap()-ing it on every use, > with a comment why that’s fine). Honestly, I’d prefer to wait until we get a > bug report about a failure to open /proc/self/mountinfo. > > > That seems to be general theme. If user asked for a feature and if > > we can't enable it, we error out and let user retry without that > > particular feature. > > > > > if (lo->sandbox == SANDBOX_NAMESPACE) { > > > setup_namespaces(lo, se); > > > setup_mounts(lo->source); > > > @@ -3725,6 +3740,31 @@ static void setup_sandbox(struct lo_data *lo, struct fuse_session *se, > > > setup_chroot(lo); > > > } > > > + /* > > > + * Opening /proc/self/mountinfo before the umount2() call in > > > + * setup_mounts() leads to the file appearing empty. That is why > > > + * we defer opening it until here. > > > + */ > > > + lo->mountinfo_fp = NULL; > > > + if (proc_self >= 0) { > > > + mountinfo_fd = openat(proc_self, "mountinfo", O_RDONLY); > > > + if (mountinfo_fd < 0) { > > > + saverr = errno; > > > + } else if (mountinfo_fd >= 0) { > > > + lo->mountinfo_fp = fdopen(mountinfo_fd, "r"); > > > + if (!lo->mountinfo_fp) { > > > + saverr = errno; > > > + close(mountinfo_fd); > > > + } > > > + } > > > + if (!lo->mountinfo_fp) { > > > + fuse_log(FUSE_LOG_WARNING, "Failed to open /proc/self/mountinfo: " > > > + "%s; will not be able to use file handles\n", > > > + strerror(saverr)); > > > + } > > > + close(proc_self); > > > + } > > > + > > Above code couple probably be moved in a helper function. Makes it > > easier to read setup_sandbox(). Same here, open mountinfo only if > > user wants file handle support and error out if file handle support > > can't be enabled. > > Perhaps, but frankly I don’t see a need to keep setup_sandbox() readable. > AFAIU, we are planning to deprecate C virtiofsd anyway, so while it pains me > to say something like this, we don’t need to keep it maintainable. > > Now that I’ve opened an MR to bring the v1..v4 changes to virtiofsd-rs > (https://gitlab.com/virtio-fs/virtiofsd-rs/-/merge_requests/41), I also > don’t really see a justification for putting further development effort into > bringing file handles to C virtiofsd. Of course I’m still grateful for your > review, and I’ll try to adapt it to virtiofsd-rs, but right now I don’t plan > on sending a v5. Fair enough. If file handle stuff is important for C version, I am hoping somebody else can pick this work. Anyway I think its almost 90% ready, we are just dealing with last 10% of issues. Vivek
diff --git a/tools/virtiofsd/passthrough_ll.c b/tools/virtiofsd/passthrough_ll.c index 38b2af8599..6511a6acb4 100644 --- a/tools/virtiofsd/passthrough_ll.c +++ b/tools/virtiofsd/passthrough_ll.c @@ -172,6 +172,8 @@ struct lo_data { /* An O_PATH file descriptor to /proc/self/fd/ */ int proc_self_fd; + /* A read-only FILE pointer for /proc/self/mountinfo */ + FILE *mountinfo_fp; int user_killpriv_v2, killpriv_v2; /* If set, virtiofsd is responsible for setting umask during creation */ bool change_umask; @@ -3718,6 +3720,19 @@ static void setup_chroot(struct lo_data *lo) static void setup_sandbox(struct lo_data *lo, struct fuse_session *se, bool enable_syslog) { + int proc_self, mountinfo_fd; + int saverr; + + /* + * Open /proc/self before we pivot to the new root so we can still + * open /proc/self/mountinfo afterwards + */ + proc_self = open("/proc/self", O_PATH); + if (proc_self < 0) { + fuse_log(FUSE_LOG_WARNING, "Failed to open /proc/self: %m; " + "will not be able to use file handles\n"); + } + if (lo->sandbox == SANDBOX_NAMESPACE) { setup_namespaces(lo, se); setup_mounts(lo->source); @@ -3725,6 +3740,31 @@ static void setup_sandbox(struct lo_data *lo, struct fuse_session *se, setup_chroot(lo); } + /* + * Opening /proc/self/mountinfo before the umount2() call in + * setup_mounts() leads to the file appearing empty. That is why + * we defer opening it until here. + */ + lo->mountinfo_fp = NULL; + if (proc_self >= 0) { + mountinfo_fd = openat(proc_self, "mountinfo", O_RDONLY); + if (mountinfo_fd < 0) { + saverr = errno; + } else if (mountinfo_fd >= 0) { + lo->mountinfo_fp = fdopen(mountinfo_fd, "r"); + if (!lo->mountinfo_fp) { + saverr = errno; + close(mountinfo_fd); + } + } + if (!lo->mountinfo_fp) { + fuse_log(FUSE_LOG_WARNING, "Failed to open /proc/self/mountinfo: " + "%s; will not be able to use file handles\n", + strerror(saverr)); + } + close(proc_self); + } + setup_seccomp(enable_syslog); setup_capabilities(g_strdup(lo->modcaps)); }
File handles are specific to mounts, and so name_to_handle_at() returns the respective mount ID. However, open_by_handle_at() is not content with an ID, it wants a file descriptor for some inode on the mount, which we have to open. We want to use /proc/self/mountinfo to find the mounts' root directories so we can open them and pass the respective FDs to open_by_handle_at(). (We need to use the root directory, because we want the inode belonging to every mount FD be deletable. Before the root directory can be deleted, all entries within must have been closed, and so when it is deleted, there should not be any file handles left that need its FD as their mount FD. Thus, we can then close that FD and the inode can be deleted.[1]) That is why we need to open /proc/self/mountinfo so that we can use it to translate mount IDs into root directory paths. We have to open it after setup_mounts() was called, because if we try to open it before, it will appear as an empty file after setup_mounts(). [1] Note that in practice, you still cannot delete the mount root directory. It is a mount point on the host, after all, and mount points cannot be deleted. But by using the mount point as the mount FD, we will at least not hog any actually deletable inodes. Signed-off-by: Hanna Reitz <hreitz@redhat.com> --- tools/virtiofsd/passthrough_ll.c | 40 ++++++++++++++++++++++++++++++++ 1 file changed, 40 insertions(+)