Message ID: 1446543679-28849-1-git-send-email-linux@rasmusvillemoes.dk (mailing list archive)
State: New, archived
On Tue, Nov 3, 2015 at 1:41 AM, Rasmus Villemoes <linux@rasmusvillemoes.dk> wrote:
>
> I'm sure I've missed something, hence the RFC. But if not, there's
> probably also a few memsets which become redundant. And the
> __set_close_on_exec part should probably be its own patch...

The patch looks fine to me. I'm not sure the __set_close_on_exec part
even makes sense, because if you set that bit, it usually really *is*
clear before, so testing it beforehand is just pointless. And if
somebody really keeps setting the bit, they are doing something stupid
anyway..

So I have nothing against the patch, but I do wonder how much it
matters. If there isn't a noticeable performance win, I'd almost
rather just keep the close-on-exec bitmap up-to-date.

Hmm?

               Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Nov 03 2015, Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Tue, Nov 3, 2015 at 1:41 AM, Rasmus Villemoes
> <linux@rasmusvillemoes.dk> wrote:
>>
>> I'm sure I've missed something, hence the RFC. But if not, there's
>> probably also a few memsets which become redundant. And the
>> __set_close_on_exec part should probably be its own patch...
>
> The patch looks fine to me. I'm not sure the __set_close_on_exec part
> even makes sense, because if you set that bit, it usually really *is*
> clear before, so testing it beforehand is just pointless. And if
> somebody really keeps setting the bit, they are doing something stupid
> anyway..

That's true for the lifetime of a single fd, where of course no-one
does fcntl(fd, FD_CLOEXEC) more than once. But the scenario I was
thinking of was when fds get recycled: open(, O_CLOEXEC) => 5,
close(5), open(, O_CLOEXEC) => 5. In that case, letting the
close_on_exec bit keep its value avoids dirtying the cache line on all
subsequent allocations of fd 5. (For example, had Eric's app been
using *_CLOEXEC for all its open()s, socket()s etc., there wouldn't
have been any gain from adding the conditional to
__clear_close_on_exec, but I'd expect a similar gain from doing the
symmetric thing.) Again, this assumes that almost all fd allocations
either do or do not apply CLOEXEC - after a while, ->close_on_exec
would reach a steady state where no bits get flipped anymore.

The "usually really *is* clear" only holds when we do bother clearing
the close_on_exec bit for unused fds, which is what I suggest we stop
doing :-)

I don't think either state of the bit in close_on_exec is more or
less 'up-to-date' when its buddy in open_fds is not set.

Rasmus
On Tue, 2015-11-03 at 10:41 +0100, Rasmus Villemoes wrote:
> @@ -667,7 +667,7 @@ void do_close_on_exec(struct files_struct *files)
>  		fdt = files_fdtable(files);
>  		if (fd >= fdt->max_fds)
>  			break;
> -		set = fdt->close_on_exec[i];
> +		set = fdt->close_on_exec[i] & fdt->open_fds[i];
>  		if (!set)
>  			continue;
>  		fdt->close_on_exec[i] = 0;

If you don't bother, why leave this final fdt->close_on_exec[i] = 0 ?
diff --git a/fs/file.c b/fs/file.c
index c6986dce0334..93cfbcd450c3 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -231,7 +231,8 @@ repeat:
 
 static inline void __set_close_on_exec(int fd, struct fdtable *fdt)
 {
-	__set_bit(fd, fdt->close_on_exec);
+	if (!test_bit(fd, fdt->close_on_exec))
+		__set_bit(fd, fdt->close_on_exec);
 }
 
 static inline void __clear_close_on_exec(int fd, struct fdtable *fdt)
@@ -644,7 +645,6 @@ int __close_fd(struct files_struct *files, unsigned fd)
 	if (!file)
 		goto out_unlock;
 	rcu_assign_pointer(fdt->fd[fd], NULL);
-	__clear_close_on_exec(fd, fdt);
 	__put_unused_fd(files, fd);
 	spin_unlock(&files->file_lock);
 	return filp_close(file, files);
@@ -667,7 +667,7 @@ void do_close_on_exec(struct files_struct *files)
 		fdt = files_fdtable(files);
 		if (fd >= fdt->max_fds)
 			break;
-		set = fdt->close_on_exec[i];
+		set = fdt->close_on_exec[i] & fdt->open_fds[i];
 		if (!set)
 			continue;
 		fdt->close_on_exec[i] = 0;
In fc90888d07b8 (vfs: conditionally clear close-on-exec flag) a
conditional was added to __clear_close_on_exec to avoid dirtying a
cache line in the common case where the bit is already clear. However,
AFAICT, we don't rely on the close_on_exec bit being clear for unused
fds, except as an optimization in do_close_on_exec(); if I haven't
missed anything, __{set,clear}_close_on_exec is always called when a
new fd is allocated.

At the expense of also reading through ->open_fds in
do_close_on_exec(), we can avoid accessing the close_on_exec bitmap
altogether in close(), which I think is a reasonable trade-off.

The conditional added in the commit above still makes sense to avoid
the dirtying on the allocation paths, but I also think it might make
sense in __set_close_on_exec: I suppose any given app handling a
non-trivial amount of fds uses O_CLOEXEC for either almost none or
almost all of them.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
---
I'm sure I've missed something, hence the RFC. But if not, there's
probably also a few memsets which become redundant. And the
__set_close_on_exec part should probably be its own patch...

 fs/file.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)