Message ID | 20240308-vfs-pidfd-b106369f5406@brauner (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [GIT,PULL] vfs pidfd | expand |
The pull request you sent on Fri, 8 Mar 2024 11:13:50 +0100:
> git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.9.pidfd
has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/b5683a37c881e2e08065f1670086e281430ee19f
Thank you!
On Fri, 8 Mar 2024 at 02:14, Christian Brauner <brauner@kernel.org> wrote: > > * Move pidfds from the anonymous inode infrastructure to a tiny > pseudo filesystem. This will unblock further work that we weren't able > to do simply because of the very justified limitations of anonymous > inodes. Moving pidfds to a tiny pseudo filesystem allows for statx on > pidfds to become useful for the first time. They can now be compared > by inode number which are unique for the system lifetime. So I obviously pulled this already, but I did have one question - we don't make nsfs conditional, and I'm not convinced we should make pidfs conditional either. I think (and *hope*) all the semantic annoyances got sorted out, and I don't think there are any realistic size advantages to not enabling CONFIG_FS_PID. Is there some fundamental reason for that config entry to exist? Linus
On Mon, Mar 11, 2024 at 01:05:06PM -0700, Linus Torvalds wrote: > On Fri, 8 Mar 2024 at 02:14, Christian Brauner <brauner@kernel.org> wrote: > > > > * Move pidfds from the anonymous inode infrastructure to a tiny > > pseudo filesystem. This will unblock further work that we weren't able > > to do simply because of the very justified limitations of anonymous > > inodes. Moving pidfds to a tiny pseudo filesystem allows for statx on > > pidfds to become useful for the first time. They can now be compared > > by inode number which are unique for the system lifetime. > > So I obviously pulled this already, but I did have one question - we > don't make nsfs conditional, and I'm not convinced we should make > pidfs conditional either. > > I think (and *hope*) all the semantic annoyances got sorted out, and I > don't think there are any realistic size advantages to not enabling > CONFIG_FS_PID. > > Is there some fundamental reason for that config entry to exist? No, the size of struct pid was the main reason but I don't think it matters. A side-effect was that we could easily enforce 64bit inode numbers. But realistically it's trivial enough to workaround. Here's a patch for what I think is pretty simple appended. Does that work?
On Tue, 12 Mar 2024 at 07:16, Christian Brauner <brauner@kernel.org> wrote: > > No, the size of struct pid was the main reason but I don't think it > matters. A side-effect was that we could easily enforce 64bit inode > numbers. But realistically it's trivial enough to workaround. Here's a > patch for what I think is pretty simple appended. Does that work? This looks eminently sane to me. Not that I actually _tested_it, but since my testing would have compared it to my current setup (64-bit and CONFIG_FS_PID=y) any testing would have been pointless because that case didn't change. Looking at the patch, I do wonder how much we even care about 64-bit inodes. I'd like to point out how 'path_from_stashed()' only takes a 'unsigned long ino' anyway, and I don't think anything really cares about either the high bits *or* the uniqueness of that inode number.. And similarly, i_ino isn't actually *used* for anything but naming to user space. So I'm not at all sure the whole 64-bit checks are worth it. Am I missing something else? Linus
On Tue, Mar 12, 2024 at 09:23:55AM -0700, Linus Torvalds wrote: > On Tue, 12 Mar 2024 at 07:16, Christian Brauner <brauner@kernel.org> wrote: > > > > No, the size of struct pid was the main reason but I don't think it > > matters. A side-effect was that we could easily enforce 64bit inode > > numbers. But realistically it's trivial enough to workaround. Here's a > > patch for what I think is pretty simple appended. Does that work? > > This looks eminently sane to me. Not that I actually _tested_it, but > since my testing would have compared it to my current setup (64-bit > and CONFIG_FS_PID=y) any testing would have been pointless because > that case didn't change. > > Looking at the patch, I do wonder how much we even care about 64-bit > inodes. I'd like to point out how 'path_from_stashed()' only takes a > 'unsigned long ino' anyway, and I don't think anything really cares > about either the high bits *or* the uniqueness of that inode number.. > > And similarly, i_ino isn't actually *used* for anything but naming to > user space. It's used to compare pidfs and someone actually already sent a pull request for this to another project iirc. So it'd be good to keep that property. But if your point is that we don't care about this for 32bit then I do agree. We could do away with the checks completely and just accept the truncation for 32bit. If that's your point feel free to just remove the 32bit handling in the patch and apply it. Let me know. Maybe I misunderstood. > > So I'm not at all sure the whole 64-bit checks are worth it. Am I > missing something else?
On Tue, 12 Mar 2024 at 13:09, Christian Brauner <brauner@kernel.org> wrote: > > It's used to compare pidfs and someone actually already sent a pull > request for this to another project iirc. So it'd be good to keep that > property. Hmm. If people really do care, I guess we should spend the effort on making those things unique. > But if your point is that we don't care about this for 32bit then I do > agree. We could do away with the checks completely and just accept the > truncation for 32bit. If that's your point feel free to just remove the > 32bit handling in the patch and apply it. Let me know. Maybe I > misunderstood. I personally don't care about 32-bit any more, but it also feels wrong to just say that it's ok depending on something on a 64-bit kernel, but not a 32-bit one. So let's go with your patch. It's not like it's a problem to spend the (very little) extra effort to do a 64-bit inode number. Linus
On Tue, Mar 12, 2024 at 01:21:34PM -0700, Linus Torvalds wrote: > On Tue, 12 Mar 2024 at 13:09, Christian Brauner <brauner@kernel.org> wrote: > > > > It's used to compare pidfs and someone actually already sent a pull > > request for this to another project iirc. So it'd be good to keep that > > property. > > Hmm. If people really do care, I guess we should spend the effort on > making those things unique. So, I cleaned that patch and took the chance to simplify things a bit more and now we give the same guarantees for 32bit and 64bit. We're still removing a lot more code than we add. I provided a detailed description and some already existing users. If you're fine with it I would ask you to please just apply it or if you prefer I can send a pull request tomorrow. I'm still stuck at the doctor's office. I spent most of my day compiling and test i386 kernel and userspace. Surprisingly, trauma isn't the best teacher because I had completely forgotten how horrible it all is and I'm glad I'm back in a world where 64bit is a thing.
On Wed, 13 Mar 2024 at 10:10, Christian Brauner <brauner@kernel.org> wrote: > > If you're fine with it I would ask you to please just apply it [..] I'll take it directly, no problem. Thanks, Linus