Message ID | 20201016124550.10739-1-sargun@sargun.me (mailing list archive) |
---|---|
Series | NFS User Namespaces with new mount API |
On Fri, Oct 16, 2020 at 05:45:47AM -0700, Sargun Dhillon wrote:
> This patchset adds some functionality to allow NFS to be used from
> NFS namespaces (containers).
>
> Changes since v1:
> * Added samples
>
> Sargun Dhillon (3):
>   NFS: Use cred from fscontext during fsmount
>   samples/vfs: Split out common code for new syscall APIs
>   samples/vfs: Add example leveraging NFS with new APIs and user
>     namespaces
>
>  fs/nfs/client.c                        |   2 +-
>  fs/nfs/flexfilelayout/flexfilelayout.c |   1 +
>  fs/nfs/nfs4client.c                    |   2 +-
>  samples/vfs/.gitignore                 |   2 +
>  samples/vfs/Makefile                   |   5 +-
>  samples/vfs/test-fsmount.c             |  86 +-----------
>  samples/vfs/test-nfs-userns.c          | 181 +++++++++++++++++++++++++
>  samples/vfs/vfs-helper.c               |  43 ++++++
>  samples/vfs/vfs-helper.h               |  55 ++++++++
>  9 files changed, 289 insertions(+), 88 deletions(-)
>  create mode 100644 samples/vfs/test-nfs-userns.c
>  create mode 100644 samples/vfs/vfs-helper.c
>  create mode 100644 samples/vfs/vfs-helper.h
>
> --
> 2.25.1
>

Digging deeper into this a little bit, I actually found that there are
some problematic aspects of the current behaviour. Because
nfs_get_tree_common calls sget_fc, and sget_fc sets the super block's
s_user_ns (via alloc_super) to the fs_context's user namespace unless
the global flag is set (which NFS does not set), a bunch of permission
checks are done against the super block's user_ns. It looks like this
was introduced in:

f2aedb713c28: NFS: Add fs_context support[1]

It turns out that unmapped users in the "parent" user namespace just get
an EOVERFLOW error when trying to perform a read, even if the UID sent
to the NFS server to read the file is a valid uid (the uid in the init
user ns). In other words, inode_permission checks permissions against
the mapped UID in the namespace, while the authentication credentials
(UIDs, GIDs) sent to the server are those from the init user ns.
[This is all under the assumption that there are no upcalls doing ID mapping]

Although I do not think this presents any security risk (because you
have to have CAP_SYS_ADMIN in the init user ns to get this far), it
definitely seems like "incorrect" behaviour.

[1]: https://lore.kernel.org/linux-nfs/20191120152750.6880-26-smayhew@redhat.com/
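
To make the mismatch above a bit more concrete, here is a rough userspace model of it. This is only a sketch, not kernel code: every `toy_*` name is hypothetical, a namespace is reduced to a single uid range, and the read check simply returns -EOVERFLOW for an unmapped uid to mirror what was observed, without modelling where in the VFS that error actually originates.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a user namespace with a single uid map extent. */
struct toy_userns {
	uint32_t ns_first;     /* first uid as seen inside the namespace */
	uint32_t kernel_first; /* first uid it maps to in the init user ns */
	uint32_t count;        /* number of uids in the mapped range */
};

/* Toy stand-in for the two fs_context fields discussed above. */
struct toy_fs_context {
	bool global;                      /* models the "global" flag */
	const struct toy_userns *user_ns; /* namespace of the mounter */
};

#define TOY_INVALID_UID ((uint32_t)-1)

/* Identity mapping, standing in for the init user namespace. */
static const struct toy_userns toy_init_user_ns = { 0, 0, UINT32_MAX };

/* Models the choice described above: the superblock gets the
 * fs_context's namespace unless the global flag is set. */
static const struct toy_userns *toy_sb_user_ns(const struct toy_fs_context *fc)
{
	assert(fc != NULL);
	return fc->global ? &toy_init_user_ns : fc->user_ns;
}

/* Translate an init-ns uid into the namespace's view, or return
 * TOY_INVALID_UID if it falls outside the mapped range. */
static uint32_t toy_from_kuid(const struct toy_userns *ns, uint32_t kuid)
{
	if (kuid >= ns->kernel_first && kuid - ns->kernel_first < ns->count)
		return ns->ns_first + (kuid - ns->kernel_first);
	return TOY_INVALID_UID;
}

/* Assumed outcome of a read: 0 if the caller's uid is mapped in the
 * superblock's namespace, -EOVERFLOW otherwise. The uid that would go
 * out on the wire (the init-ns uid) plays no part in this check. */
static int toy_read_check(const struct toy_fs_context *fc, uint32_t caller_kuid)
{
	const struct toy_userns *sb_ns = toy_sb_user_ns(fc);

	if (toy_from_kuid(sb_ns, caller_kuid) == TOY_INVALID_UID)
		return -EOVERFLOW;
	return 0;
}
```

With a container namespace that maps init-ns uids 100000..165535 and `global` left unset, `toy_read_check()` succeeds for uid 100000 but returns -EOVERFLOW for uid 1000, even though uid 1000 is a perfectly valid uid to send to the server. Setting `global` makes the same call succeed, which is the behaviour a superblock owned by the init user ns would give in this model.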