Message ID | 20200714161415.3886463-1-christian.brauner@ubuntu.com (mailing list archive) |
---|---|
Headers | show |
Series | fs: add mount_setattr() | expand |
On Tue, Jul 14, 2020 at 06:14:11PM +0200, Christian Brauner wrote: > mount_setattr() can be expected to grow over time and is designed with > extensibility in mind. It follows the extensible syscall pattern we have > used with other syscalls such as openat2(), clone3(), > sched_{set,get}attr(), and others. I.e. it's a likely crap insertion vector; any patches around that thing will require the same level of review as addition of a brand new syscall. And they will be harder to spot - consider the likely subjects for such patches and compare to open addition of a new syscall...
On Sun, Jul 19, 2020 at 06:10:54PM +0100, Al Viro wrote: > On Tue, Jul 14, 2020 at 06:14:11PM +0200, Christian Brauner wrote: > > > mount_setattr() can be expected to grow over time and is designed with > > extensibility in mind. It follows the extensible syscall pattern we have > > used with other syscalls such as openat2(), clone3(), > > sched_{set,get}attr(), and others. > > I.e. it's a likely crap insertion vector; any patches around that thing > will require the same level of review as addition of a brand new syscall. Which is just how we should and hopefully treat any meaningful extension, yes. Otherwise let's just never add a flag argument to any syscall and only have dup()- and accept()-like syscalls. > And they will be harder to spot - consider the likely subjects for such > patches and compare to open addition of a new syscall... In the new revision I have dropped the atime argument because David already plumbed setting atime into fsmount() via flags and making userspace jump through more hoops by adding more constants seems pointless. So the additional arguments can be moved up because we're below the 6 syscall args limit. Though I really want to stress that while I see the worry it is less a technial argument but one for our review process where we should treat extensions to syscalls as strict as syscall additions. Which yes, very much so. Thanks! Christian
> > mount_setattr() can be expected to grow over time and is designed with > extensibility in mind. It follows the extensible syscall pattern we have > used with other syscalls such as openat2(), clone3(), > sched_{set,get}attr(), and others. > The set of mount options is passed in the uapi struct mount_attr which > currently has the following layout: > > struct mount_attr { > __u64 attr_set; > __u64 attr_clr; > __u32 propagation; > __u32 atime; > }; > We probably can rework "mnt: allow to add a mount into an existing group" (MS_SET_GROUP https://lkml.org/lkml/2017/1/23/712) to an extension mount_setattr. Do anyone have any objections? We need it in CRIU because it is a big problem to restore complex mount trees with complex propagation flags (see my LPC talk for details https://linuxplumbersconf.org/event/7/contributions/640/) of system-containers. If we allow set(copy) sharing options we can separate mount tree restore and propagation restore and everything becomes much simpler. (And we already have CRIU implementation based on it, which helps with variety of bugs with mounts we previously had. https://src.openvz.org/projects/OVZ/repos/criu/browse/criu/mount-v2.c#880) I've also tried to consider another approach https://github.com/Snorch/linux/commit/84886f588527b062993ec3e9760c879163852518 to disable actual propagation while restoring the mount tree. But with this approach it looks like it would be still hard to restore in CRIU because: With "MS_SET_GROUP" we don't care about roots. Imagine we have mount A (shared_id=1, root="/some/sub/path") and mount B (master_id=1, root="/some/other/sub/path", with MS_SET_GROUP we can copy sharing from mount A to B and make B MS_SLAVE to restore them. In case we want to do the same with only inheritance we would have to have some helper mount in this share C (shared_id=1, root="/") so that B can inherit this share...